Message ID | 20210525132216.1239259-5-kjain@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Add perf interface to expose nvdimm | expand |
Related | show |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (112f47a1484ddca610b70cbe4a99f0d0f1701daf) |
snowpatch_ozlabs/build-ppc64le | success | Build succeeded |
snowpatch_ozlabs/build-ppc64be | success | Build succeeded |
snowpatch_ozlabs/build-ppc64e | success | Build succeeded |
snowpatch_ozlabs/build-pmac32 | success | Build succeeded |
snowpatch_ozlabs/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 145 lines checked |
snowpatch_ozlabs/needsstable | success | Patch has no Fixes tags |
On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
> Patch here adds cpu hotplug functions to nvdimm pmu.

I'm thinking "Patch here" qualifies for "This patch", see
Documentation/process/submitting-patches.rst .

> It adds cpumask to designate a cpu to make HCALL to
> collect the counter data for the nvdimm device and
> update ABI documentation accordingly.
>
> Result in power9 lpar system:
> command:# cat /sys/devices/nmem0/cpumask
> 0

Is this specific to the papr thing, or should this be in generic nvdimm
code?
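The scheme the commit message describes — one CPU is designated to read the counters, and the duty migrates when that CPU goes offline — can be sketched in standalone C. This is not the kernel code: plain arrays stand in for `cpu_active_mask`, and the function names are illustrative only.

```c
/*
 * Standalone sketch of the designated-reader-CPU hotplug pattern.
 * A bool array models the active-CPU mask; NR_CPUS is arbitrary.
 */
#include <stdbool.h>

#define NR_CPUS 8

/* Return the highest active CPU, mimicking cpumask_last(cpu_active_mask). */
static int mask_last(const bool *active)
{
	for (int cpu = NR_CPUS - 1; cpu >= 0; cpu--)
		if (active[cpu])
			return cpu;
	return -1; /* no active CPU left */
}

/*
 * Offline callback: if the CPU going away is not the designated one,
 * there is nothing to do; otherwise pick a new designated CPU.
 */
static int pmu_offline_cpu(int cpu, int *designated, const bool *active)
{
	if (cpu != *designated)
		return 0;

	int target = mask_last(active);
	if (target < 0)
		return -1; /* no CPU left to migrate to */

	*designated = target;
	return 0;
}
```

The actual patch below implements exactly this shape in `nvdimm_pmu_offline_cpu()`, with `cpumask_last(cpu_active_mask)` choosing the new target.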
On 5/25/21 7:46 PM, Peter Zijlstra wrote:
> On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
>> Patch here adds cpu hotplug functions to nvdimm pmu.
>
> I'm thinking "Patch here" qualifies for "This patch", see
> Documentation/process/submitting-patches.rst .
>

Hi Peter,
    I will reword this commit message.

>> It adds cpumask to designate a cpu to make HCALL to
>> collect the counter data for the nvdimm device and
>> update ABI documentation accordingly.
>>
>> Result in power9 lpar system:
>> command:# cat /sys/devices/nmem0/cpumask
>> 0
>
> Is this specific to the papr thing, or should this be in generic nvdimm
> code?

This code is not specific to papr device and we can move it to
generic nvdimm interface. But do we need to add some checks on whether
any arch/platform specific driver want that support or we can assume
that this will be something needed by all platforms?

Thanks,
Kajol Jain
On Wed, May 26, 2021 at 12:56:58PM +0530, kajoljain wrote:
> On 5/25/21 7:46 PM, Peter Zijlstra wrote:
> > On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
> >> It adds cpumask to designate a cpu to make HCALL to
> >> collect the counter data for the nvdimm device and
> >> update ABI documentation accordingly.
> >>
> >> Result in power9 lpar system:
> >> command:# cat /sys/devices/nmem0/cpumask
> >> 0
> >
> > Is this specific to the papr thing, or should this be in generic nvdimm
> > code?
>
> This code is not specific to papr device and we can move it to
> generic nvdimm interface. But do we need to add some checks on whether
> any arch/platform specific driver want that support or we can assume
> that this will be something needed by all platforms?

I'm a complete NVDIMM n00b, but to me it would appear they would have to
conform to the normal memory hierarchy and would thus always be
per-node.

Also, if/when deviation from this rule is observed, we can always
rework/extend this. For now I think it would make sense to have the
per-node ness of the thing expressed in the generic layer.
On 5/26/21 12:56 PM, kajoljain wrote:
>
> On 5/25/21 7:46 PM, Peter Zijlstra wrote:
>> On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
>>> Patch here adds cpu hotplug functions to nvdimm pmu.
>>
>> I'm thinking "Patch here" qualifies for "This patch", see
>> Documentation/process/submitting-patches.rst .
>>
> Hi Peter,
>     I will reword this commit message.
>
>>> It adds cpumask to designate a cpu to make HCALL to
>>> collect the counter data for the nvdimm device and
>>> update ABI documentation accordingly.
>>>
>>> Result in power9 lpar system:
>>> command:# cat /sys/devices/nmem0/cpumask
>>> 0
>>
>> Is this specific to the papr thing, or should this be in generic nvdimm
>> code?
>
> This code is not specific to papr device and we can move it to
> generic nvdimm interface. But do we need to add some checks on whether
> any arch/platform specific driver want that support or we can assume
> that this will be something needed by all platforms?
>

It says the cpu that should be used to make the hcall. That makes it
PAPR specific.

-aneesh
On 5/26/21 2:15 PM, Aneesh Kumar K.V wrote:
> On 5/26/21 12:56 PM, kajoljain wrote:
>>
>> On 5/25/21 7:46 PM, Peter Zijlstra wrote:
>>> On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
>>>> Patch here adds cpu hotplug functions to nvdimm pmu.
>>>
>>> I'm thinking "Patch here" qualifies for "This patch", see
>>> Documentation/process/submitting-patches.rst .
>>>
>> Hi Peter,
>>     I will reword this commit message.
>>
>>>> It adds cpumask to designate a cpu to make HCALL to
>>>> collect the counter data for the nvdimm device and
>>>> update ABI documentation accordingly.
>>>>
>>>> Result in power9 lpar system:
>>>> command:# cat /sys/devices/nmem0/cpumask
>>>> 0
>>>
>>> Is this specific to the papr thing, or should this be in generic nvdimm
>>> code?
>>
>> This code is not specific to papr device and we can move it to
>> generic nvdimm interface. But do we need to add some checks on whether
>> any arch/platform specific driver want that support or we can assume
>> that this will be something needed by all platforms?
>>
>
> It says the cpu that should be used to make the hcall. That makes it
> PAPR specific.

Hi Aneesh,
    The hcall in the commit message only refers to the method we use to
get the counter data. Adding a cpumask to a PMU is not specific to
powerpc, so in case another platform/arch wants to enable the hotplug
feature, it can reuse the same code, and hence we can move it to the
generic nvdimm interface.

Our concern is mainly whether it is right to assume, from the common
code's point of view, that if the cpumask attr is NULL the common code
can add cpumask support itself, or whether we need an explicit flag for
the device to request it.

Thanks,
Kajol Jain

>
> -aneesh
>
On 5/26/21 2:02 PM, Peter Zijlstra wrote:
> On Wed, May 26, 2021 at 12:56:58PM +0530, kajoljain wrote:
>> On 5/25/21 7:46 PM, Peter Zijlstra wrote:
>>> On Tue, May 25, 2021 at 06:52:16PM +0530, Kajol Jain wrote:
>
>>>> It adds cpumask to designate a cpu to make HCALL to
>>>> collect the counter data for the nvdimm device and
>>>> update ABI documentation accordingly.
>>>>
>>>> Result in power9 lpar system:
>>>> command:# cat /sys/devices/nmem0/cpumask
>>>> 0
>>>
>>> Is this specific to the papr thing, or should this be in generic nvdimm
>>> code?
>>
>> This code is not specific to papr device and we can move it to
>> generic nvdimm interface. But do we need to add some checks on whether
>> any arch/platform specific driver want that support or we can assume
>> that this will be something needed by all platforms?
>
> I'm a complete NVDIMM n00b, but to me it would appear they would have to
> conform to the normal memory hierarchy and would thus always be
> per-node.
>
> Also, if/when deviation from this rule is observed, we can always
> rework/extend this. For now I think it would make sense to have the
> per-node ness of the thing expressed in the generic layer.
>

Hi Peter,
    Thanks for the suggestion, I will send new RFC patchset with these
changes.

Thanks,
Kajol Jain
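Peter's per-node suggestion implies that, on offline, a generic layer would prefer another active CPU on the same NUMA node before falling back to any active CPU. A hypothetical standalone sketch of that selection (arrays stand in for the kernel's cpumask and topology helpers; none of these names are from the patch):

```c
/*
 * Hypothetical node-aware target selection for a generic nvdimm layer.
 * active[] models the active-CPU mask; cpu_to_node[] maps CPU -> NUMA node.
 */
#include <stdbool.h>

#define NR_CPUS 8

static int pick_target(int dying, const bool *active, const int *cpu_to_node)
{
	int fallback = -1;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!active[cpu] || cpu == dying)
			continue;
		if (cpu_to_node[cpu] == cpu_to_node[dying])
			return cpu;     /* same-node CPU wins */
		if (fallback < 0)
			fallback = cpu; /* remember any active CPU */
	}
	return fallback;                /* -1 if nothing is active */
}
```

This keeps counter reads local to the nvdimm's node whenever possible, which matches the "per-node ness" Peter wants expressed in the generic code.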
diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem b/Documentation/ABI/testing/sysfs-bus-papr-pmem
index 38c4daf65af2..986df1691914 100644
--- a/Documentation/ABI/testing/sysfs-bus-papr-pmem
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -76,6 +76,12 @@ Description:	(RO) Attribute group to describe the magic bits
 		For example::
 		noopstat = "event=0x1"
 
+What:		/sys/devices/nmemX/cpumask
+Date:		May 2021
+Contact:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
+Description:	(RO) This sysfs file exposes the cpumask which is designated to make
+		HCALLs to retrieve nvdimm pmu event counter data.
+
 What:		/sys/devices/nmemX/events
 Date:		May 2021
 Contact:	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, linux-nvdimm@lists.01.org,
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f2d57da98ff4..76121d876b7f 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -339,6 +339,28 @@ static ssize_t drc_pmem_query_stats(struct papr_scm_priv *p,
 	return 0;
 }
 
+static ssize_t cpumask_show(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct nvdimm_pmu *nd_pmu;
+
+	nd_pmu = container_of(pmu, struct nvdimm_pmu, pmu);
+
+	return cpumap_print_to_pagebuf(true, buf, cpumask_of(nd_pmu->cpu));
+}
+
+static DEVICE_ATTR_RO(cpumask);
+
+static struct attribute *nvdimm_cpumask_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static const struct attribute_group nvdimm_pmu_cpumask_group = {
+	.attrs = nvdimm_cpumask_attrs,
+};
+
 PMU_FORMAT_ATTR(event, "config:0-4");
 
 static struct attribute *nvdimm_pmu_format_attr[] = {
@@ -459,6 +481,24 @@ static void papr_scm_pmu_del(struct perf_event *event, int flags)
 	papr_scm_pmu_read(event);
 }
 
+static int nvdimm_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
+{
+	struct nvdimm_pmu *nd_pmu;
+	int target;
+
+	nd_pmu = hlist_entry_safe(node, struct nvdimm_pmu, node);
+
+	if (cpu != nd_pmu->cpu)
+		return 0;
+
+	target = cpumask_last(cpu_active_mask);
+	if (target < 0 || target >= nr_cpu_ids)
+		return -1;
+
+	nd_pmu->cpu = target;
+	return 0;
+}
+
 static ssize_t device_show_string(struct device *dev,
 				  struct device_attribute *attr, char *buf)
 {
@@ -603,6 +643,7 @@ static int papr_scm_pmu_check_events(struct papr_scm_priv *p, struct nvdimm_pmu
 	/* Fill attribute groups for the nvdimm pmu device */
 	nd_pmu->attr_groups[NVDIMM_PMU_FORMAT_ATTR] = &nvdimm_pmu_format_group;
 	nd_pmu->attr_groups[NVDIMM_PMU_EVENT_ATTR] = nvdimm_pmu_events_group;
+	nd_pmu->attr_groups[NVDIMM_PMU_CPUMASK_ATTR] = &nvdimm_pmu_cpumask_group;
 	nd_pmu->attr_groups[NVDIMM_PMU_NULL_ATTR] = NULL;
 
 	kfree(single_stats);
@@ -652,6 +693,20 @@ static void papr_scm_pmu_register(struct papr_scm_priv *p)
 	nd_pmu->read = papr_scm_pmu_read;
 	nd_pmu->add = papr_scm_pmu_add;
 	nd_pmu->del = papr_scm_pmu_del;
+	nd_pmu->cpu = raw_smp_processor_id();
+
+	rc = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+				     "perf/nvdimm:online",
+				     NULL, nvdimm_pmu_offline_cpu);
+	if (rc < 0)
+		goto pmu_cpuhp_setup_err;
+
+	nd_pmu->cpuhp_state = rc;
+
+	/* Register the pmu instance for cpu hotplug */
+	rc = cpuhp_state_add_instance_nocalls(nd_pmu->cpuhp_state, &nd_pmu->node);
+	if (rc)
+		goto cpuhp_instance_err;
 
 	rc = register_nvdimm_pmu(nd_pmu, p->pdev);
 	if (rc)
@@ -665,6 +720,10 @@ static void papr_scm_pmu_register(struct papr_scm_priv *p)
 	return;
 
 pmu_register_err:
+	cpuhp_state_remove_instance_nocalls(nd_pmu->cpuhp_state, &nd_pmu->node);
+cpuhp_instance_err:
+	cpuhp_remove_multi_state(nd_pmu->cpuhp_state);
+pmu_cpuhp_setup_err:
 	nvdimm_pmu_mem_free(nd_pmu);
 	kfree(p->nvdimm_events_map);
 pmu_check_events_err:
@@ -675,6 +734,8 @@ static void papr_scm_pmu_register(struct papr_scm_priv *p)
 
 static void nvdimm_pmu_uinit(struct nvdimm_pmu *nd_pmu)
 {
+	cpuhp_state_remove_instance_nocalls(nd_pmu->cpuhp_state, &nd_pmu->node);
+	cpuhp_remove_multi_state(nd_pmu->cpuhp_state);
 	unregister_nvdimm_pmu(&nd_pmu->pmu);
 	nvdimm_pmu_mem_free(nd_pmu);
 	kfree(nd_pmu);
diff --git a/include/linux/nd.h b/include/linux/nd.h
index a0e0619256be..177795413ab3 100644
--- a/include/linux/nd.h
+++ b/include/linux/nd.h
@@ -28,7 +28,8 @@ enum nvdimm_claim_class {
 /* Event attribute array index */
 #define NVDIMM_PMU_FORMAT_ATTR	0
 #define NVDIMM_PMU_EVENT_ATTR	1
-#define NVDIMM_PMU_NULL_ATTR	2
+#define NVDIMM_PMU_CPUMASK_ATTR	2
+#define NVDIMM_PMU_NULL_ATTR	3
 
 /**
  * struct nvdimm_pmu - data structure for nvdimm perf driver
@@ -37,7 +38,10 @@ enum nvdimm_claim_class {
  * @pmu: pmu data structure for nvdimm performance stats.
  * @dev: nvdimm device pointer.
  * @functions(event_init/add/del/read): platform specific pmu functions.
- * @attr_groups: data structure for events and formats.
+ * @attr_groups: data structure for events, formats and cpumask
+ * @cpu: designated cpu for counter access.
+ * @node: node for cpu hotplug notifier link.
+ * @cpuhp_state: state for cpu hotplug notification.
  */
 struct nvdimm_pmu {
 	const char *name;
@@ -49,10 +53,13 @@ struct nvdimm_pmu {
 	void (*read)(struct perf_event *event);
 	/*
 	 * Attribute groups for the nvdimm pmu. Index 0 used for
-	 * format attribute, index 1 used for event attribute and
-	 * index 2 kept as NULL.
+	 * format attribute, index 1 used for event attribute,
+	 * index 2 used for cpusmask attribute and index 3 kept as NULL.
 	 */
-	const struct attribute_group *attr_groups[3];
+	const struct attribute_group *attr_groups[4];
+	int cpu;
+	struct hlist_node node;
+	enum cpuhp_state cpuhp_state;
 };
 
 int register_nvdimm_pmu(struct nvdimm_pmu *nvdimm, struct platform_device *pdev);
Patch here adds cpu hotplug functions to nvdimm pmu. It adds
cpumask to designate a cpu to make HCALL to collect the counter
data for the nvdimm device and update ABI documentation
accordingly.

Result in power9 lpar system:
command:# cat /sys/devices/nmem0/cpumask
0

Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  6 ++
 arch/powerpc/platforms/pseries/papr_scm.c     | 61 +++++++++++++++++++
 include/linux/nd.h                            | 17 ++++-
 3 files changed, 79 insertions(+), 5 deletions(-)
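Since the patch prints the cpumask with `cpumap_print_to_pagebuf(true, ...)`, userspace sees the kernel's cpulist format, e.g. `0`, `0-3`, or `0-3,8`. A hedged sketch of parsing that format in userspace (the helper name and fixed-size output are illustrative, not any kernel or libc API):

```c
/*
 * Standalone sketch: parse the cpulist format ("0", "0-3", "0-3,8")
 * that the sysfs cpumask file emits.
 */
#include <stdio.h>
#include <string.h>

/* Fill cpus[] with the CPU ids listed in `list`; return the count, -1 on error. */
static int parse_cpulist(const char *list, int *cpus, int max)
{
	int n = 0, lo, hi;
	const char *p = list;

	while (*p) {
		if (sscanf(p, "%d-%d", &lo, &hi) == 2)
			; /* a range like "0-3" */
		else if (sscanf(p, "%d", &lo) == 1)
			hi = lo; /* a single CPU like "8" */
		else
			return -1;

		for (int cpu = lo; cpu <= hi && n < max; cpu++)
			cpus[n++] = cpu;

		p = strchr(p, ',');
		if (!p)
			break;
		p++; /* skip the comma */
	}
	return n;
}
```

For the power9 lpar result shown above, the file contains just `0`, so a tool using such a parser would bind its counter reads to CPU 0.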