[{"id":1768475,"web_url":"http://patchwork.ozlabs.org/comment/1768475/","msgid":"<87fubpaa1o.fsf@linaro.org>","list_archive_url":null,"date":"2017-09-14T09:45:07","subject":"Re: [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable\n\tvector lengths","submitter":{"id":39532,"url":"http://patchwork.ozlabs.org/api/people/39532/","name":"Alex Bennée","email":"alex.bennee@linaro.org"},"content":"Dave Martin <Dave.Martin@arm.com> writes:\n\n> This patch uses the cpufeatures framework to determine common SVE\n> capabilities and vector lengths, and configures the runtime SVE\n> support code appropriately.\n>\n> ZCR_ELx is not really a feature register, but it is convenient to\n> use it as a template for recording the maximum vector length\n> supported by a CPU, using the LEN field.  This field is similar to\n> a feature field in that it is a contiguous bitfield for which we\n> want to determine the minimum system-wide value.  This patch adds\n> ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate\n> custom code to populate it.  Finding the minimum supported value of\n> the LEN field is left to the cpufeatures framework in the usual\n> way.\n>\n> The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,\n> so for now we just require it to be zero.\n>\n> Note that much of this code is dormant and SVE still won't be used\n> yet, since system_supports_sve() remains hardwired to false.\n>\n> Signed-off-by: Dave Martin <Dave.Martin@arm.com>\n> Cc: Alex Bennée <alex.bennee@linaro.org>\n> Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>\n>\n> ---\n>\n> Changes since v1\n> ----------------\n>\n> Requested by Alex Bennée:\n>\n> * Thin out BUG_ON()s:\n> Redundant BUG_ON()s and ones that just check invariants are removed.\n> Important sanity-checks are migrated to WARN_ON()s, with some\n> minimal best-effort patch-up code.\n>\n> Other changes related to Alex Bennée's comments:\n>\n> * Migrate away from magic numbers for converting VL to VQ.\n>\n> Requested by Suzuki Poulose:\n>\n> * Make sve_vq_map __ro_after_init.\n>\n> Other changes related to Suzuki Poulose's comments:\n>\n> * Rely on cpufeatures for not attempting to update the vq map after boot.\n> ---\n>  arch/arm64/include/asm/cpu.h        |   4 ++\n>  arch/arm64/include/asm/cpufeature.h |  29 ++++++++++\n>  arch/arm64/include/asm/fpsimd.h     |  10 ++++\n>  arch/arm64/kernel/cpufeature.c      |  50 +++++++++++++++++\n>  arch/arm64/kernel/cpuinfo.c         |   6 ++\n>  arch/arm64/kernel/fpsimd.c          | 106 +++++++++++++++++++++++++++++++++++-\n>  6 files changed, 202 insertions(+), 3 deletions(-)\n>\n> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h\n> index 889226b..8839227 100644\n> --- a/arch/arm64/include/asm/cpu.h\n> +++ b/arch/arm64/include/asm/cpu.h\n> @@ -41,6 +41,7 @@ struct cpuinfo_arm64 {\n>  \tu64\t\treg_id_aa64mmfr2;\n>  \tu64\t\treg_id_aa64pfr0;\n>  \tu64\t\treg_id_aa64pfr1;\n> +\tu64\t\treg_id_aa64zfr0;\n>\n>  \tu32\t\treg_id_dfr0;\n>  \tu32\t\treg_id_isar0;\n> @@ -59,6 +60,9 @@ struct cpuinfo_arm64 {\n>  \tu32\t\treg_mvfr0;\n>  \tu32\t\treg_mvfr1;\n>  \tu32\t\treg_mvfr2;\n> +\n> +\t/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */\n> +\tu64\t\treg_zcr;\n>  };\n>\n>  DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);\n> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h\n> index 4ea3441..d98e7ba 100644\n> --- a/arch/arm64/include/asm/cpufeature.h\n> +++ b/arch/arm64/include/asm/cpufeature.h\n> @@ -10,7 +10,9 @@\n>  #define __ASM_CPUFEATURE_H\n>\n>  #include <asm/cpucaps.h>\n> +#include <asm/fpsimd.h>\n>  #include <asm/hwcap.h>\n> +#include <asm/sigcontext.h>\n>  #include <asm/sysreg.h>\n>\n>  /*\n> @@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)\n>  \treturn val == ID_AA64PFR0_EL0_32BIT_64BIT;\n>  }\n>\n> +static inline bool id_aa64pfr0_sve(u64 pfr0)\n> +{\n> +\tu32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);\n> +\n> +\treturn val > 0;\n> +}\n> +\n>  void __init setup_cpu_features(void);\n>\n>  void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,\n> @@ -267,6 +276,26 @@ static inline bool system_supports_sve(void)\n>  \treturn false;\n>  }\n>\n> +/*\n> + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE\n> + * vector length.\n> + * Use only if SVE is present.  This function clobbers the SVE vector length.\n> + */\n\n:nit whitespace formatting.\n\n> +static u64 __maybe_unused read_zcr_features(void)\n> +{\n> +\tu64 zcr;\n> +\tunsigned int vq_max;\n> +\n> +\twrite_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);\n\nI'm confused, why are we writing something here? You mention clobbering\nthe SVE vector length but what was the point?\n\n> +\n> +\tzcr = read_sysreg_s(SYS_ZCR_EL1);\n> +\tzcr &= ~(u64)ZCR_ELx_LEN_MASK;\n> +\tvq_max = sve_vq_from_vl(sve_get_vl());\n> +\tzcr |= vq_max - 1;\n> +\n> +\treturn zcr;\n> +}\n> +\n>  #endif /* __ASSEMBLY__ */\n>\n>  #endif\n> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h\n> index 32c8e19..6c22624 100644\n> --- a/arch/arm64/include/asm/fpsimd.h\n> +++ b/arch/arm64/include/asm/fpsimd.h\n> @@ -92,12 +92,22 @@ extern void fpsimd_dup_sve(struct task_struct *dst,\n>  extern int sve_set_vector_length(struct task_struct *task,\n>  \t\t\t\t unsigned long vl, unsigned long flags);\n>\n> +extern void __init sve_init_vq_map(void);\n> +extern void sve_update_vq_map(void);\n> +extern int sve_verify_vq_map(void);\n> +extern void __init sve_setup(void);\n> +\n>  #else /* ! CONFIG_ARM64_SVE */\n>\n>  static void __maybe_unused sve_alloc(struct task_struct *task) { }\n>  static void __maybe_unused fpsimd_release_thread(struct task_struct *task) { }\n>  static void __maybe_unused fpsimd_dup_sve(struct task_struct *dst,\n>  \t\t\t\t\t  struct task_struct const *src) { }\n> +static void __maybe_unused sve_init_vq_map(void) { }\n> +static void __maybe_unused sve_update_vq_map(void) { }\n> +static int __maybe_unused sve_verify_vq_map(void) { return 0; }\n> +static void __maybe_unused sve_setup(void) { }\n> +\n>  #endif /* ! CONFIG_ARM64_SVE */\n>\n>  /* For use by EFI runtime services calls only */\n> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c\n> index 43ba8df..c30bb6b 100644\n> --- a/arch/arm64/kernel/cpufeature.c\n> +++ b/arch/arm64/kernel/cpufeature.c\n> @@ -27,6 +27,7 @@\n>  #include <asm/cpu.h>\n>  #include <asm/cpufeature.h>\n>  #include <asm/cpu_ops.h>\n> +#include <asm/fpsimd.h>\n>  #include <asm/mmu_context.h>\n>  #include <asm/processor.h>\n>  #include <asm/sysreg.h>\n> @@ -283,6 +284,12 @@ static const struct arm64_ftr_bits ftr_id_dfr0[] = {\n>  \tARM64_FTR_END,\n>  };\n>\n> +static const struct arm64_ftr_bits ftr_zcr[] = {\n> +\tARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,\n> +\t\tZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),\t/* LEN */\n> +\tARM64_FTR_END,\n> +};\n> +\n>  /*\n>   * Common ftr bits for a 32bit register with all hidden, strict\n>   * attributes, with 4bit feature fields and a default safe value of\n> @@ -349,6 +356,7 @@ static const struct __ftr_reg_entry {\n>  \t/* Op1 = 0, CRn = 0, CRm = 4 */\n>  \tARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0),\n>  \tARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_raz),\n> +\tARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_raz),\n>\n>  \t/* Op1 = 0, CRn = 0, CRm = 5 */\n>  \tARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),\n> @@ -363,6 +371,9 @@ static const struct __ftr_reg_entry {\n>  \tARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1),\n>  \tARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2),\n>\n> +\t/* Op1 = 0, CRn = 1, CRm = 2 */\n> +\tARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr),\n> +\n>  \t/* Op1 = 3, CRn = 0, CRm = 0 */\n>  \t{ SYS_CTR_EL0, &arm64_ftr_reg_ctrel0 },\n>  \tARM64_FTR_REG(SYS_DCZID_EL0, ftr_dczid),\n> @@ -500,6 +511,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)\n>  \tinit_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);\n>  \tinit_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);\n>  \tinit_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);\n> +\tinit_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);\n>\n>  \tif (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {\n>  \t\tinit_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);\n> @@ -520,6 +532,10 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)\n>  \t\tinit_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);\n>  \t}\n>\n> +\tif (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {\n> +\t\tinit_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr);\n> +\t\tsve_init_vq_map();\n> +\t}\n>  }\n>\n>  static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)\n> @@ -623,6 +639,9 @@ void update_cpu_features(int cpu,\n>  \ttaint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu,\n>  \t\t\t\t      info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1);\n>\n> +\ttaint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,\n> +\t\t\t\t      info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);\n> +\n>  \t/*\n>  \t * If we have AArch32, we care about 32-bit features for compat.\n>  \t * If the system doesn't support AArch32, don't update them.\n> @@ -670,6 +689,14 @@ void update_cpu_features(int cpu,\n>  \t\t\t\t\tinfo->reg_mvfr2, boot->reg_mvfr2);\n>  \t}\n>\n> +\tif (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {\n> +\t\ttaint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,\n> +\t\t\t\t\tinfo->reg_zcr, boot->reg_zcr);\n> +\n> +\t\tif (!sys_caps_initialised)\n> +\t\t\tsve_update_vq_map();\n> +\t}\n> +\n>  \t/*\n>  \t * Mismatched CPU features are a recipe for disaster. Don't even\n>  \t * pretend to support them.\n> @@ -1097,6 +1124,23 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)\n>  \t}\n>  }\n>\n> +static void verify_sve_features(void)\n> +{\n> +\tu64 safe_zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);\n> +\tu64 zcr = read_zcr_features();\n> +\n> +\tunsigned int safe_len = safe_zcr & ZCR_ELx_LEN_MASK;\n> +\tunsigned int len = zcr & ZCR_ELx_LEN_MASK;\n> +\n> +\tif (len < safe_len || sve_verify_vq_map()) {\n> +\t\tpr_crit(\"CPU%d: SVE: required vector length(s) missing\\n\",\n> +\t\t\tsmp_processor_id());\n> +\t\tcpu_die_early();\n> +\t}\n> +\n> +\t/* Add checks on other ZCR bits here if necessary */\n> +}\n> +\n>  /*\n>   * Run through the enabled system capabilities and enable() it on this CPU.\n>   * The capabilities were decided based on the available CPUs at the boot time.\n> @@ -1110,8 +1154,12 @@ static void verify_local_cpu_capabilities(void)\n>  \tverify_local_cpu_errata_workarounds();\n>  \tverify_local_cpu_features(arm64_features);\n>  \tverify_local_elf_hwcaps(arm64_elf_hwcaps);\n> +\n>  \tif (system_supports_32bit_el0())\n>  \t\tverify_local_elf_hwcaps(compat_elf_hwcaps);\n> +\n> +\tif (system_supports_sve())\n> +\t\tverify_sve_features();\n>  }\n>\n>  void check_local_cpu_capabilities(void)\n> @@ -1189,6 +1237,8 @@ void __init setup_cpu_features(void)\n>  \tif (system_supports_32bit_el0())\n>  \t\tsetup_elf_hwcaps(compat_elf_hwcaps);\n>\n> +\tsve_setup();\n> +\n>  \t/* Advertise that we have computed the system capabilities */\n>  \tset_sys_caps_initialised();\n>\n> diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c\n> index 3118859..be260e8 100644\n> --- a/arch/arm64/kernel/cpuinfo.c\n> +++ b/arch/arm64/kernel/cpuinfo.c\n> @@ -19,6 +19,7 @@\n>  #include <asm/cpu.h>\n>  #include <asm/cputype.h>\n>  #include <asm/cpufeature.h>\n> +#include <asm/fpsimd.h>\n>\n>  #include <linux/bitops.h>\n>  #include <linux/bug.h>\n> @@ -326,6 +327,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)\n>  \tinfo->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);\n>  \tinfo->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);\n>  \tinfo->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);\n> +\tinfo->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);\n>\n>  \t/* Update the 32bit ID registers only if AArch32 is implemented */\n>  \tif (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {\n> @@ -348,6 +350,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)\n>  \t\tinfo->reg_mvfr2 = read_cpuid(MVFR2_EL1);\n>  \t}\n>\n> +\tif (IS_ENABLED(CONFIG_ARM64_SVE) &&\n> +\t    id_aa64pfr0_sve(info->reg_id_aa64pfr0))\n> +\t\tinfo->reg_zcr = read_zcr_features();\n> +\n>  \tcpuinfo_detect_icache_policy(info);\n>  }\n>\n> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c\n> index 713476e..cea05a7 100644\n> --- a/arch/arm64/kernel/fpsimd.c\n> +++ b/arch/arm64/kernel/fpsimd.c\n> @@ -110,19 +110,19 @@\n>  static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state);\n>\n>  /* Default VL for tasks that don't set it explicitly: */\n> -static int sve_default_vl = SVE_VL_MIN;\n> +static int sve_default_vl = -1;\n>\n>  #ifdef CONFIG_ARM64_SVE\n>\n>  /* Maximum supported vector length across all CPUs (initially poisoned) */\n>  int __ro_after_init sve_max_vl = -1;\n>  /* Set of available vector lengths, as vq_to_bit(vq): */\n> -static DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);\n> +static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);\n>\n>  #else /* ! CONFIG_ARM64_SVE */\n>\n>  /* Dummy declaration for code that will be optimised out: */\n> -extern DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);\n> +extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);\n>\n>  #endif /* ! CONFIG_ARM64_SVE */\n>\n> @@ -387,6 +387,103 @@ int sve_set_vector_length(struct task_struct *task,\n>  \treturn 0;\n>  }\n>\n> +static unsigned long *sve_alloc_vq_map(void)\n> +{\n> +\treturn kzalloc(BITS_TO_LONGS(SVE_VQ_MAX) * sizeof(unsigned long),\n> +\t\t       GFP_KERNEL);\n> +}\n> +\n> +static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))\n> +{\n> +\tunsigned int vq, vl;\n> +\tunsigned long zcr;\n> +\n> +\tzcr = ZCR_ELx_LEN_MASK;\n> +\tzcr = read_sysreg_s(SYS_ZCR_EL1) & ~zcr;\n> +\n> +\tfor (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) {\n> +\t\twrite_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */\n> +\t\tvl = sve_get_vl();\n> +\t\tvq = sve_vq_from_vl(vl); /* skip intervening lengths */\n> +\t\tset_bit(vq_to_bit(vq), map);\n> +\t}\n> +}\n> +\n> +void __init sve_init_vq_map(void)\n> +{\n> +\tsve_probe_vqs(sve_vq_map);\n> +}\n> +\n> +/*\n> + * If we haven't committed to the set of supported VQs yet, filter out\n> + * those not supported by the current CPU.\n> + */\n> +void sve_update_vq_map(void)\n> +{\n> +\tunsigned long *map;\n> +\n> +\tmap = sve_alloc_vq_map();\n> +\tsve_probe_vqs(map);\n> +\tbitmap_and(sve_vq_map, sve_vq_map, map, SVE_VQ_MAX);\n> +\tkfree(map);\n> +}\n> +\n> +/* Check whether the current CPU supports all VQs in the committed set */\n> +int sve_verify_vq_map(void)\n> +{\n> +\tint ret = 0;\n> +\tunsigned long *map = sve_alloc_vq_map();\n> +\n> +\tsve_probe_vqs(map);\n> +\tbitmap_andnot(map, sve_vq_map, map, SVE_VQ_MAX);\n> +\tif (!bitmap_empty(map, SVE_VQ_MAX)) {\n> +\t\tpr_warn(\"SVE: cpu%d: Required vector length(s) missing\\n\",\n> +\t\t\tsmp_processor_id());\n> +\t\tret = -EINVAL;\n> +\t}\n> +\n> +\tkfree(map);\n> +\n> +\treturn ret;\n> +}\n> +\n> +void __init sve_setup(void)\n> +{\n> +\tu64 zcr;\n> +\n> +\tif (!system_supports_sve())\n> +\t\treturn;\n> +\n> +\t/*\n> +\t * The SVE architecture mandates support for 128-bit vectors,\n> +\t * so sve_vq_map must have at least SVE_VQ_MIN set.\n> +\t * If something went wrong, at least try to patch it up:\n> +\t */\n> +\tif (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map)))\n> +\t\tset_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map);\n> +\n> +\tzcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);\n> +\tsve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1);\n> +\n> +\t/*\n> +\t * Sanity-check that the max VL we determined through CPU features\n> +\t * corresponds properly to sve_vq_map.  If not, do our best:\n> +\t */\n> +\tif (WARN_ON(sve_max_vl != find_supported_vector_length(sve_max_vl)))\n> +\t\tsve_max_vl = find_supported_vector_length(sve_max_vl);\n> +\n> +\t/*\n> +\t * For the default VL, pick the maximum supported value <= 64.\n> +\t * VL == 64 is guaranteed not to grow the signal frame.\n> +\t */\n> +\tsve_default_vl = find_supported_vector_length(64);\n> +\n> +\tpr_info(\"SVE: maximum available vector length %u bytes per vector\\n\",\n> +\t\tsve_max_vl);\n> +\tpr_info(\"SVE: default vector length %u bytes per vector\\n\",\n> +\t\tsve_default_vl);\n> +}\n> +\n>  void fpsimd_release_thread(struct task_struct *dead_task)\n>  {\n>  \tsve_free(dead_task);\n> @@ -502,6 +599,9 @@ void fpsimd_flush_thread(void)\n>  \t\t * This is where we ensure that all user tasks have a valid\n>  \t\t * vector length configured: no kernel task can become a user\n>  \t\t * task without an exec and hence a call to this function.\n> +\t\t * By the time the first call to this function is made, all\n> +\t\t * early hardware probing is complete, so sve_default_vl\n> +\t\t * should be valid.\n>  \t\t * If a bug causes this to go wrong, we make some noise and\n>  \t\t * try to fudge thread.sve_vl to a safe value here.\n>  \t\t */\n\n\nOtherwise:\n\nReviewed-by: Alex Bennée <alex.bennee@linaro.org>\n\n--\nAlex Bennée","headers":{"Return-Path":"<libc-alpha-return-84602-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84602-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"jQMevPnF\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtDDM0plCz9s7g\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 14 Sep 2017 19:45:38 +1000 (AEST)","(qmail 117645 invoked by alias); 14 Sep 2017 09:45:29 -0000","(qmail 117550 invoked by uid 89); 14 Sep 2017 09:45:18 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:references:from:to:cc:subject:in-reply-to:date\n\t:message-id:mime-version:content-type:content-transfer-encoding;\n\tq=dns; s=default; b=kOaXXX6pgIUE2H4xw70KzUn7p/gakX+sjejrE3+viC8\n\tnyIM7S5YzPCbc3Sh+658QLiWY4xhRhPZA0tSkiv2SNYrYMpy4S73U3sQi8ZoPeCc\n\t/60yVZRc09xKVfJxUStIHixDN5wCOyEx3QZKCJdBOLtjj3t3kgldeHhJizpPfjjw\n\t=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:references:from:to:cc:subject:in-reply-to:date\n\t:message-id:mime-version:content-type:content-transfer-encoding;\n\ts=default; bh=Lt/v8FkiV506EbcQGm3TUcL9HWA=; b=jQMevPnFG19jv89Ks\n\tRHUag4qF3LgAYQY9xqW5Wga+L5xJ0I4o0OOqeo6v9pRIdXde3ZAYeASnHz5wwUK4\n\tjsQdnpho64MNsJu65pvnpvg2H0JTgG9zADXAXARkoedm1IU95U+Zg0eYt/U4/Bx8\n\t/mDwbTkXK0ncHXKtmZGklhAVKY=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-26.4 required=5.0 tests=AWL, BAYES_00,\n\tGIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,\n\tRCVD_IN_DNSWL_NONE,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=advertise, dormant","X-HELO":"mail-wm0-f43.google.com","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:references:user-agent:from:to:cc:subject\n\t:in-reply-to:date:message-id:mime-version:content-transfer-encoding; \n\tbh=oGwGtoQPXHYfMHB03E4pbNUC4oIoQ3RI3WD0JKy6Sik=;\n\tb=YmCa+mJ50QxWqZYFc6VrBx7wBE4h3qbWfkzC41hCfCs5zAnTVXqww5kcVf122k6gvB\n\tSCZb3ZX/8WRAfuD1sN1RrQ2UNuHBxoU8qPB0HVZvNdo9xz3OPzXysoSkLkpllzDkcZ0p\n\tqgDqgV0TO2nuYhrG77FzDf43undpssRULfoJUC1GgkiMnvOu+86BKgSDLQuuiwUaZFh7\n\tZgXWcTd6+ehie0FC/Tyl7rEM4WuUUVejnfygzErDDA/dID+RgRLxuMnMekN5Y0JqOojG\n\tiQfRO4RlUp3hHZG9OcE5yKC5Cbj/LqdxfUF86yR+nHlfX6MF7iMHSZbEatdpXRUPyXjp\n\tEPvw==","X-Gm-Message-State":"AHPjjUjVnv6gtf0VsuSQ8P7zR23tGPb4llbL8uVoSX/JWYdhmUWtD9KP\n\t9UrNm5lynBfH/+he","X-Google-Smtp-Source":"AOwi7QCGXAVf85PlEdqpTVfWMbSiNSH2dnqMDGEx8NWl2CeV6taqYGkNVM6gpNANt1dSzycq24CJjQ==","X-Received":"by 10.28.208.72 with SMTP id h69mr1473326wmg.134.1505382308727; \n\tThu, 14 Sep 2017 02:45:08 -0700 (PDT)","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-17-git-send-email-Dave.Martin@arm.com>","User-agent":"mu4e 0.9.19; emacs 25.2.50.3","From":"Alex =?utf-8?q?Benn=C3=A9e?= <alex.bennee@linaro.org>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arm-kernel@lists.infradead.org,\n\tCatalin Marinas <catalin.marinas@arm.com>,\n\tWill Deacon <will.deacon@arm.com>,\n\tArd Biesheuvel <ard.biesheuvel@linaro.org>,\n\tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>,\n\tkvmarm@lists.cs.columbia.edu, libc-alpha@sourceware.org,\n\tlinux-arch@vger.kernel.org, Suzuki K Poulose <Suzuki.Poulose@arm.com>","Subject":"Re: [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable\n\tvector lengths","In-reply-to":"<1504198860-12951-17-git-send-email-Dave.Martin@arm.com>","Date":"Thu, 14 Sep 2017 10:45:07 +0100","Message-ID":"<87fubpaa1o.fsf@linaro.org>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Transfer-Encoding":"8bit"}},{"id":1777058,"web_url":"http://patchwork.ozlabs.org/comment/1777058/","msgid":"<20170928142212.GB3611@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-09-28T14:22:12","subject":"Re: [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable\n\tvector lengths","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Thu, Sep 14, 2017 at 10:45:07AM +0100, Alex Bennée wrote:\n> \n> Dave Martin <Dave.Martin@arm.com> writes:\n> \n> > This patch uses the cpufeatures framework to determine common SVE\n> > capabilities and vector lengths, and configures the runtime SVE\n> > support code appropriately.\n> >\n> > ZCR_ELx is not really a feature register, but it is convenient to\n> > use it as a template for recording the maximum vector length\n> > supported by a CPU, using the LEN field.  This field is similar to\n> > a feature field in that it is a contiguous bitfield for which we\n> > want to determine the minimum system-wide value.  This patch adds\n> > ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate\n> > custom code to populate it.  Finding the minimum supported value of\n> > the LEN field is left to the cpufeatures framework in the usual\n> > way.\n> >\n> > The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,\n> > so for now we just require it to be zero.\n> >\n> > Note that much of this code is dormant and SVE still won't be used\n> > yet, since system_supports_sve() remains hardwired to false.\n> >\n> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>\n> > Cc: Alex Bennée <alex.bennee@linaro.org>\n> > Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>\n> >\n> > ---\n> >\n> > Changes since v1\n> > ----------------\n> >\n> > Requested by Alex Bennée:\n> >\n> > * Thin out BUG_ON()s:\n> > Redundant BUG_ON()s and ones that just check invariants are removed.\n> > Important sanity-checks are migrated to WARN_ON()s, with some\n> > minimal best-effort patch-up code.\n> >\n> > Other changes related to Alex Bennée's comments:\n> >\n> > * Migrate away from magic numbers for converting VL to VQ.\n> >\n> > Requested by Suzuki Poulose:\n> >\n> > * Make sve_vq_map __ro_after_init.\n> >\n> > Other changes related to Suzuki Poulose's comments:\n> >\n> > * Rely on cpufeatures for not attempting to update the vq map after boot.\n> > ---\n> >  arch/arm64/include/asm/cpu.h        |   4 ++\n> >  arch/arm64/include/asm/cpufeature.h |  29 ++++++++++\n> >  arch/arm64/include/asm/fpsimd.h     |  10 ++++\n> >  arch/arm64/kernel/cpufeature.c      |  50 +++++++++++++++++\n> >  arch/arm64/kernel/cpuinfo.c         |   6 ++\n> >  arch/arm64/kernel/fpsimd.c          | 106 +++++++++++++++++++++++++++++++++++-\n> >  6 files changed, 202 insertions(+), 3 deletions(-)\n> >\n> > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h\n> > index 889226b..8839227 100644\n> > --- a/arch/arm64/include/asm/cpu.h\n> > +++ b/arch/arm64/include/asm/cpu.h\n> > @@ -41,6 +41,7 @@ struct cpuinfo_arm64 {\n> >  \tu64\t\treg_id_aa64mmfr2;\n> >  \tu64\t\treg_id_aa64pfr0;\n> >  \tu64\t\treg_id_aa64pfr1;\n> > +\tu64\t\treg_id_aa64zfr0;\n> >\n> >  \tu32\t\treg_id_dfr0;\n> >  \tu32\t\treg_id_isar0;\n> > @@ -59,6 +60,9 @@ struct cpuinfo_arm64 {\n> >  \tu32\t\treg_mvfr0;\n> >  \tu32\t\treg_mvfr1;\n> >  \tu32\t\treg_mvfr2;\n> > +\n> > +\t/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */\n> > +\tu64\t\treg_zcr;\n> >  };\n> >\n> >  DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);\n> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h\n> > index 4ea3441..d98e7ba 100644\n> > --- a/arch/arm64/include/asm/cpufeature.h\n> > +++ b/arch/arm64/include/asm/cpufeature.h\n> > @@ -10,7 +10,9 @@\n> >  #define __ASM_CPUFEATURE_H\n> >\n> >  #include <asm/cpucaps.h>\n> > +#include <asm/fpsimd.h>\n> >  #include <asm/hwcap.h>\n> > +#include <asm/sigcontext.h>\n> >  #include <asm/sysreg.h>\n> >\n> >  /*\n> > @@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)\n> >  \treturn val == ID_AA64PFR0_EL0_32BIT_64BIT;\n> >  }\n> >\n> > +static inline bool id_aa64pfr0_sve(u64 pfr0)\n> > +{\n> > +\tu32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);\n> > +\n> > +\treturn val > 0;\n> > +}\n> > +\n> >  void __init setup_cpu_features(void);\n> >\n> >  void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,\n> > @@ -267,6 +276,26 @@ static inline bool system_supports_sve(void)\n> >  \treturn false;\n> >  }\n> >\n> > +/*\n> > + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE\n> > + * vector length.\n> > + * Use only if SVE is present.  This function clobbers the SVE vector length.\n> > + */\n> \n> :nit whitespace formatting.\n\nI'll add some newlines now to make this cleaner.\n\n/*\n * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE\n * vector length.\n *\n * Use only if SVE is present.\n * This function clobbers the SVE vector length.\n */\n\nOK?\n\n> \n> > +static u64 __maybe_unused read_zcr_features(void)\n> > +{\n> > +\tu64 zcr;\n> > +\tunsigned int vq_max;\n> > +\n> > +\twrite_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);\n> \n> I'm confused, why are we writing something here? You mention clobbering\n> the SVE vector length but what was the point?\n\nHmm, this deserves a comment -- coming back to this code, I had to think\nabout it.  Are the following extra comments sufficient explanation?\n\n\t/*\n\t * Set the maximum possible VL, and write zeroes to all other\n\t * bits to see if they stick.\n\t */\n\twrite_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);\n\n\tzcr = read_sysreg_s(SYS_ZCR_EL1);\n\tzcr &= ~(u64)ZCR_ELx_LEN_MASK; /* flag up sticky 1s outside LEN field */\n\tvq_max = sve_vq_from_vl(sve_get_vl());\n\tzcr |= vq_max - 1; /* set LEN field to maximum effective value */\n\n\n[...]\n\n> Otherwise:\n> \n> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>\n\nI'll wait on your responses to the above first.\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-85077-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85077-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"Mk4fS9Pk\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y2xjH5n6Xz9tXd\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 29 Sep 2017 00:22:27 +1000 (AEST)","(qmail 16260 invoked by alias); 28 Sep 2017 14:22:21 -0000","(qmail 16250 invoked by uid 89); 28 Sep 2017 14:22:21 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:content-transfer-encoding\n\t:in-reply-to; q=dns; s=default; b=XqaBOrnVHgsTXao5byQLd8FzGC6CT2\n\tGTZtLDGMMtxMhXsQmQ6ss/yacykbMnluYPJZOUaQfAv+RWgMZDui3zwPHiJhLjQ/\n\tEI2UxfLZEYEz+dpzotuMx5VFIauH/uXr+RzK2k1v1tKi+pMqDIN+p2vuK+HHtcKx\n\tEGAWUvzUndtZY=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:content-transfer-encoding\n\t:in-reply-to; s=default; bh=ieKhRRNO0HR8WE/RLPRTOHHQhnA=; b=Mk4f\n\tS9PkxAWSkVZNga5sGyZvdfim/Uge525wo9ihKMSjYfYer2YwVm1gqGR05RUPYSWb\n\tZuqXn9X9ooqE+S6sdD+4Iea4HPzyk/U7sXiXDmfj8k6vGlLgww1PfwSAY1dVtFAz\n\t0xRHon+3MKakk1N2CJZJUcb7Bn8TfjiRKpkx9Uk=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-26.9 required=5.0 tests=BAYES_00, GIT_PATCH_0,\n\tGIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=Dave, responses,\n\tsticky, dormant","X-HELO":"foss.arm.com","Date":"Thu, 28 Sep 2017 15:22:12 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Alex =?iso-8859-1?q?Benn=E9e?= <alex.bennee@linaro.org>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org,\n\tArd Biesheuvel <ard.biesheuvel@linaro.org>,\n\tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tCatalin Marinas <catalin.marinas@arm.com>,\n\tSuzuki K Poulose <Suzuki.Poulose@arm.com>,\n\tWill Deacon <will.deacon@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>,\n\tkvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable\n\tvector lengths","Message-ID":"<20170928142212.GB3611@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-17-git-send-email-Dave.Martin@arm.com>\n\t<87fubpaa1o.fsf@linaro.org>","MIME-Version":"1.0","Content-Type":"text/plain; charset=iso-8859-1","Content-Disposition":"inline","Content-Transfer-Encoding":"8bit","In-Reply-To":"<87fubpaa1o.fsf@linaro.org>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1777186,"web_url":"http://patchwork.ozlabs.org/comment/1777186/","msgid":"<87k20id90f.fsf@linaro.org>","list_archive_url":null,"date":"2017-09-28T17:32:16","subject":"Re: [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable\n\tvector lengths","submitter":{"id":39532,"url":"http://patchwork.ozlabs.org/api/people/39532/","name":"Alex Bennée","email":"alex.bennee@linaro.org"},"content":"Dave Martin <Dave.Martin@arm.com> writes:\n\n> On Thu, Sep 14, 2017 at 10:45:07AM +0100, Alex Bennée wrote:\n>>\n>> Dave Martin <Dave.Martin@arm.com> writes:\n>>\n>> > This patch uses the cpufeatures framework to determine common SVE\n>> > capabilities and vector lengths, and configures the runtime SVE\n>> > support code appropriately.\n>> >\n>> > ZCR_ELx is not really a feature register, but it is convenient to\n>> > use it as a template for recording the maximum vector length\n>> > supported by a CPU, using the LEN field.  This field is similar to\n>> > a feature field in that it is a contiguous bitfield for which we\n>> > want to determine the minimum system-wide value.  This patch adds\n>> > ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate\n>> > custom code to populate it.  Finding the minimum supported value of\n>> > the LEN field is left to the cpufeatures framework in the usual\n>> > way.\n>> >\n>> > The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,\n>> > so for now we just require it to be zero.\n>> >\n>> > Note that much of this code is dormant and SVE still won't be used\n>> > yet, since system_supports_sve() remains hardwired to false.\n>> >\n>> > Signed-off-by: Dave Martin <Dave.Martin@arm.com>\n>> > Cc: Alex Bennée <alex.bennee@linaro.org>\n>> > Cc: Suzuki K Poulose <Suzuki.Poulose@arm.com>\n>> >\n>> > ---\n>> >\n>> > Changes since v1\n>> > ----------------\n>> >\n>> > Requested by Alex Bennée:\n>> >\n>> > * Thin out BUG_ON()s:\n>> > Redundant BUG_ON()s and ones that just check invariants are removed.\n>> > Important sanity-checks are migrated to WARN_ON()s, with some\n>> > minimal best-effort patch-up code.\n>> >\n>> > Other changes related to Alex Bennée's comments:\n>> >\n>> > * Migrate away from magic numbers for converting VL to VQ.\n>> >\n>> > Requested by Suzuki Poulose:\n>> >\n>> > * Make sve_vq_map __ro_after_init.\n>> >\n>> > Other changes related to Suzuki Poulose's comments:\n>> >\n>> > * Rely on cpufeatures for not attempting to update the vq map after boot.\n>> > ---\n>> >  arch/arm64/include/asm/cpu.h        |   4 ++\n>> >  arch/arm64/include/asm/cpufeature.h |  29 ++++++++++\n>> >  arch/arm64/include/asm/fpsimd.h     |  10 ++++\n>> >  arch/arm64/kernel/cpufeature.c      |  50 +++++++++++++++++\n>> >  arch/arm64/kernel/cpuinfo.c         |   6 ++\n>> >  arch/arm64/kernel/fpsimd.c          | 106 +++++++++++++++++++++++++++++++++++-\n>> >  6 files changed, 202 insertions(+), 3 deletions(-)\n>> >\n>> > diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h\n>> > index 889226b..8839227 100644\n>> > --- a/arch/arm64/include/asm/cpu.h\n>> > +++ b/arch/arm64/include/asm/cpu.h\n>> > @@ -41,6 +41,7 @@ struct cpuinfo_arm64 {\n>> >  \tu64\t\treg_id_aa64mmfr2;\n>> >  \tu64\t\treg_id_aa64pfr0;\n>> >  \tu64\t\treg_id_aa64pfr1;\n>> > +\tu64\t\treg_id_aa64zfr0;\n>> >\n>> >  \tu32\t\treg_id_dfr0;\n>> >  \tu32\t\treg_id_isar0;\n>> > @@ -59,6 +60,9 @@ struct cpuinfo_arm64 {\n>> >  \tu32\t\treg_mvfr0;\n>> >  \tu32\t\treg_mvfr1;\n>> >  \tu32\t\treg_mvfr2;\n>> > +\n>> > +\t/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */\n>> > +\tu64\t\treg_zcr;\n>> >  };\n>> >\n>> >  DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);\n>> > diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h\n>> > index 4ea3441..d98e7ba 100644\n>> > --- a/arch/arm64/include/asm/cpufeature.h\n>> > +++ b/arch/arm64/include/asm/cpufeature.h\n>> > @@ -10,7 +10,9 @@\n>> >  #define __ASM_CPUFEATURE_H\n>> >\n>> >  #include <asm/cpucaps.h>\n>> > +#include <asm/fpsimd.h>\n>> >  #include <asm/hwcap.h>\n>> > +#include <asm/sigcontext.h>\n>> >  #include <asm/sysreg.h>\n>> >\n>> >  /*\n>> > @@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)\n>> >  \treturn val == ID_AA64PFR0_EL0_32BIT_64BIT;\n>> >  }\n>> >\n>> > +static inline bool id_aa64pfr0_sve(u64 pfr0)\n>> > +{\n>> > +\tu32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);\n>> > +\n>> > +\treturn val > 0;\n>> > +}\n>> > +\n>> >  void __init setup_cpu_features(void);\n>> >\n>> >  void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,\n>> > @@ -267,6 +276,26 @@ static inline bool system_supports_sve(void)\n>> >  \treturn false;\n>> >  }\n>> >\n>> > +/*\n>> > + * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE\n>> > + * vector length.\n>> > + * Use only if SVE is present.  This function clobbers the SVE vector length.\n>> > + */\n>>\n>> :nit whitespace formatting.\n>\n> I'll add some newlines now to make this cleaner.\n>\n> /*\n>  * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE\n>  * vector length.\n>  *\n>  * Use only if SVE is present.\n>  * This function clobbers the SVE vector length.\n>  */\n>\n> OK?\n\nYep.\n>\n>>\n>> > +static u64 __maybe_unused read_zcr_features(void)\n>> > +{\n>> > +\tu64 zcr;\n>> > +\tunsigned int vq_max;\n>> > +\n>> > +\twrite_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);\n>>\n>> I'm confused, why are we writing something here? You mention clobbering\n>> the SVE vector length but what was the point?\n>\n> Hmm, this deserves a comment -- coming back to this code, I had to think\n> about it.  Are the following extra comments sufficient explanation?\n>\n> \t/*\n> \t * Set the maximum possible VL, and write zeroes to all other\n> \t * bits to see if they stick.\n> \t */\n> \twrite_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);\n>\n> \tzcr = read_sysreg_s(SYS_ZCR_EL1);\n> \tzcr &= ~(u64)ZCR_ELx_LEN_MASK; /* flag up sticky 1s outside LEN field */\n> \tvq_max = sve_vq_from_vl(sve_get_vl());\n> \tzcr |= vq_max - 1; /* set LEN field to maximum effective value */\n>\n>\n> [...]\n\nOK that makes more sense. Thanks.\n>\n>> Otherwise:\n>>\n>> Reviewed-by: Alex Bennée <alex.bennee@linaro.org>\n>\n> I'll wait on your responses to the above first.\n\nStill good ;-)\n\n>\n> Cheers\n> ---Dave\n\n\n--\nAlex Bennée","headers":{"Return-Path":"<libc-alpha-return-85086-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85086-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"hBfAcyln\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y31wg6Jqhz9t5Y\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 29 Sep 2017 03:32:35 +1000 (AEST)","(qmail 124605 invoked by alias); 28 Sep 2017 17:32:29 -0000","(qmail 124593 invoked by uid 89); 28 Sep 2017 17:32:28 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:references:from:to:cc:subject:in-reply-to:date\n\t:message-id:mime-version:content-type:content-transfer-encoding;\n\tq=dns; s=default; b=UYBCYffqU7B9PKlHcEC/qKvwtRnHkBTcQrX+xg0wDF6\n\tCWrMdDSpDkva4tEl/Dp4Kre77aD6U9xPVJmuJxc0f7sl1nkGpoU0faNcpPuriVk5\n\ttorfCI9ePwJ1jTDcHcljQ7KCMdDDtJJzzMFDLo8Wz+8gM/pbtN3r/+6E59ijTSk4\n\t=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:references:from:to:cc:subject:in-reply-to:date\n\t:message-id:mime-version:content-type:content-transfer-encoding;\n\ts=default; bh=Wq9iW/GslE5Wr5ezpfOXB/cN/Uk=; b=hBfAcylnZVclDUnnu\n\tF795126iyTrNQeZDiHBMOv+T5OU0LYW6fUkE8lp8/4Cv1hdkySQ7W+gcMaJl3Y3l\n\tVUnaxVf6z3UUfAA0GIveuJVvT0uura1WQrL00oNrdWYYc9taYfxarqLKsbcv09kI\n\tHX9qftYsNqEkBFLJ+o7gmII02k=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-26.1 required=5.0 tests=AWL, BAYES_00,\n\tGIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,\n\tRCVD_IN_DNSWL_NONE, RCVD_IN_SORBS_SPAM,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"mail-wr0-f169.google.com","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n\td=1e100.net; s=20161025;\n\th=x-gm-message-state:references:user-agent:from:to:cc:subject\n\t:in-reply-to:date:message-id:mime-version:content-transfer-encoding; \n\tbh=ImZg7oDhmgG9/NZucP+g+uwl56SkIKpBJtgyVZ8DtIo=;\n\tb=FDw9Vnup0kzwVY+fCLYFxVR8huOWQHOQkVQd4U1hbAPITCSDg8SVQbn4s2YT9KBymS\n\tu3r2CYAuW19G4KLApL+dYAaUC1iG2Gk/DUB2S0ocKhVg/6UAMV8YlFMr1/Qm7ZUVgmBq\n\ttqXb0M+cIZcK8ErhR5dwR6dGgN2mxVZ8tPWJ0TLt8F76APXuEIWUGQ+lxZYySlbqwyDa\n\tdw//qOCCS1BrKYGq5aUF+h6sHSGbWpPPZ1Ro0EV5ZPI+gErn4enTek4PN4OAskyMeFPt\n\trlRF1mAWKqeva4BAJz6X7AqLeKCRLhftZL6fm+NVUbA2BSxzyIQ/xVZ6VzcslV+3/22o\n\t1Gbg==","X-Gm-Message-State":"AHPjjUgQ1aszTundt2Rs1hfiAI0N7T2U2vgbs/kLHkvzwfDAw5s8saLV\n\tAobP9iSSt7kr72WO5/gt951UrQ==","X-Google-Smtp-Source":"AOwi7QD8aJaSeYoTC1G42nwrGQQS3yNcLL856d+nFpZnMbAzHxEWDZZ4e3+/iN1po9oObi/WRUB8tQ==","X-Received":"by 10.223.182.11 with SMTP id f11mr5298469wre.112.1506619944417; \n\tThu, 28 Sep 2017 10:32:24 -0700 (PDT)","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-17-git-send-email-Dave.Martin@arm.com>\n\t<87fubpaa1o.fsf@linaro.org>\n\t<20170928142212.GB3611@e103592.cambridge.arm.com>","User-agent":"mu4e 0.9.19; emacs 26.0.60","From":"Alex =?utf-8?q?Benn=C3=A9e?= <alex.bennee@linaro.org>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org,\n\tArd Biesheuvel <ard.biesheuvel@linaro.org>,\n\tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tCatalin Marinas <catalin.marinas@arm.com>,\n\tSuzuki K Poulose <Suzuki.Poulose@arm.com>,\n\tWill Deacon <will.deacon@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>,\n\tkvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 16/28] arm64/sve: Probe SVE capabilities and usable\n\tvector lengths","In-reply-to":"<20170928142212.GB3611@e103592.cambridge.arm.com>","Date":"Thu, 28 Sep 2017 18:32:16 +0100","Message-ID":"<87k20id90f.fsf@linaro.org>","MIME-Version":"1.0","Content-Type":"text/plain; charset=utf-8","Content-Transfer-Encoding":"8bit"}}]