From: Dave Martin <Dave.Martin@arm.com>
To: linux-arm-kernel@lists.infradead.org
Cc: linux-arch@vger.kernel.org, Okamoto Takayuki <tokamoto@jp.fujitsu.com>,
	libc-alpha@sourceware.org, Ard Biesheuvel <ard.biesheuvel@linaro.org>,
	Szabolcs Nagy <szabolcs.nagy@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Alex Bennée <alex.bennee@linaro.org>,
	kvmarm@lists.cs.columbia.edu
Subject: [PATCH v4 16/28] arm64/sve: Probe SVE capabilities and usable vector lengths
Date: Fri, 27 Oct 2017 11:50:58 +0100
Message-Id: <1509101470-7881-17-git-send-email-Dave.Martin@arm.com>
In-Reply-To: <1509101470-7881-1-git-send-email-Dave.Martin@arm.com>
References: <1509101470-7881-1-git-send-email-Dave.Martin@arm.com>

This patch uses the cpufeatures framework to determine common SVE
capabilities and vector lengths, and configures the runtime SVE
support code appropriately.

ZCR_ELx is not really a feature register, but it is convenient to
use it as a template for recording the maximum vector length
supported by a CPU, using the LEN field.  This field is similar to
a feature field in that it is a contiguous bitfield for which we
want to determine the minimum system-wide value.  This patch adds
ZCR as a pseudo-register in cpuinfo/cpufeatures, with appropriate
custom code to populate it.  Finding the minimum supported value of
the LEN field is left to the cpufeatures framework in the usual
way.

The meaning of ID_AA64ZFR0_EL1 is not architecturally defined yet,
so for now we just require it to be zero.

Note that much of this code is dormant and SVE still won't be used
yet, since system_supports_sve() remains hardwired to false.

Signed-off-by: Dave Martin <Dave.Martin@arm.com>
Reviewed-by: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Alex Bennée <alex.bennee@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>

---

**Dropped** Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
**Dropped at v3** Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

The change requested by Suzuki (see below) is not quite trivial,
though he was happy for me to apply his Reviewed-by once the change
was made.

Changes since v3
----------------

Requested by Catalin Marinas:

 * Replace __maybe_unused functions with static inlines.

Requested by Suzuki Poulose:

 * Don't bother to probe for supported vector lengths if we already
   decided SVE is not supported.
---
 arch/arm64/include/asm/cpu.h        |   4 ++
 arch/arm64/include/asm/cpufeature.h |  36 ++++++++++++
 arch/arm64/include/asm/fpsimd.h     |  14 +++++
 arch/arm64/kernel/cpufeature.c      |  52 ++++++++++++++++
 arch/arm64/kernel/cpuinfo.c         |   6 ++
 arch/arm64/kernel/fpsimd.c          | 114 +++++++++++++++++++++++++++++++++++-
 6 files changed, 223 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index 889226b..8839227 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -41,6 +41,7 @@ struct cpuinfo_arm64 {
 	u64		reg_id_aa64mmfr2;
 	u64		reg_id_aa64pfr0;
 	u64		reg_id_aa64pfr1;
+	u64		reg_id_aa64zfr0;
 
 	u32		reg_id_dfr0;
 	u32		reg_id_isar0;
@@ -59,6 +60,9 @@ struct cpuinfo_arm64 {
 	u32		reg_mvfr0;
 	u32		reg_mvfr1;
 	u32		reg_mvfr2;
+
+	/* pseudo-ZCR for recording maximum ZCR_EL1 LEN value: */
+	u64		reg_zcr;
 };
 
 DECLARE_PER_CPU(struct cpuinfo_arm64, cpu_data);
diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
index 4ea3441..9b27e8c 100644
--- a/arch/arm64/include/asm/cpufeature.h
+++ b/arch/arm64/include/asm/cpufeature.h
@@ -10,7 +10,9 @@
 #define __ASM_CPUFEATURE_H
 
 #include <asm/cpucaps.h>
+#include <asm/fpsimd.h>
 #include <asm/hwcap.h>
+#include <asm/sigcontext.h>
 #include <asm/sysreg.h>
 
 /*
@@ -223,6 +225,13 @@ static inline bool id_aa64pfr0_32bit_el0(u64 pfr0)
 	return val == ID_AA64PFR0_EL0_32BIT_64BIT;
 }
 
+static inline bool id_aa64pfr0_sve(u64 pfr0)
+{
+	u32 val = cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_SVE_SHIFT);
+
+	return val > 0;
+}
+
 void __init setup_cpu_features(void);
 
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
@@ -267,6 +276,33 @@ static inline bool system_supports_sve(void)
 	return false;
 }
 
+/*
+ * Read the pseudo-ZCR used by cpufeatures to identify the supported SVE
+ * vector length.
+ *
+ * Use only if SVE is present.
+ * This function clobbers the SVE vector length.
+ */
+static inline u64 read_zcr_features(void)
+{
+	u64 zcr;
+	unsigned int vq_max;
+
+	/*
+	 * Set the maximum possible VL, and write zeroes to all other
+	 * bits to see if they stick.
+	 */
+	sve_kernel_enable(NULL);
+	write_sysreg_s(ZCR_ELx_LEN_MASK, SYS_ZCR_EL1);
+
+	zcr = read_sysreg_s(SYS_ZCR_EL1);
+	zcr &= ~(u64)ZCR_ELx_LEN_MASK; /* find sticky 1s outside LEN field */
+	vq_max = sve_vq_from_vl(sve_get_vl());
+	zcr |= vq_max - 1; /* set LEN field to maximum effective value */
+
+	return zcr;
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 86f550c..d8e0dc9 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -78,6 +78,7 @@ extern void sve_save_state(void *state, u32 *pfpsr);
 extern void sve_load_state(void const *state, u32 const *pfpsr,
			   unsigned long vq_minus_1);
 extern unsigned int sve_get_vl(void);
+extern int sve_kernel_enable(void *);
 
 extern int __ro_after_init sve_max_vl;
 
@@ -90,10 +91,23 @@ extern void fpsimd_release_task(struct task_struct *task);
 extern int sve_set_vector_length(struct task_struct *task,
				 unsigned long vl, unsigned long flags);
 
+/*
+ * Probing and setup functions.
+ * Calls to these functions must be serialised with one another.
+ */
+extern void __init sve_init_vq_map(void);
+extern void sve_update_vq_map(void);
+extern int sve_verify_vq_map(void);
+extern void __init sve_setup(void);
+
 #else /* ! CONFIG_ARM64_SVE */
 
 static inline void sve_alloc(struct task_struct *task) { }
 static inline void fpsimd_release_task(struct task_struct *task) { }
+static inline void sve_init_vq_map(void) { }
+static inline void sve_update_vq_map(void) { }
+static inline int sve_verify_vq_map(void) { return 0; }
+static inline void sve_setup(void) { }
 
 #endif /* ! CONFIG_ARM64_SVE */
 
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e226799..2154373 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -27,6 +27,7 @@
 #include <asm/cpu.h>
 #include <asm/cpufeature.h>
 #include <asm/cpu_ops.h>
+#include <asm/fpsimd.h>
 #include <asm/mmu_context.h>
 #include <asm/processor.h>
 #include <asm/sysreg.h>
@@ -287,6 +288,12 @@ static const struct arm64_ftr_bits ftr_id_dfr0[] = {
 	ARM64_FTR_END,
 };
 
+static const struct arm64_ftr_bits ftr_zcr[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE,
+		ZCR_ELx_LEN_SHIFT, ZCR_ELx_LEN_SIZE, 0),	/* LEN */
+	ARM64_FTR_END,
+};
+
 /*
  * Common ftr bits for a 32bit register with all hidden, strict
  * attributes, with 4bit feature fields and a default safe value of
@@ -353,6 +360,7 @@ static const struct __ftr_reg_entry {
 	/* Op1 = 0, CRn = 0, CRm = 4 */
 	ARM64_FTR_REG(SYS_ID_AA64PFR0_EL1, ftr_id_aa64pfr0),
 	ARM64_FTR_REG(SYS_ID_AA64PFR1_EL1, ftr_raz),
+	ARM64_FTR_REG(SYS_ID_AA64ZFR0_EL1, ftr_raz),
 
 	/* Op1 = 0, CRn = 0, CRm = 5 */
 	ARM64_FTR_REG(SYS_ID_AA64DFR0_EL1, ftr_id_aa64dfr0),
@@ -367,6 +375,9 @@ static const struct __ftr_reg_entry {
 	ARM64_FTR_REG(SYS_ID_AA64MMFR1_EL1, ftr_id_aa64mmfr1),
 	ARM64_FTR_REG(SYS_ID_AA64MMFR2_EL1, ftr_id_aa64mmfr2),
 
+	/* Op1 = 0, CRn = 1, CRm = 2 */
+	ARM64_FTR_REG(SYS_ZCR_EL1, ftr_zcr),
+
 	/* Op1 = 3, CRn = 0, CRm = 0 */
 	{ SYS_CTR_EL0, &arm64_ftr_reg_ctrel0 },
 	ARM64_FTR_REG(SYS_DCZID_EL0, ftr_dczid),
@@ -504,6 +515,7 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 	init_cpu_ftr_reg(SYS_ID_AA64MMFR2_EL1, info->reg_id_aa64mmfr2);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR0_EL1, info->reg_id_aa64pfr0);
 	init_cpu_ftr_reg(SYS_ID_AA64PFR1_EL1, info->reg_id_aa64pfr1);
+	init_cpu_ftr_reg(SYS_ID_AA64ZFR0_EL1, info->reg_id_aa64zfr0);
 
 	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
 		init_cpu_ftr_reg(SYS_ID_DFR0_EL1, info->reg_id_dfr0);
@@ -524,6 +536,10 @@ void __init init_cpu_features(struct cpuinfo_arm64 *info)
 		init_cpu_ftr_reg(SYS_MVFR2_EL1, info->reg_mvfr2);
 	}
 
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
+		init_cpu_ftr_reg(SYS_ZCR_EL1, info->reg_zcr);
+		sve_init_vq_map();
+	}
 }
 
 static void update_cpu_ftr_reg(struct arm64_ftr_reg *reg, u64 new)
@@ -627,6 +643,9 @@ void update_cpu_features(int cpu,
 	taint |= check_update_ftr_reg(SYS_ID_AA64PFR1_EL1, cpu,
				      info->reg_id_aa64pfr1, boot->reg_id_aa64pfr1);
 
+	taint |= check_update_ftr_reg(SYS_ID_AA64ZFR0_EL1, cpu,
+				      info->reg_id_aa64zfr0, boot->reg_id_aa64zfr0);
+
 	/*
	 * If we have AArch32, we care about 32-bit features for compat.
	 * If the system doesn't support AArch32, don't update them.
@@ -674,6 +693,16 @@ void update_cpu_features(int cpu,
					info->reg_mvfr2, boot->reg_mvfr2);
 	}
 
+	if (id_aa64pfr0_sve(info->reg_id_aa64pfr0)) {
+		taint |= check_update_ftr_reg(SYS_ZCR_EL1, cpu,
+					info->reg_zcr, boot->reg_zcr);
+
+		/* Probe vector lengths, unless we already gave up on SVE */
+		if (id_aa64pfr0_sve(read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1)) &&
+		    !sys_caps_initialised)
+			sve_update_vq_map();
+	}
+
 	/*
	 * Mismatched CPU features are a recipe for disaster. Don't even
	 * pretend to support them.
@@ -1106,6 +1135,23 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
 	}
 }
 
+static void verify_sve_features(void)
+{
+	u64 safe_zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
+	u64 zcr = read_zcr_features();
+
+	unsigned int safe_len = safe_zcr & ZCR_ELx_LEN_MASK;
+	unsigned int len = zcr & ZCR_ELx_LEN_MASK;
+
+	if (len < safe_len || sve_verify_vq_map()) {
+		pr_crit("CPU%d: SVE: required vector length(s) missing\n",
+			smp_processor_id());
+		cpu_die_early();
+	}
+
+	/* Add checks on other ZCR bits here if necessary */
+}
+
 /*
  * Run through the enabled system capabilities and enable() it on this CPU.
  * The capabilities were decided based on the available CPUs at the boot time.
@@ -1119,8 +1165,12 @@ static void verify_local_cpu_capabilities(void)
 	verify_local_cpu_errata_workarounds();
 	verify_local_cpu_features(arm64_features);
 	verify_local_elf_hwcaps(arm64_elf_hwcaps);
+
 	if (system_supports_32bit_el0())
 		verify_local_elf_hwcaps(compat_elf_hwcaps);
+
+	if (system_supports_sve())
+		verify_sve_features();
 }
 
 void check_local_cpu_capabilities(void)
@@ -1198,6 +1248,8 @@ void __init setup_cpu_features(void)
 	if (system_supports_32bit_el0())
 		setup_elf_hwcaps(compat_elf_hwcaps);
 
+	sve_setup();
+
 	/* Advertise that we have computed the system capabilities */
 	set_sys_caps_initialised();
 
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 1ff1c5a..58da504 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -19,6 +19,7 @@
 #include <asm/cpu.h>
 #include <asm/cputype.h>
 #include <asm/cpufeature.h>
+#include <asm/fpsimd.h>
 
 #include <linux/bitops.h>
 #include <linux/bug.h>
@@ -331,6 +332,7 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 	info->reg_id_aa64mmfr2 = read_cpuid(ID_AA64MMFR2_EL1);
 	info->reg_id_aa64pfr0 = read_cpuid(ID_AA64PFR0_EL1);
 	info->reg_id_aa64pfr1 = read_cpuid(ID_AA64PFR1_EL1);
+	info->reg_id_aa64zfr0 = read_cpuid(ID_AA64ZFR0_EL1);
 
 	/* Update the 32bit ID registers only if AArch32 is implemented */
 	if (id_aa64pfr0_32bit_el0(info->reg_id_aa64pfr0)) {
@@ -353,6 +355,10 @@ static void __cpuinfo_store_cpu(struct cpuinfo_arm64 *info)
 		info->reg_mvfr2 = read_cpuid(MVFR2_EL1);
 	}
 
+	if (IS_ENABLED(CONFIG_ARM64_SVE) &&
+	    id_aa64pfr0_sve(info->reg_id_aa64pfr0))
+		info->reg_zcr = read_zcr_features();
+
 	cpuinfo_detect_icache_policy(info);
 }
 
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index 476c637..703e9d7 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -113,19 +113,19 @@
 static DEFINE_PER_CPU(struct fpsimd_state *, fpsimd_last_state);
 
 /* Default VL for tasks that don't set it explicitly: */
-static int sve_default_vl = SVE_VL_MIN;
+static int sve_default_vl = -1;
 
 #ifdef CONFIG_ARM64_SVE
 
 /* Maximum supported vector length across all CPUs (initially poisoned) */
 int __ro_after_init sve_max_vl = -1;
 /* Set of available vector lengths, as vq_to_bit(vq): */
-static DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+static __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
 
 #else /* ! CONFIG_ARM64_SVE */
 
 /* Dummy declaration for code that will be optimised out: */
-extern DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
+extern __ro_after_init DECLARE_BITMAP(sve_vq_map, SVE_VQ_MAX);
 
 #endif /* ! CONFIG_ARM64_SVE */
 
@@ -495,6 +495,111 @@ int sve_set_vector_length(struct task_struct *task,
 }
 
 /*
+ * Bitmap for temporary storage of the per-CPU set of supported vector lengths
+ * during secondary boot.
+ */
+static DECLARE_BITMAP(sve_secondary_vq_map, SVE_VQ_MAX);
+
+static void sve_probe_vqs(DECLARE_BITMAP(map, SVE_VQ_MAX))
+{
+	unsigned int vq, vl;
+	unsigned long zcr;
+
+	bitmap_zero(map, SVE_VQ_MAX);
+
+	zcr = ZCR_ELx_LEN_MASK;
+	zcr = read_sysreg_s(SYS_ZCR_EL1) & ~zcr;
+
+	for (vq = SVE_VQ_MAX; vq >= SVE_VQ_MIN; --vq) {
+		write_sysreg_s(zcr | (vq - 1), SYS_ZCR_EL1); /* self-syncing */
+		vl = sve_get_vl();
+		vq = sve_vq_from_vl(vl); /* skip intervening lengths */
+		set_bit(vq_to_bit(vq), map);
+	}
+}
+
+void __init sve_init_vq_map(void)
+{
+	sve_probe_vqs(sve_vq_map);
+}
+
+/*
+ * If we haven't committed to the set of supported VQs yet, filter out
+ * those not supported by the current CPU.
+ */
+void sve_update_vq_map(void)
+{
+	sve_probe_vqs(sve_secondary_vq_map);
+	bitmap_and(sve_vq_map, sve_vq_map, sve_secondary_vq_map, SVE_VQ_MAX);
+}
+
+/* Check whether the current CPU supports all VQs in the committed set */
+int sve_verify_vq_map(void)
+{
+	int ret = 0;
+
+	sve_probe_vqs(sve_secondary_vq_map);
+	bitmap_andnot(sve_secondary_vq_map, sve_vq_map, sve_secondary_vq_map,
+		      SVE_VQ_MAX);
+	if (!bitmap_empty(sve_secondary_vq_map, SVE_VQ_MAX)) {
+		pr_warn("SVE: cpu%d: Required vector length(s) missing\n",
+			smp_processor_id());
+		ret = -EINVAL;
+	}
+
+	return ret;
+}
+
+/*
+ * Enable SVE for EL1.
+ * Intended for use by the cpufeatures code during CPU boot.
+ */
+int sve_kernel_enable(void *__always_unused p)
+{
+	write_sysreg(read_sysreg(CPACR_EL1) | CPACR_EL1_ZEN_EL1EN, CPACR_EL1);
+	isb();
+
+	return 0;
+}
+
+void __init sve_setup(void)
+{
+	u64 zcr;
+
+	if (!system_supports_sve())
+		return;
+
+	/*
+	 * The SVE architecture mandates support for 128-bit vectors,
+	 * so sve_vq_map must have at least SVE_VQ_MIN set.
+	 * If something went wrong, at least try to patch it up:
+	 */
+	if (WARN_ON(!test_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map)))
+		set_bit(vq_to_bit(SVE_VQ_MIN), sve_vq_map);
+
+	zcr = read_sanitised_ftr_reg(SYS_ZCR_EL1);
+	sve_max_vl = sve_vl_from_vq((zcr & ZCR_ELx_LEN_MASK) + 1);
+
+	/*
+	 * Sanity-check that the max VL we determined through CPU features
+	 * corresponds properly to sve_vq_map.  If not, do our best:
+	 */
+	if (WARN_ON(sve_max_vl != find_supported_vector_length(sve_max_vl)))
+		sve_max_vl = find_supported_vector_length(sve_max_vl);
+
+	/*
+	 * For the default VL, pick the maximum supported value <= 64.
+	 * VL == 64 is guaranteed not to grow the signal frame.
+	 */
+	sve_default_vl = find_supported_vector_length(64);
+
+	pr_info("SVE: maximum available vector length %u bytes per vector\n",
+		sve_max_vl);
+	pr_info("SVE: default vector length %u bytes per vector\n",
+		sve_default_vl);
+}
+
+/*
  * Called from the put_task_struct() path, which cannot get here
  * unless dead_task is really dead and not schedulable.
  */
@@ -629,6 +734,9 @@ void fpsimd_flush_thread(void)
		 * This is where we ensure that all user tasks have a valid
		 * vector length configured: no kernel task can become a user
		 * task without an exec and hence a call to this function.
+		 * By the time the first call to this function is made, all
+		 * early hardware probing is complete, so sve_default_vl
+		 * should be valid.
		 * If a bug causes this to go wrong, we make some noise and
		 * try to fudge thread.sve_vl to a safe value here.
		 */
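The vector-length negotiation implemented by sve_init_vq_map() / sve_update_vq_map() / sve_verify_vq_map() above is just set intersection over vector quanta (VQ = vector length in 128-bit units). A minimal Python model of that logic, purely illustrative (the function names mirror the kernel's but nothing here is kernel code):

```python
# Illustrative model of the sve_vq_map negotiation in the patch above:
# each CPU reports the set of VQs it supports; early-booted CPUs are
# intersected into the committed system-wide set, and a late (hotplugged)
# CPU is rejected if it lacks any VQ in that committed set.

SVE_VQ_MIN, SVE_VQ_MAX = 1, 512  # as in the arm64 uapi headers

def init_vq_map(boot_cpu_vqs):
    """Model of sve_init_vq_map(): start from the boot CPU's set."""
    return set(boot_cpu_vqs)

def update_vq_map(vq_map, secondary_cpu_vqs):
    """Model of sve_update_vq_map(): intersect with an early secondary CPU
    (the kernel's bitmap_and)."""
    return vq_map & set(secondary_cpu_vqs)

def verify_vq_map(vq_map, late_cpu_vqs):
    """Model of sve_verify_vq_map(): 0 if the late CPU covers the committed
    set, -EINVAL (-22) otherwise (the kernel's bitmap_andnot + empty check)."""
    return 0 if vq_map <= set(late_cpu_vqs) else -22

# Example: boot CPU supports VQ {1, 2, 4} (VL 16/32/64 bytes), an early
# secondary supports only {1, 2}; the system set shrinks to {1, 2}.
vq_map = init_vq_map({1, 2, 4})
vq_map = update_vq_map(vq_map, {1, 2})
print(sorted(vq_map))                    # [1, 2]
print(verify_vq_map(vq_map, {1, 2, 4}))  # 0: late CPU covers the set
print(verify_vq_map(vq_map, {1}))        # -22: VQ 2 missing, CPU rejected
```

This is why the order of operations in the patch matters: once sys_caps_initialised is set, the committed set can no longer shrink, so a late CPU that cannot honour it has to be killed via cpu_die_early() rather than accommodated.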