[{"id":1767930,"web_url":"http://patchwork.ozlabs.org/comment/1767930/","msgid":"<20170913143325.hi4cfcajbts3bbao@localhost>","list_archive_url":null,"date":"2017-09-13T14:33:25","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> +/*\n> + * Handle SVE state across fork():\n> + *\n> + * dst and src must not end up with aliases of the same sve_state.\n> + * Because a task cannot fork except in a syscall, we can discard SVE\n> + * state for dst here, so long as we take care to retain the FPSIMD\n> + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> + * will be deferred until dst tries to use SVE.\n> + */\n> +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> +{\n> +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> +\t\tsve_to_fpsimd(dst);\n> +\t}\n> +\n> +\tdst->thread.sve_state = NULL;\n> +}\n\nI first thought the thread flags are not visible in dst yet since\ndup_task_struct() calls arch_dup_task_struct() before\nsetup_thread_stack(). However, at the end of the last year we enabled\nCONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\non this.\n\nAnyway, IIUC we don't need sve_to_fpsimd() here. The\narch_dup_task_struct() already called fpsimd_preserve_current_state()\nfor src, so the FPSIMD state (which we care about) is transferred during\nthe *dst = *src assignment. So you'd only need the last statement,\npossibly with a different function name like fpsimd_erase_sve (and maybe\nmake the function static inline in the header).\n\n[...]\n>  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)\n> @@ -246,6 +247,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)\n>  \tif (current->mm)\n>  \t\tfpsimd_preserve_current_state();\n>  \t*dst = *src;\n> +\n> +\tfpsimd_dup_sve(dst, src);\n> +\n>  \treturn 0;\n>  }","headers":{"Return-Path":"<libc-alpha-return-84561-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84561-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"GqDHti8t\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xskgC12KQz9sNr\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 14 Sep 2017 00:33:42 +1000 (AEST)","(qmail 31314 invoked by alias); 13 Sep 2017 14:33:36 -0000","(qmail 31300 invoked by uid 89); 13 Sep 2017 14:33:36 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=bTVo\n\t+a8QjZK4X0JVSIT3tMBmdKxz4GfR0hy8wGRhVNOWcoMgsiCuqQlbCfJy6MFjU2+x\n\tki5tjOfTi/UNTKZkkJt9Xlvc8+Oq6S36GAzIW6uvTn38E1w4xDIZcG94dj1TMRKd\n\t0zTHu/V6HnYgk961+c6wjY+2GOOApyBTRga9ZUU=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=6THYz1jqOU\n\t8K5AnqIe9FBCV24ts=; b=GqDHti8tjeViGK92E0pvv4ugLpvqotCswSwXkZPr73\n\txveLnJpO27QX/6eC7t1uSdUc6077C+n8PzP50ySRuppoRARIbax2OXqmDRKZBdPk\n\tBR6dkDeuh4NcJNZA5b1LzwaMDqkHRCN0GdPwuFsleLtPOMohR1NMwNd6qiKyhlAk\n\tk=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"foss.arm.com","Date":"Wed, 13 Sep 2017 07:33:25 -0700","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arm-kernel@lists.infradead.org, linux-arch@vger.kernel.org,\n\tlibc-alpha@sourceware.org, \tArd Biesheuvel <ard.biesheuvel@linaro.org>,\n\tSzabolcs Nagy <szabolcs.nagy@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Richard Sandiford\n\t<richard.sandiford@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, \tkvmarm@lists.cs.columbia.edu","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170913143325.hi4cfcajbts3bbao@localhost>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1768047,"web_url":"http://patchwork.ozlabs.org/comment/1768047/","msgid":"<20170913172605.hjrwjf34kyy7coh4@localhost>","list_archive_url":null,"date":"2017-09-13T17:26:05","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> +el0_sve_acc:\n> +\t/*\n> +\t * Scalable Vector Extension access\n> +\t */\n> +\tenable_dbg\n> +\tct_user_exit\n> +\tmov\tx0, x25\n> +\tmov\tx1, sp\n> +\tbl\tdo_sve_acc\n> +\tb\tret_to_user\n\nI think do_sve_acc() runs with interrupts disabled. We may have some\nhigh latency for large SVE states.\n\n> +/*\n> + * Trapped SVE access\n> + */\n> +void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> +{\n> +\t/* Even if we chose not to use SVE, the hardware could still trap: */\n> +\tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> +\t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> +\t\treturn;\n> +\t}\n> +\n> +\ttask_fpsimd_save();\n> +\n> +\tsve_alloc(current);\n> +\tfpsimd_to_sve(current);\n> +\tif (test_and_set_thread_flag(TIF_SVE))\n> +\t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> +\n> +\ttask_fpsimd_load();\n> +}\n\nWhen this function is entered, do we expect TIF_SVE to always be\ncleared? It's worth adding a comment on the expected conditions. If\nthat's the case, task_fpsimd_save() would only save the FPSIMD state\nwhich is fine. However, you subsequently transfer the FPSIMD state to\nSVE, set TIF_SVE and restore the full SVE state. If we don't care about\nthe SVE state here, can we call task_fpsimd_load() *before* setting\nTIF_SVE?\n\nI may as well have confused myself with the state bouncing between\nFPSIMD and SVE (more reasons to document the data flow better ;)).","headers":{"Return-Path":"<libc-alpha-return-84568-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84568-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"GXgKDV7K\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xspVL6rMYz9sNV\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 14 Sep 2017 03:26:18 +1000 (AEST)","(qmail 66069 invoked by alias); 13 Sep 2017 17:26:12 -0000","(qmail 65382 invoked by uid 89); 13 Sep 2017 17:26:11 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=cWqT\n\tFELJC1U569u6ZDJ+qc4CHgIoVmXnAWLSgOzAf2L+HHSNC9k1vzvzb9Zk6YLRll49\n\tMazZysYFO5qkHSG90O9soK64MH5pHfjUfkhkn9It1QnfO4qgU+8LDTE3rEOqYGtc\n\tnoCulz2VmgcadXWTNt36quzfsLBbQLStouu8nd4=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=CrAhhienmj\n\tmI/ZBZJ4rHB/8jxFg=; b=GXgKDV7KmeM7olkFZ9hPoOPPbiwRGQokv4dRHaHVq2\n\t2eac3eXfBSJRnA4E5M/EWvvjJ0C3wWn+60Cu3/QQ7Nky2UP1C4N00KPTxnvjwAwH\n\tzYrFxv/8xdh5xrKcicDuu6ZCj7qeWegILGAMI1x2VBvGhwdiVCBDSr4LPuo4+yVN\n\to=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=Hx-languages-length:1503,\n\tstates, transfer","X-HELO":"foss.arm.com","Date":"Wed, 13 Sep 2017 10:26:05 -0700","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arm-kernel@lists.infradead.org, Will Deacon <will.deacon@arm.com>, \n\tArd Biesheuvel <ard.biesheuvel@linaro.org>, Alex =?iso-8859-1?q?Benn?=\n\t=?iso-8859-1?q?=E9e?= <alex.bennee@linaro.org>,\n\tSzabolcs Nagy <szabolcs.nagy@arm.com>, Richard Sandiford\n\t<richard.sandiford@arm.com>, kvmarm@lists.cs.columbia.edu,\n\tlibc-alpha@sourceware.org, \tlinux-arch@vger.kernel.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170913172605.hjrwjf34kyy7coh4@localhost>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1768095,"web_url":"http://patchwork.ozlabs.org/comment/1768095/","msgid":"<20170913191707.GD23415@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-09-13T19:17:07","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Wed, Sep 13, 2017 at 10:26:05AM -0700, Catalin Marinas wrote:\n> On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > +el0_sve_acc:\n> > +\t/*\n> > +\t * Scalable Vector Extension access\n> > +\t */\n> > +\tenable_dbg\n> > +\tct_user_exit\n> > +\tmov\tx0, x25\n> > +\tmov\tx1, sp\n> > +\tbl\tdo_sve_acc\n> > +\tb\tret_to_user\n> \n> I think do_sve_acc() runs with interrupts disabled. We may have some\n> high latency for large SVE states.\n\nHistorically I wanted to play safe.\n\nI meant to change this to enable_dbg_and_irq now, but it looks like I\nforgot to do it.\n\nI'll double-check that do_sve_acc() does nothing that makes it unsafe to\nenable irqs, but it should be OK.  It's certainly _intended_ to be\nirq-safe.\n\n> > +/*\n> > + * Trapped SVE access\n> > + */\n> > +void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> > +{\n> > +\t/* Even if we chose not to use SVE, the hardware could still trap: */\n> > +\tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> > +\t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> > +\t\treturn;\n> > +\t}\n> > +\n> > +\ttask_fpsimd_save();\n> > +\n> > +\tsve_alloc(current);\n> > +\tfpsimd_to_sve(current);\n> > +\tif (test_and_set_thread_flag(TIF_SVE))\n> > +\t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> > +\n> > +\ttask_fpsimd_load();\n> > +}\n> \n> When this function is entered, do we expect TIF_SVE to always be\n> cleared? It's worth adding a comment on the expected conditions. If\n\nYes, and this is required for correctness, as you observe.\n\nI had a BUG_ON() here which I removed, but it makes sense to add a\ncomment to capture the precondition here, and how it is satisfied.\n\n> that's the case, task_fpsimd_save() would only save the FPSIMD state\n> which is fine. However, you subsequently transfer the FPSIMD state to\n> SVE, set TIF_SVE and restore the full SVE state. If we don't care about\n> the SVE state here, can we call task_fpsimd_load() *before* setting\n> TIF_SVE?\n\nThere should be no way to reach this code with TIF_SVE set, unless\ntask_fpsimd_load() sets the CPACR trap bit wrongly, or the hardware is\nbroken -- either of which is a bug.\n\n> I may as well have confused myself with the state bouncing between\n> FPSIMD and SVE (more reasons to document the data flow better ;)).\n\nAgreed, I think this does need more explanation...  I'm currently\ntrying to figure out the best way to describe it.\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-84571-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84571-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"RRZV7JMD\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xsryS6jyMz9ryQ\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 14 Sep 2017 05:17:20 +1000 (AEST)","(qmail 105856 invoked by alias); 13 Sep 2017 19:17:15 -0000","(qmail 105844 invoked by uid 89); 13 Sep 2017 19:17:14 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=xrZI\n\tkUltI9bZnpDoBARHAT521pjDPy5SN3PYC3e9G012EhQKp7t22F4oW+diF5ZTPYNQ\n\tVg3kSNCFap2QNA1F1llnD4garZgJFfV/n3LLI/gesMtA/7mDq4RfpeLoP915HadR\n\t5d0nG0DJBUHoGJ3NOu9Wi4KfylxCg0ui4yp0ktk=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=0s4oYOzOOd\n\tqqRP0FLc84NZu4BEw=; b=RRZV7JMD11ctXlxpPI9HdS9Wl5gGbK2lOzKRCbdpSa\n\tJV+nQfmrv1XdIlKAX61O0o0zD+v8a3QdzCXksFnKmOQQdPgmxjHUSi8087dN1Cz/\n\tXJEP/KyRloF43cuSSpraYsKE77A8UlEWVkVIcb4aGZiGgZwRykmKfT9lqrKYDq2m\n\tY=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=Hx-languages-length:2420,\n\tstates","X-HELO":"foss.arm.com","Date":"Wed, 13 Sep 2017 20:17:07 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170913191707.GD23415@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913172605.hjrwjf34kyy7coh4@localhost>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170913172605.hjrwjf34kyy7coh4@localhost>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1768224,"web_url":"http://patchwork.ozlabs.org/comment/1768224/","msgid":"<20170913222129.vi3lidawdp33zzjd@localhost>","list_archive_url":null,"date":"2017-09-13T22:21:29","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Wed, Sep 13, 2017 at 08:17:07PM +0100, Dave P Martin wrote:\n> On Wed, Sep 13, 2017 at 10:26:05AM -0700, Catalin Marinas wrote:\n> > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > +/*\n> > > + * Trapped SVE access\n> > > + */\n> > > +void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> > > +{\n> > > +\t/* Even if we chose not to use SVE, the hardware could still trap: */\n> > > +\tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> > > +\t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> > > +\t\treturn;\n> > > +\t}\n> > > +\n> > > +\ttask_fpsimd_save();\n> > > +\n> > > +\tsve_alloc(current);\n> > > +\tfpsimd_to_sve(current);\n> > > +\tif (test_and_set_thread_flag(TIF_SVE))\n> > > +\t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> > > +\n> > > +\ttask_fpsimd_load();\n> > > +}\n> > \n> > When this function is entered, do we expect TIF_SVE to always be\n> > cleared? It's worth adding a comment on the expected conditions. If\n> \n> Yes, and this is required for correctness, as you observe.\n> \n> I had a BUG_ON() here which I removed, but it makes sense to add a\n> comment to capture the precondition here, and how it is satisfied.\n> \n> > that's the case, task_fpsimd_save() would only save the FPSIMD state\n> > which is fine. However, you subsequently transfer the FPSIMD state to\n> > SVE, set TIF_SVE and restore the full SVE state. If we don't care about\n> > the SVE state here, can we call task_fpsimd_load() *before* setting\n> > TIF_SVE?\n> \n> There should be no way to reach this code with TIF_SVE set, unless\n> task_fpsimd_load() sets the CPACR trap bit wrongly, or the hardware is\n> broken -- either of which is a bug.\n\nThanks for confirming my assumptions. What I meant was rewriting the\nabove function as:\n\n\t/* reset the SVE state (other than FPSIMD) */\n\ttask_fpsimd_save();\n\ttask_fpsimd_load();\n\n\tsve_alloc(current);\n\tset_thread_flag(TIF_SVE);","headers":{"Return-Path":"<libc-alpha-return-84581-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84581-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"chTONC7P\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xsx3C1nwMz9sxR\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu, 14 Sep 2017 08:21:43 +1000 (AEST)","(qmail 121655 invoked by alias); 13 Sep 2017 22:21:35 -0000","(qmail 119991 invoked by uid 89); 13 Sep 2017 22:21:34 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=og8h\n\tlcqdEgHTPPWFRSzB37tuxvnVshzigEdV/nB/1K7IShRZgU1QhYqz0R/6Z2hzsWwM\n\tJVj5ayk6ifSKn852A5uKPUbWVxFrozQoBR5i/1ZMlJR0ZtZR9RjC/mWwz2YqB2oB\n\tV5L6RIx2tOWCJls0t9TnTSbp7wDRGPasGruZPPo=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=hLpPDx+B7v\n\t2nOm3FHwdn3PaQ5LA=; b=chTONC7PSDvtpvepz0gpyE4648HIUL+PS0urEU44uM\n\t3p5P8VcVK21ObR715Ov5mcQt9S2p8c7UVX1oukIoSDBJeljYYwlEwU9f698UBblZ\n\tT6yK3lfmblKFP5u/cYUz5TOgDLjyZcLBa+9Vo9YsLZofL7eGWsF38RxBmisR8OCa\n\t4=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=transfer","X-HELO":"foss.arm.com","Date":"Wed, 13 Sep 2017 15:21:29 -0700","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170913222129.vi3lidawdp33zzjd@localhost>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913172605.hjrwjf34kyy7coh4@localhost>\n\t<20170913191707.GD23415@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170913191707.GD23415@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1768799,"web_url":"http://patchwork.ozlabs.org/comment/1768799/","msgid":"<20170914193944.GA24231@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-09-14T19:40:41","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Wed, Sep 13, 2017 at 03:21:29PM -0700, Catalin Marinas wrote:\n> On Wed, Sep 13, 2017 at 08:17:07PM +0100, Dave P Martin wrote:\n> > On Wed, Sep 13, 2017 at 10:26:05AM -0700, Catalin Marinas wrote:\n> > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > > +/*\n> > > > + * Trapped SVE access\n> > > > + */\n> > > > +void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> > > > +{\n> > > > +\t/* Even if we chose not to use SVE, the hardware could still trap: */\n> > > > +\tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> > > > +\t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> > > > +\t\treturn;\n> > > > +\t}\n> > > > +\n> > > > +\ttask_fpsimd_save();\n> > > > +\n> > > > +\tsve_alloc(current);\n> > > > +\tfpsimd_to_sve(current);\n> > > > +\tif (test_and_set_thread_flag(TIF_SVE))\n> > > > +\t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> > > > +\n> > > > +\ttask_fpsimd_load();\n> > > > +}\n> > > \n> > > When this function is entered, do we expect TIF_SVE to always be\n> > > cleared? It's worth adding a comment on the expected conditions. If\n> > \n> > Yes, and this is required for correctness, as you observe.\n> > \n> > I had a BUG_ON() here which I removed, but it makes sense to add a\n> > comment to capture the precondition here, and how it is satisfied.\n> > \n> > > that's the case, task_fpsimd_save() would only save the FPSIMD state\n> > > which is fine. However, you subsequently transfer the FPSIMD state to\n> > > SVE, set TIF_SVE and restore the full SVE state. If we don't care about\n> > > the SVE state here, can we call task_fpsimd_load() *before* setting\n> > > TIF_SVE?\n> > \n> > There should be no way to reach this code with TIF_SVE set, unless\n> > task_fpsimd_load() sets the CPACR trap bit wrongly, or the hardware is\n> > broken -- either of which is a bug.\n> \n> Thanks for confirming my assumptions. What I meant was rewriting the\n> above function as:\n> \n> \t/* reset the SVE state (other than FPSIMD) */\n> \ttask_fpsimd_save();\n> \ttask_fpsimd_load();\n\nI think this works, but can you explain your rationale?\n\nI think the main effect of your suggestion is that it is cheaper, due\nto eliminating some unnecessary load/store operations.\n\nWe could go one better, and do\n\n\tmov\tv0.16b, v0.16b\n\tmov\tv1.16b, v1.16b\n\t// ...\n\tmov\tv31.16b, v31.16b\n\nwhich doesn't require any memory access.\n\nBut I still prefer to zero p0..p15, ffr for cleanliness, even though\nthe SVE programmer's model doesn't require this (unlike for the Z-reg\nhigh bits where we do need to zero them in order not to violate the\nprogrammer's model).\n\nCurrently sve_alloc()+task_fpsimd_load() ensures that all the non-FPSIMD\nregs are zeroed too, in addition to the Z-reg high bits.\n\nSo we might want a special-purpose helper -- if so, we can do it all\nwith no memory access.\n\n\tpfalse\tp0.b\n\t// ..\n\tpfalse\tp15.b\n\twrffr\tp0.b\n\nThis would allow the memset-zero an sve_alloc() to be removed, but I\nwould need to check what other code is relying on it.\n\nI guess I hadn't done this because I viewed it as an optimisation.\n\nThoughts?\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-84631-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84631-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"Op5ah2Uw\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtTRJ1QNHz9s7f\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 15 Sep 2017 05:40:59 +1000 (AEST)","(qmail 90167 invoked by alias); 14 Sep 2017 19:40:54 -0000","(qmail 90154 invoked by uid 89); 14 Sep 2017 19:40:54 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=s0te\n\t1qRECcqFp/HyJ+tWdjK9T/3eD5oqiAVaivw77IAz6MvsKcdhXzyN85vdExXkA5Ji\n\tGKi41mCkoselk1Hj6nJzXUR3OhKmHqNuWo4VWKxVTJzEeG5Z+6pcRDJv0cLp+lP/\n\tjWYmVlRwwYvZMNqPsTioH1VaJo6ZN0bl5REDpw4=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=29HbtYoY7U\n\tvOJ3BKqnKOs1g3+F4=; b=Op5ah2UwDqPA8BW9b3aFrUaoRNwcRjevyYBcJ2egAB\n\t7YGcd6IpI0ntTeunFYZmUbs3iURKxxemsL04iglHURsaSABiFIA8ccK289dNarxC\n\tNPFqDpuM68p6kU5Qyj+0Ekd8euUCPF4ffoCuSH1OwzTpXoLwkPYQQrnmcuuU4A89\n\t0=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"foss.arm.com","Date":"Thu, 14 Sep 2017 20:40:41 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170914193944.GA24231@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913172605.hjrwjf34kyy7coh4@localhost>\n\t<20170913191707.GD23415@e103592.cambridge.arm.com>\n\t<20170913222129.vi3lidawdp33zzjd@localhost>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170913222129.vi3lidawdp33zzjd@localhost>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1768807,"web_url":"http://patchwork.ozlabs.org/comment/1768807/","msgid":"<20170914195555.GB24231@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-09-14T19:55:56","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > +/*\n> > + * Handle SVE state across fork():\n> > + *\n> > + * dst and src must not end up with aliases of the same sve_state.\n> > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > + * state for dst here, so long as we take care to retain the FPSIMD\n> > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > + * will be deferred until dst tries to use SVE.\n> > + */\n> > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > +{\n> > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > +\t\tsve_to_fpsimd(dst);\n> > +\t}\n> > +\n> > +\tdst->thread.sve_state = NULL;\n> > +}\n> \n> I first thought the thread flags are not visible in dst yet since\n> dup_task_struct() calls arch_dup_task_struct() before\n> setup_thread_stack(). However, at the end of the last year we enabled\n> CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> on this.\n\nHmmm, I see your point, but there are some sequencing issues here.\n\n> Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> arch_dup_task_struct() already called fpsimd_preserve_current_state()\n\nI consider SVE discard as an optional side effect of task_fpsimd_save(),\nnot something that is guaranteed to happen -- the decision about whether\nto do so may become more intelligent later on.  So, for src, we may\ndiscard SVE (because syscall), but for dst we must NULL .sve_state (and\ntherefore clear TIF_SVE) simply to avoid aliasing of src->sve_state and\ndst->sve_state.\n\nThe latter requires operating on the thread_flags of dst.  I'll need to\ncheck whether there's another suitable hook for updating the thread flags\nof dst, if we aren't confident that they will always have been\ninitialised by the time arch_dup_task_struct() is called.\n\n\nEither way, there would be an intra-thread ordering requirement between\nthe task_struct and thread_info updates here, _if_ dst were schedulable:\ndst:TIF_SVE must be cleared before dst->sve_state is NULLed.\n\ndst is not schedulable until fork is done though, so maybe this doesn't\nreally matter...\n\n> for src, so the FPSIMD state (which we care about) is transferred during\n> the *dst = *src assignment. So you'd only need the last statement,\n> possibly with a different function name like fpsimd_erase_sve (and maybe\n> make the function static inline in the header).\n\nNot quite: TIF_SVE must be cleared so that a context switch or\nkernel_neon_begin() after dst is scheduled doesn't try to save state in\nthe (NULL) dst->sve_state.\n\n> \n> [...]\n> >  int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)\n> > @@ -246,6 +247,9 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)\n> >  \tif (current->mm)\n> >  \t\tfpsimd_preserve_current_state();\n> >  \t*dst = *src;\n> > +\n> > +\tfpsimd_dup_sve(dst, src);\n> > +\n> >  \treturn 0;\n> >  }\n\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-84632-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84632-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"sY0YVucJ\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtTmn5Yj4z9s82\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 15 Sep 2017 05:56:09 +1000 (AEST)","(qmail 50536 invoked by alias); 14 Sep 2017 19:56:03 -0000","(qmail 50525 invoked by uid 89); 14 Sep 2017 19:56:03 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=cavw\n\tI2nMVn4W4NgNRlMC5R1ZuHuEkdKsxF2o4w1Zg+Bn0zZ6Vro2dU5GK9w6PVemMt+C\n\t/hPDpCfTCCaWkT74GBXPZ+nzq7UeFLTuaEaIWzGS5uhGTL87XqOFndIDfYSUkq+p\n\tFOTYGXMVL78d8pZRy0nmxIiYNKo6hKOcWV5gOxk=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=NTOjWBAm32\n\tq2Ye0W+AoMCtVARtA=; b=sY0YVucJiRf1/rhztW9MU3nqrp4RAqh5k3j2v3kq1a\n\tmML3PlRzsn06l1D+hMgeaZHbUokr2Mfwc6Zim/nCaUUeMayjaM03VRd1nEPaMb6L\n\tVwcrG9PeWbR7duJlbkGbYjJxbRfFhOjIqe+KLkQEkaL1Z/Jfo3HivaQW4yU6olPO\n\tI=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=Hx-languages-length:3086","X-HELO":"foss.arm.com","Date":"Thu, 14 Sep 2017 20:55:56 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170914195555.GB24231@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170913143325.hi4cfcajbts3bbao@localhost>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1771220,"web_url":"http://patchwork.ozlabs.org/comment/1771220/","msgid":"<20170919171332.sjlhwnxqklmb5wsx@armageddon.cambridge.arm.com>","list_archive_url":null,"date":"2017-09-19T17:13:33","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Thu, Sep 14, 2017 at 08:40:41PM +0100, Dave P Martin wrote:\n> On Wed, Sep 13, 2017 at 03:21:29PM -0700, Catalin Marinas wrote:\n> > On Wed, Sep 13, 2017 at 08:17:07PM +0100, Dave P Martin wrote:\n> > > On Wed, Sep 13, 2017 at 10:26:05AM -0700, Catalin Marinas wrote:\n> > > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > > > +/*\n> > > > > + * Trapped SVE access\n> > > > > + */\n> > > > > +void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> > > > > +{\n> > > > > +\t/* Even if we chose not to use SVE, the hardware could still trap: */\n> > > > > +\tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> > > > > +\t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> > > > > +\t\treturn;\n> > > > > +\t}\n> > > > > +\n> > > > > +\ttask_fpsimd_save();\n> > > > > +\n> > > > > +\tsve_alloc(current);\n> > > > > +\tfpsimd_to_sve(current);\n> > > > > +\tif (test_and_set_thread_flag(TIF_SVE))\n> > > > > +\t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> > > > > +\n> > > > > +\ttask_fpsimd_load();\n> > > > > +}\n> > > > \n> > > > When this function is entered, do we expect TIF_SVE to always be\n> > > > cleared? It's worth adding a comment on the expected conditions. If\n> > > \n> > > Yes, and this is required for correctness, as you observe.\n> > > \n> > > I had a BUG_ON() here which I removed, but it makes sense to add a\n> > > comment to capture the precondition here, and how it is satisfied.\n> > > \n> > > > that's the case, task_fpsimd_save() would only save the FPSIMD state\n> > > > which is fine. However, you subsequently transfer the FPSIMD state to\n> > > > SVE, set TIF_SVE and restore the full SVE state. If we don't care about\n> > > > the SVE state here, can we call task_fpsimd_load() *before* setting\n> > > > TIF_SVE?\n> > > \n> > > There should be no way to reach this code with TIF_SVE set, unless\n> > > task_fpsimd_load() sets the CPACR trap bit wrongly, or the hardware is\n> > > broken -- either of which is a bug.\n> > \n> > Thanks for confirming my assumptions. What I meant was rewriting the\n> > above function as:\n> > \n> > \t/* reset the SVE state (other than FPSIMD) */\n> > \ttask_fpsimd_save();\n> > \ttask_fpsimd_load();\n> \n> I think this works, but can you explain your rationale?\n> \n> I think the main effect of your suggestion is that it is cheaper, due\n> to eliminating some unnecessary load/store operations.\n\nMy rationale was to avoid copying between the in-memory FPSIMD and SVE\nstate.\n\n> We could go one better, and do\n> \n> \tmov\tv0.16b, v0.16b\n> \tmov\tv1.16b, v1.16b\n> \t// ...\n> \tmov\tv31.16b, v31.16b\n> \n> which doesn't require any memory access.\n\nYes, that's even better.\n\n> But I still prefer to zero p0..p15, ffr for cleanliness, even though\n> the SVE programmer's model doesn't require this (unlike for the Z-reg\n> high bits where we do need to zero them in order not to violate the\n> programmer's model).\n\nI missed the px, ffr aspect. Can you not have a clear_sve_state() (or a\nbetter name) function to zero the predicate regs, ffr and the top bits\nof the vectors?\n\n> Currently sve_alloc()+task_fpsimd_load() ensures that all the non-FPSIMD\n> regs are zeroed too, in addition to the Z-reg high bits.\n\nYes, just wondering if this can be implemented with less memory accesses\nsince the SVE state is irrelevant at this stage.\n\n> \n> So we might want a special-purpose helper -- if so, we can do it all\n> with no memory access.\n> \n> \tpfalse\tp0.b\n> \t// ..\n> \tpfalse\tp15.b\n> \twrffr\tp0.b\n> \n> This would allow the memset-zero an sve_alloc() to be removed, but I\n> would need to check what other code is relying on it.\n> \n> I guess I hadn't done this because I viewed it as an optimisation.\n\nIt looked like some low-hanging optimisation to slightly accelerate the\nallocation of the SVE state on access, though I'm also worried I don't\nfully understand all the corner cases (like what happens if we allow\ninterrupts during this function and get preempted).\n\nAnyway, I'm fine to leave this as it is for now and try to optimise it\nlater with additional patches on top.","headers":{"Return-Path":"<libc-alpha-return-84753-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84753-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"AD7UYeQG\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xxTx845C2z9sMN\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 20 Sep 2017 03:13:48 +1000 (AEST)","(qmail 97891 invoked by alias); 19 Sep 2017 17:13:41 -0000","(qmail 97846 invoked by uid 89); 19 Sep 2017 17:13:40 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=E5DI\n\tISLgpNKCDKVRHLMJLOunRnKIqnYqZDSEKUEVbYODYn4nqis/eadEIArpWA8RrZjg\n\t+eNC3M+eSNspnQ/qcE9rc/zlN5NrEcsE8Z4DJXrYv2P3m0NvMJba6NHqJgbiwwJV\n\t4GKJhzlyd1yKPP0/6tjRHaPV9MQM3CcgWYeJlK0=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=wAlMARFO5F\n\tT9dPzUBon9QHl87lA=; b=AD7UYeQGrv/LV2dBk5V84anx711cTTgqakorwZFX0u\n\tTFePxZ5VkQSe1Ztv2PTl4KWwB11YN3l6HWNu50tauQkR619bKNwB8NVzqk+jvAxU\n\tVdCgUVYatET0qGVD/JVVusGSu9YwsonHluLUhmJa3bEDelNPEeYMUjBGStBtqk2b\n\tE=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=cheaper, 43pm,\n\t05am, 43PM","X-HELO":"foss.arm.com","Date":"Tue, 19 Sep 2017 18:13:33 +0100","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170919171332.sjlhwnxqklmb5wsx@armageddon.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913172605.hjrwjf34kyy7coh4@localhost>\n\t<20170913191707.GD23415@e103592.cambridge.arm.com>\n\t<20170913222129.vi3lidawdp33zzjd@localhost>\n\t<20170914193944.GA24231@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170914193944.GA24231@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1771885,"web_url":"http://patchwork.ozlabs.org/comment/1771885/","msgid":"<20170920135856.dfuowttwi4bk4nqb@localhost>","list_archive_url":null,"date":"2017-09-20T13:58:56","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Thu, Sep 14, 2017 at 08:55:56PM +0100, Dave P Martin wrote:\n> On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > +/*\n> > > + * Handle SVE state across fork():\n> > > + *\n> > > + * dst and src must not end up with aliases of the same sve_state.\n> > > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > > + * state for dst here, so long as we take care to retain the FPSIMD\n> > > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > > + * will be deferred until dst tries to use SVE.\n> > > + */\n> > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > > +{\n> > > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > > +\t\tsve_to_fpsimd(dst);\n> > > +\t}\n> > > +\n> > > +\tdst->thread.sve_state = NULL;\n> > > +}\n> > \n> > I first thought the thread flags are not visible in dst yet since\n> > dup_task_struct() calls arch_dup_task_struct() before\n> > setup_thread_stack(). However, at the end of the last year we enabled\n> > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> > on this.\n> \n> Hmmm, I see your point, but there are some sequencing issues here.\n> \n> > Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> > arch_dup_task_struct() already called fpsimd_preserve_current_state()\n> \n> I consider SVE discard as an optional side effect of task_fpsimd_save(),\n> not something that is guaranteed to happen -- the decision about whether\n> to do so may become more intelligent later on.  So, for src, we may\n> discard SVE (because syscall), but for dst we must NULL .sve_state (and\n> therefore clear TIF_SVE) simply to avoid aliasing of src->sve_state and\n> dst->sve_state.\n\nMy point was that the SVE state of src is already preserved at this\npoint and copied into dst. You don't need the sve_to_fpsimd(dst) again\nwhich basically does the same copying of the src SVE saved state into\nthe FPSIMD one in dst. This has already been done in\narch_dup_task_struct() by the combination of\nfpsimd_preserve_current_state() and *dst = *src (and, of course,\nclearing TIF_SVE in dst).\n\nI don't think the TIF_SVE clearing in src is just a side effect of\ntask_fpsimd_save() here but rather a requirement. When returning from\nfork(), both src and dst would need to have the same state. However,\nyour fpsimd_dup_sve() implementation makes it very clear that the SVE\nstate is lost in dst. This is only allowed if we also lose it in src (as\na result of a syscall). So making dst->sve_state = NULL requires that\nTIF_SVE is also cleared in both src and dst. Alternatively, you'd have\nto allocate a new state here and copy the full src SVE state across to\ndst, together with setting TIF_SVE (that's not necessary, however, since\nwe get here as a result of a syscall).\n\n> > for src, so the FPSIMD state (which we care about) is transferred during\n> > the *dst = *src assignment. So you'd only need the last statement,\n> > possibly with a different function name like fpsimd_erase_sve (and maybe\n> > make the function static inline in the header).\n> \n> Not quite: TIF_SVE must be cleared so that a context switch or\n> kernel_neon_begin() after dst is scheduled doesn't try to save state in\n> the (NULL) dst->sve_state.\n\nYes, TIF_SVE must also be cleared in dst when dst->sve_state = NULL (I\nmay have forgotten to mention this).","headers":{"Return-Path":"<libc-alpha-return-84785-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-84785-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"bn/fK4TL\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xy1ZD37rlz9s7h\n\tfor <incoming@patchwork.ozlabs.org>;\n\tWed, 20 Sep 2017 23:59:16 +1000 (AEST)","(qmail 104410 invoked by alias); 20 Sep 2017 13:59:09 -0000","(qmail 102987 invoked by uid 89); 20 Sep 2017 13:59:08 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=aQnZ\n\txZVl3H43aHL8ZDFigcVKZjNaBEmK+7If0yRwt9DTuLtpHGFmm/nqhVUgeBaR4/ds\n\tClIMEHy6PFbi6fqPyyGtr80reoaoFqVP9Q8ICD4FZkx0+OmNFaJMigr1oqAZVDzk\n\tc9VmPCF3qk7cLa3RwUONXqZWCYOUKPIyOkLXh4s=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=4rz7vgx7hr\n\tfVoDWBDYYpGNYUzvM=; b=bn/fK4TL3MPkLL6Msiv+4euase3zz7xFwJcJEXelyq\n\tDQwpbZNw8ec2TaJ8zDLYpTXLpQymeAyCh90Fv+lLXtCHgDW0N0izQo/zJnpGjPpF\n\txzJD77wKOkg0Oes8hxRVEHLgYd2MswqmU9wR60mw1RYTBHB+2PgpP0SytdBPBTuG\n\tw=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=4.1 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPAM_BODY,\n\tSPF_PASS autolearn=no version=3.3.2 spammy=forgotten","X-HELO":"foss.arm.com","Date":"Wed, 20 Sep 2017 14:58:56 +0100","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20170920135856.dfuowttwi4bk4nqb@localhost>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20170914195555.GB24231@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170914195555.GB24231@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1778841,"web_url":"http://patchwork.ozlabs.org/comment/1778841/","msgid":"<20171003111100.GQ3611@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-03T11:11:01","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Wed, Sep 20, 2017 at 02:58:56PM +0100, Catalin Marinas wrote:\n> On Thu, Sep 14, 2017 at 08:55:56PM +0100, Dave P Martin wrote:\n> > On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > > +/*\n> > > > + * Handle SVE state across fork():\n> > > > + *\n> > > > + * dst and src must not end up with aliases of the same sve_state.\n> > > > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > > > + * state for dst here, so long as we take care to retain the FPSIMD\n> > > > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > > > + * will be deferred until dst tries to use SVE.\n> > > > + */\n> > > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > > > +{\n> > > > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > > > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > > > +\t\tsve_to_fpsimd(dst);\n> > > > +\t}\n> > > > +\n> > > > +\tdst->thread.sve_state = NULL;\n> > > > +}\n> > > \n> > > I first thought the thread flags are not visible in dst yet since\n> > > dup_task_struct() calls arch_dup_task_struct() before\n> > > setup_thread_stack(). However, at the end of the last year we enabled\n> > > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> > > on this.\n> > \n> > Hmmm, I see your point, but there are some sequencing issues here.\n> > \n> > > Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> > > arch_dup_task_struct() already called fpsimd_preserve_current_state()\n> > \n> > I consider SVE discard as an optional side effect of task_fpsimd_save(),\n> > not something that is guaranteed to happen -- the decision about whether\n> > to do so may become more intelligent later on.  So, for src, we may\n> > discard SVE (because syscall), but for dst we must NULL .sve_state (and\n> > therefore clear TIF_SVE) simply to avoid aliasing of src->sve_state and\n> > dst->sve_state.\n> \n> My point was that the SVE state of src is already preserved at this\n> point and copied into dst. You don't need the sve_to_fpsimd(dst) again\n> which basically does the same copying of the src SVE saved state into\n> the FPSIMD one in dst. This has already been done in\n> arch_dup_task_struct() by the combination of\n> fpsimd_preserve_current_state() and *dst = *src (and, of course,\n> clearing TIF_SVE in dst).\n> \n> I don't think the TIF_SVE clearing in src is just a side effect of\n> task_fpsimd_save() here but rather a requirement. When returning from\n> fork(), both src and dst would need to have the same state. However,\n> your fpsimd_dup_sve() implementation makes it very clear that the SVE\n> state is lost in dst. This is only allowed if we also lose it in src (as\n> a result of a syscall). So making dst->sve_state = NULL requires that\n> TIF_SVE is also cleared in both src and dst. Alternatively, you'd have\n> to allocate a new state here and copy the full src SVE state across to\n> dst, together with setting TIF_SVE (that's not necessary, however, since\n> we get here as a result of a syscall).\n\nThe currently intended ABI is that the SVE bits are unspecified after a\nsyscall, so it is legitimate (though perhaps surprising) for different\nthings to happen in dst and src.\n\nThis complicates things a lot though, just to avoid the next SVE usage\nexception in src after the fork.\n\n\nIt should be simpler to do what you suggest and discard the SVE state of\nsrc unconditionally before the copy: then we really are just cloning the\nthread apart from the need to set dst->thread.sve_state to NULL.\n\nfpsimd_preserve_current_state() does not necessarily write back to\ncurrent->thread.fpsmid_state though: at the moment, it does do this as a\nside effect of task_fpsimd_save() because we happen to be in a syscall\n(i.e., fork).  \n\nWhat we really want is unconditional discarding of the state.  This\nwasn't needed anywhere else yet, so there's no explicit helper for it.\nBut it makes sense to add one.\n\nWhat about refactoring along these lines:\n\n\nfpsimd.c:\n/* Unconditionally discard the SVE state */\nvoid task_sve_discard(struct task_struct *task)\n{\n\tif (!system_supports_sve())\n\t\treturn;\n\n\tlocal_bh_disable();\n\tif (test_and_clear_tsk_thread_flag(task, TIF_SVE))\n\t\tsve_to_fpsimd(task);\n\tlocal_bh_enable();\n}\n\nprocess.c:\nint arch_dup_task_struct(sturct task_struct *dst, struct task_struct *src)\n{\n\tif (current->mm) {\n\t\tfpsimd_preserve_current_state();\n\t\ttask_sve_discard(src);\n\t}\n\n\t*dst = *src;\n\n\tdst->thread.sve_state = NULL;\n}\n\n\nThis also avoids having to touch dst's thread flags, since now we\nare just cloning the task except for assigning NULL to\ndst->thread.sve_state.\n\n\n> > > for src, so the FPSIMD state (which we care about) is transferred during\n> > > the *dst = *src assignment. So you'd only need the last statement,\n> > > possibly with a different function name like fpsimd_erase_sve (and maybe\n> > > make the function static inline in the header).\n> > \n> > Not quite: TIF_SVE must be cleared so that a context switch or\n> > kernel_neon_begin() after dst is scheduled doesn't try to save state in\n> > the (NULL) dst->sve_state.\n> \n> Yes, TIF_SVE must also be cleared in dst when dst->sve_state = NULL (I\n> may have forgotten to mention this).\n\nWith the above factoring, the constraint \"TIF_SVE implies sve_state\nvalid\" is then never false, because TIF_SVE is already cleared before\nthe NULL assignment.\n\nWhat do you think?\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-85281-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85281-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"fxJt6ycE\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y5xDM5wXsz9sxR\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue,  3 Oct 2017 22:11:15 +1100 (AEDT)","(qmail 76422 invoked by alias); 3 Oct 2017 11:11:10 -0000","(qmail 75469 invoked by uid 89); 3 Oct 2017 11:11:08 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=wCwC\n\trp3m6a6ThC/MOJBYxYR+1X2l4znqv5hNYcnwPFNGhlCSvfnfAei4mvl7OE9uTK+T\n\t8XOlKlBlN+hRiozyhHXzgXEQfrMzMU/qmDLHcg8psNSAIDzQu+BjwTeYu6HYDtMK\n\tClXCPHN9AQB4BUPnl0EeuNOdRH/8tEBwjAELb0I=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=fduX9390vy\n\tpP1twnSiaangMuu80=; b=fxJt6ycEz4N63RIsLLm5ol/W8TkCJ8F1MwSvT86/P7\n\ttOXoYJyMf0Rxkiv7NX3xHf0MopiEY9PMLSByg/yNbCrdU6QsW2T5Y6sK+zUOFsC6\n\t+roOBnn7M51iwDAiMsHzLSBpBVA9EkHRWYqAdad4bJuorWCipOcVtOd9AXscXmV/\n\tU=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=clearing, apart","X-HELO":"foss.arm.com","Date":"Tue, 3 Oct 2017 12:11:01 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171003111100.GQ3611@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20170914195555.GB24231@e103592.cambridge.arm.com>\n\t<20170920135856.dfuowttwi4bk4nqb@localhost>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170920135856.dfuowttwi4bk4nqb@localhost>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1778850,"web_url":"http://patchwork.ozlabs.org/comment/1778850/","msgid":"<20171003113302.GR3611@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-03T11:33:03","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > +/*\n> > + * Handle SVE state across fork():\n> > + *\n> > + * dst and src must not end up with aliases of the same sve_state.\n> > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > + * state for dst here, so long as we take care to retain the FPSIMD\n> > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > + * will be deferred until dst tries to use SVE.\n> > + */\n> > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > +{\n> > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > +\t\tsve_to_fpsimd(dst);\n> > +\t}\n> > +\n> > +\tdst->thread.sve_state = NULL;\n> > +}\n> \n> I first thought the thread flags are not visible in dst yet since\n> dup_task_struct() calls arch_dup_task_struct() before\n> setup_thread_stack(). However, at the end of the last year we enabled\n> CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> on this.\n> \n> Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> arch_dup_task_struct() already called fpsimd_preserve_current_state()\n> for src, so the FPSIMD state (which we care about) is transferred during\n> the *dst = *src assignment. So you'd only need the last statement,\n> possibly with a different function name like fpsimd_erase_sve (and maybe\n> make the function static inline in the header).\n\nRegarding the intended meanings of the thread flags, does this help?\n\n--8<--\n\nTIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise\nunchanged:\n\n * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU\n   registers of the CPU the task is running on contain the authoritative\n   FPSIMD/SVE state of the task.  The backing memory may be stale.\n\n * Otherwise (i.e., task not running, or task running and\n   TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is\n   authoritative.  If additionally per_cpu(fpsimd_last_state,\n   task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then\n   task->fpsimd_state.cpu's registers are also up to date for task, but\n   not authorititive: the current FPSIMD/SVE state may be read from\n   them, but they must not be written.\n \nThe FPSIMD/SVE backing memory is selected by TIF_SVE:\n\n * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are\n   stored in task->thread.sve_state, formatted appropriately for vector\n   length task->thread.sve_vl.  task->thread.sve_state must point to a\n   valid buffer at least sve_state_size(task) bytes in size.\n\n * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are\n   logically zero[*] but not stored anywhere; Pn, FFR are not stored and\n   have unspecified values from userspace's point of view.\n   task->thread.sve_state does not need to be non-null, valid or any\n   particular size: it must not be dereferenced.\n\n   In practice I don't exploit the \"unspecifiedness\" much.  The Zn high\n   bits, Pn and FFR are all zeroed when setting TIF_SVE again:\n   sve_alloc() is the common path for this.\n\n * FPSR and FPCR are always stored in task->fpsimd_state irrespctive of\n   whether TIF_SVE is clear or set, since these are not vector length\n   dependent.\n\n\n[*] theoretically unspecified, which is what I've written in sve.txt.\nHowever, on any FPSIMD Vn write the upper bits of the corresponding Zn\nmust become logically zero in order to conform to the SVE programmer's\nmodel.  It's not feasible to track which Vn have been written\nindividually because that would involve trapping every FPSIMD insn until\nall possible Vn have been written.  So the kernel ensures that the Zn\nhigh bits become zero.\n\nMaybe we should just guarantee \"zero-or-preserved\" behaviour for\nuserspace.  This may close down some future optimisation opportunities,\nbut maybe it's better to be simple.\n\nThoughts?\n\n[...]\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-85283-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85283-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"Jl0c//iD\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y5xjn54k2z9t2Q\n\tfor <incoming@patchwork.ozlabs.org>;\n\tTue,  3 Oct 2017 22:33:17 +1100 (AEDT)","(qmail 99743 invoked by alias); 3 Oct 2017 11:33:10 -0000","(qmail 99728 invoked by uid 89); 3 Oct 2017 11:33:10 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=TbsQ\n\tgzuMeqePUYt42QPFYkoYVV+Gox9AvBbr3Y1aMHGRsvITMDDbSoDKkj88T0jqobKZ\n\t/CAQQHGhnTZfW9fYx9B0yjvTCFO4YNyYekEM4+jSSgozGzsLaK5XdCN4DHnJ/YaJ\n\tvVUUcubIbKxvELAyfYT9W49amGMqo/DS3yq0LJ8=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=EJBSbF6LmU\n\tBqiOl2ycBmxUQqBQI=; b=Jl0c//iD4A70icRVKnW6Pk2HflvG9KZryKH78NKs/K\n\tEPlpNiVrYIZ3sapV62D0nr94dlKiZ8EcLFspcgWHFdSFyPkuGe8FP+a936s6Il7g\n\tbbMvAdbDjcCU8c4fsmP1DkRldl0EFx7l7iP3Hu0E39Wn58QQHW212Pk0VNSLMIl5\n\t0=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=conform, tasks,\n\tauthoritative","X-HELO":"foss.arm.com","Date":"Tue, 3 Oct 2017 12:33:03 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171003113302.GR3611@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20170913143325.hi4cfcajbts3bbao@localhost>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1780005,"web_url":"http://patchwork.ozlabs.org/comment/1780005/","msgid":"<20171004172955.tyooqxdoo73iq66u@armageddon.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-04T17:29:56","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Tue, Oct 03, 2017 at 12:11:01PM +0100, Dave P Martin wrote:\n> On Wed, Sep 20, 2017 at 02:58:56PM +0100, Catalin Marinas wrote:\n> > On Thu, Sep 14, 2017 at 08:55:56PM +0100, Dave P Martin wrote:\n> > > On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> > > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > > > +/*\n> > > > > + * Handle SVE state across fork():\n> > > > > + *\n> > > > > + * dst and src must not end up with aliases of the same sve_state.\n> > > > > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > > > > + * state for dst here, so long as we take care to retain the FPSIMD\n> > > > > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > > > > + * will be deferred until dst tries to use SVE.\n> > > > > + */\n> > > > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > > > > +{\n> > > > > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > > > > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > > > > +\t\tsve_to_fpsimd(dst);\n> > > > > +\t}\n> > > > > +\n> > > > > +\tdst->thread.sve_state = NULL;\n> > > > > +}\n> > > > \n> > > > I first thought the thread flags are not visible in dst yet since\n> > > > dup_task_struct() calls arch_dup_task_struct() before\n> > > > setup_thread_stack(). However, at the end of the last year we enabled\n> > > > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> > > > on this.\n> > > \n> > > Hmmm, I see your point, but there are some sequencing issues here.\n> > > \n> > > > Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> > > > arch_dup_task_struct() already called fpsimd_preserve_current_state()\n> > > \n> > > I consider SVE discard as an optional side effect of task_fpsimd_save(),\n> > > not something that is guaranteed to happen -- the decision about whether\n> > > to do so may become more intelligent later on.  So, for src, we may\n> > > discard SVE (because syscall), but for dst we must NULL .sve_state (and\n> > > therefore clear TIF_SVE) simply to avoid aliasing of src->sve_state and\n> > > dst->sve_state.\n> > \n> > My point was that the SVE state of src is already preserved at this\n> > point and copied into dst. You don't need the sve_to_fpsimd(dst) again\n> > which basically does the same copying of the src SVE saved state into\n> > the FPSIMD one in dst. This has already been done in\n> > arch_dup_task_struct() by the combination of\n> > fpsimd_preserve_current_state() and *dst = *src (and, of course,\n> > clearing TIF_SVE in dst).\n> > \n> > I don't think the TIF_SVE clearing in src is just a side effect of\n> > task_fpsimd_save() here but rather a requirement. When returning from\n> > fork(), both src and dst would need to have the same state. However,\n> > your fpsimd_dup_sve() implementation makes it very clear that the SVE\n> > state is lost in dst. This is only allowed if we also lose it in src (as\n> > a result of a syscall). So making dst->sve_state = NULL requires that\n> > TIF_SVE is also cleared in both src and dst. Alternatively, you'd have\n> > to allocate a new state here and copy the full src SVE state across to\n> > dst, together with setting TIF_SVE (that's not necessary, however, since\n> > we get here as a result of a syscall).\n> \n> The currently intended ABI is that the SVE bits are unspecified after a\n> syscall, so it is legitimate (though perhaps surprising) for different\n> things to happen in dst and src.\n> \n> This complicates things a lot though, just to avoid the next SVE usage\n> exception in src after the fork.\n> \n> \n> It should be simpler to do what you suggest and discard the SVE state of\n> src unconditionally before the copy: then we really are just cloning the\n> thread apart from the need to set dst->thread.sve_state to NULL.\n> \n> fpsimd_preserve_current_state() does not necessarily write back to\n> current->thread.fpsmid_state though: at the moment, it does do this as a\n> side effect of task_fpsimd_save() because we happen to be in a syscall\n> (i.e., fork).  \n> \n> What we really want is unconditional discarding of the state.  This\n> wasn't needed anywhere else yet, so there's no explicit helper for it.\n> But it makes sense to add one.\n> \n> What about refactoring along these lines:\n> \n> \n> fpsimd.c:\n> /* Unconditionally discard the SVE state */\n> void task_sve_discard(struct task_struct *task)\n> {\n> \tif (!system_supports_sve())\n> \t\treturn;\n> \n> \tlocal_bh_disable();\n> \tif (test_and_clear_tsk_thread_flag(task, TIF_SVE))\n> \t\tsve_to_fpsimd(task);\n> \tlocal_bh_enable();\n> }\n> \n> process.c:\n> int arch_dup_task_struct(sturct task_struct *dst, struct task_struct *src)\n> {\n> \tif (current->mm) {\n> \t\tfpsimd_preserve_current_state();\n> \t\ttask_sve_discard(src);\n> \t}\n> \n> \t*dst = *src;\n> \n> \tdst->thread.sve_state = NULL;\n> }\n> \n> \n> This also avoids having to touch dst's thread flags, since now we\n> are just cloning the task except for assigning NULL to\n> dst->thread.sve_state.\n\nThis looks fine to me, the execution of task_sve_discard() is nearly a\nno-op with the current code.\n\nWe still have some local_bh_disable/enable() calls, though I don't think\nit's worth optimising them now (e.g. having a\nfpsimd_preserve_current_state_and_discard_sve() function with a\n\"sve_discard\" argument to task_fpsimd_save() to force this).","headers":{"Return-Path":"<libc-alpha-return-85375-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85375-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"B0TZUvC8\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y6jbT2Dmyz9t44\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu,  5 Oct 2017 04:30:29 +1100 (AEDT)","(qmail 98231 invoked by alias); 4 Oct 2017 17:30:15 -0000","(qmail 90480 invoked by uid 89); 4 Oct 2017 17:30:03 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=hbrg\n\t7pWMeECAhXSGGufHhBaXfqr8/VXq+oh+LGZL2+cNalHnudr5MpbDm8rJUqZfMmx0\n\tXEJc4OCw7dZQfXVVDEthKKDckrclrvpUw/OD7bHF3X+/KXWPx2GqueLj/fDQM6YS\n\tHuYR6iLF6YeM/nD1Tyx7pqv+bc+He2j4tQTYLnc=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=0o54ddxD/c\n\t8sXoI2lYj0j1hEwo4=; b=B0TZUvC8nuuRe9JYb/g3bp55ogRKR8wwQOvOSGCqzY\n\t66Oc/BNf4WbeSqeNqKEH/WVlWPgKfZZZ1TE9HnVzdHWwVKWTnIjdPEmPqyRSIcyn\n\t4cJa3pigJlv0DVyQSJm8rWJHpKBwwhlaYNVSC/JXEujaLT5dugo3o8pVA+k5xMOJ\n\t4=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=4.1 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPAM_BODY,\n\tSPF_PASS autolearn=no version=3.3.2 spammy=H*Ad:D*columbia.edu,\n\tcleared","X-HELO":"foss.arm.com","Date":"Wed, 4 Oct 2017 18:29:56 +0100","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171004172955.tyooqxdoo73iq66u@armageddon.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20170914195555.GB24231@e103592.cambridge.arm.com>\n\t<20170920135856.dfuowttwi4bk4nqb@localhost>\n\t<20171003111100.GQ3611@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20171003111100.GQ3611@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1780508,"web_url":"http://patchwork.ozlabs.org/comment/1780508/","msgid":"<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-05T11:28:35","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:\n> On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > +/*\n> > > + * Handle SVE state across fork():\n> > > + *\n> > > + * dst and src must not end up with aliases of the same sve_state.\n> > > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > > + * state for dst here, so long as we take care to retain the FPSIMD\n> > > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > > + * will be deferred until dst tries to use SVE.\n> > > + */\n> > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > > +{\n> > > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > > +\t\tsve_to_fpsimd(dst);\n> > > +\t}\n> > > +\n> > > +\tdst->thread.sve_state = NULL;\n> > > +}\n> > \n> > I first thought the thread flags are not visible in dst yet since\n> > dup_task_struct() calls arch_dup_task_struct() before\n> > setup_thread_stack(). However, at the end of the last year we enabled\n> > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> > on this.\n> > \n> > Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> > arch_dup_task_struct() already called fpsimd_preserve_current_state()\n> > for src, so the FPSIMD state (which we care about) is transferred during\n> > the *dst = *src assignment. So you'd only need the last statement,\n> > possibly with a different function name like fpsimd_erase_sve (and maybe\n> > make the function static inline in the header).\n> \n> Regarding the intended meanings of the thread flags, does this help?\n> \n> --8<--\n> \n> TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise\n> unchanged:\n> \n>  * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU\n>    registers of the CPU the task is running on contain the authoritative\n>    FPSIMD/SVE state of the task.  The backing memory may be stale.\n> \n>  * Otherwise (i.e., task not running, or task running and\n>    TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is\n>    authoritative.  If additionally per_cpu(fpsimd_last_state,\n>    task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then\n>    task->fpsimd_state.cpu's registers are also up to date for task, but\n>    not authorititive: the current FPSIMD/SVE state may be read from\n>    them, but they must not be written.\n>  \n> The FPSIMD/SVE backing memory is selected by TIF_SVE:\n> \n>  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are\n>    stored in task->thread.sve_state, formatted appropriately for vector\n>    length task->thread.sve_vl.  task->thread.sve_state must point to a\n>    valid buffer at least sve_state_size(task) bytes in size.\n> \n>  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are\n>    logically zero[*] but not stored anywhere; Pn, FFR are not stored and\n>    have unspecified values from userspace's point of view.\n>    task->thread.sve_state does not need to be non-null, valid or any\n>    particular size: it must not be dereferenced.\n> \n>    In practice I don't exploit the \"unspecifiedness\" much.  The Zn high\n>    bits, Pn and FFR are all zeroed when setting TIF_SVE again:\n>    sve_alloc() is the common path for this.\n> \n>  * FPSR and FPCR are always stored in task->fpsimd_state irrespctive of\n>    whether TIF_SVE is clear or set, since these are not vector length\n>    dependent.\n\nThis looks fine. I think we need to make sure (with a warning) that\ntask_fpsimd_save() (and probably load) is always called in a\nnon-preemptible context. \n\n> [*] theoretically unspecified, which is what I've written in sve.txt.\n> However, on any FPSIMD Vn write the upper bits of the corresponding Zn\n> must become logically zero in order to conform to the SVE programmer's\n> model.  It's not feasible to track which Vn have been written\n> individually because that would involve trapping every FPSIMD insn until\n> all possible Vn have been written.  So the kernel ensures that the Zn\n> high bits become zero.\n> \n> Maybe we should just guarantee \"zero-or-preserved\" behaviour for\n> userspace.  This may close down some future optimisation opportunities,\n> but maybe it's better to be simple.\n\nDoes it work if we leave it as \"unspecified\" in the document but just do\nzero-or-preserved in the kernel code?\n\nJust wondering, as an optimisation for do_sve_acc() - instead of\nsve_alloc() and fpsimd_to_sve(), can we not force the loading of the\nFPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would\nensure the zeroing of the top SVE bits and we only need to allocate the\nSVE state on the saving path. This means enabling SVE for user and\nsetting TIF_SVE without having the backing storage allocated.","headers":{"Return-Path":"<libc-alpha-return-85419-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85419-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"sClVTN2a\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y79Wk0K9Rz9t2h\n\tfor <incoming@patchwork.ozlabs.org>;\n\tThu,  5 Oct 2017 22:28:49 +1100 (AEDT)","(qmail 1837 invoked by alias); 5 Oct 2017 11:28:43 -0000","(qmail 1819 invoked by uid 89); 5 Oct 2017 11:28:42 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=xG0A\n\tKvmIGn2mUErDn5hLUzrjTRz1pnucvnWErcKxdN1TJSn0nw1EfehMYJTsWHhSWXkX\n\tORTvrJvZv9J0pfehztdx4uVNpbcNHIJPV8PTwSUJ/j/1THK3S7w40u8bd847SFtI\n\tqwzZkViY9imCnXhlVqdBbpLkCD98ZEtKIFtH0GA=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=qOG9WzhfAD\n\tBiwWEAyhjGXAp1NWw=; b=sClVTN2a6JHOMmSVdVxJ1jVE7KMnj9sZExB5Z1goVk\n\tvlSOVN4fyhg8+TrSBaD6gaSB5LUptx5HTqewFttWCOwK0HB1sqb/dzy5yvbkgcZm\n\tiEK4S8ZpzL282GjrO1QNtUUbPAA3ALMreltziET/Pak7/W0/+4PZSawiPhUS7U+d\n\tk=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2\n\tspammy=HX-Envelope-From:sk:catalin","X-HELO":"foss.arm.com","Date":"Thu, 5 Oct 2017 12:28:35 +0100","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20171003113302.GR3611@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20171003113302.GR3611@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1781620,"web_url":"http://patchwork.ozlabs.org/comment/1781620/","msgid":"<20171006131005.GA3611@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-06T13:10:09","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:\n> On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:\n> > On Wed, Sep 13, 2017 at 07:33:25AM -0700, Catalin Marinas wrote:\n> > > On Thu, Aug 31, 2017 at 06:00:43PM +0100, Dave P Martin wrote:\n> > > > +/*\n> > > > + * Handle SVE state across fork():\n> > > > + *\n> > > > + * dst and src must not end up with aliases of the same sve_state.\n> > > > + * Because a task cannot fork except in a syscall, we can discard SVE\n> > > > + * state for dst here, so long as we take care to retain the FPSIMD\n> > > > + * subset of the state if SVE is in use.  Reallocation of the SVE state\n> > > > + * will be deferred until dst tries to use SVE.\n> > > > + */\n> > > > +void fpsimd_dup_sve(struct task_struct *dst, struct task_struct const *src)\n> > > > +{\n> > > > +\tif (test_and_clear_tsk_thread_flag(dst, TIF_SVE)) {\n> > > > +\t\tWARN_ON(dst->mm && !in_syscall(task_pt_regs(dst)));\n> > > > +\t\tsve_to_fpsimd(dst);\n> > > > +\t}\n> > > > +\n> > > > +\tdst->thread.sve_state = NULL;\n> > > > +}\n> > > \n> > > I first thought the thread flags are not visible in dst yet since\n> > > dup_task_struct() calls arch_dup_task_struct() before\n> > > setup_thread_stack(). However, at the end of the last year we enabled\n> > > CONFIG_THREAD_INFO_IN_TASK_STRUCT. But I don't particularly like relying\n> > > on this.\n> > > \n> > > Anyway, IIUC we don't need sve_to_fpsimd() here. The\n> > > arch_dup_task_struct() already called fpsimd_preserve_current_state()\n> > > for src, so the FPSIMD state (which we care about) is transferred during\n> > > the *dst = *src assignment. So you'd only need the last statement,\n> > > possibly with a different function name like fpsimd_erase_sve (and maybe\n> > > make the function static inline in the header).\n> > \n> > Regarding the intended meanings of the thread flags, does this help?\n> > \n> > --8<--\n> > \n> > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise\n> > unchanged:\n> > \n> >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU\n> >    registers of the CPU the task is running on contain the authoritative\n> >    FPSIMD/SVE state of the task.  The backing memory may be stale.\n> > \n> >  * Otherwise (i.e., task not running, or task running and\n> >    TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is\n> >    authoritative.  If additionally per_cpu(fpsimd_last_state,\n> >    task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then\n> >    task->fpsimd_state.cpu's registers are also up to date for task, but\n> >    not authorititive: the current FPSIMD/SVE state may be read from\n> >    them, but they must not be written.\n> >  \n> > The FPSIMD/SVE backing memory is selected by TIF_SVE:\n> > \n> >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are\n> >    stored in task->thread.sve_state, formatted appropriately for vector\n> >    length task->thread.sve_vl.  task->thread.sve_state must point to a\n> >    valid buffer at least sve_state_size(task) bytes in size.\n> > \n> >  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are\n> >    logically zero[*] but not stored anywhere; Pn, FFR are not stored and\n> >    have unspecified values from userspace's point of view.\n> >    task->thread.sve_state does not need to be non-null, valid or any\n> >    particular size: it must not be dereferenced.\n> > \n> >    In practice I don't exploit the \"unspecifiedness\" much.  The Zn high\n> >    bits, Pn and FFR are all zeroed when setting TIF_SVE again:\n> >    sve_alloc() is the common path for this.\n> > \n> >  * FPSR and FPCR are always stored in task->fpsimd_state irrespctive of\n> >    whether TIF_SVE is clear or set, since these are not vector length\n> >    dependent.\n> \n> This looks fine. I think we need to make sure (with a warning) that\n> task_fpsimd_save() (and probably load) is always called in a\n> non-preemptible context. \n> \n> > [*] theoretically unspecified, which is what I've written in sve.txt.\n> > However, on any FPSIMD Vn write the upper bits of the corresponding Zn\n> > must become logically zero in order to conform to the SVE programmer's\n> > model.  It's not feasible to track which Vn have been written\n> > individually because that would involve trapping every FPSIMD insn until\n> > all possible Vn have been written.  So the kernel ensures that the Zn\n> > high bits become zero.\n> > \n> > Maybe we should just guarantee \"zero-or-preserved\" behaviour for\n> > userspace.  This may close down some future optimisation opportunities,\n> > but maybe it's better to be simple.\n> \n> Does it work if we leave it as \"unspecified\" in the document but just do\n> zero-or-preserved in the kernel code?\n\nSure, that's the behaviour today in effect.\n\nI'll leave the documentation unchanged, then we can take advantage of\nthis flexibility later if is proves to be useful.\n\n> Just wondering, as an optimisation for do_sve_acc() - instead of\n> sve_alloc() and fpsimd_to_sve(), can we not force the loading of the\n> FPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would\n> ensure the zeroing of the top SVE bits and we only need to allocate the\n> SVE state on the saving path. This means enabling SVE for user and\n> setting TIF_SVE without having the backing storage allocated.\n\nCurrently the set of places where the \"TIF_SVE implies sve_state valid\"\nassumption is applied is not very constrained, so while your suggestion\nis reasonable I'd rather not mess with it just now, if possible.\n\n\nBut we can do this (which is what my current fixup has):\n\nel0_sve_acc:\n\tenable_dbg_and_irq\n\t// ...\n\tbl\tdo_sve_acc\n\tb\tret_to_user\n\nvoid do_sve_acc(unsigned int esr, struct pt_regs *regs)\n{\n\t/* Even if we chose not to use SVE, the hardware could still trap: */\n\tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n\t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n\t\treturn;\n\t}\n\n\tsve_alloc(current);\n\n\tlocal_bh_disable();\n\tif (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {\n\t\ttask_fpsimd_load(); /* flushes high Zn bits as a side-effect */\n\t\tsve_flush_pregs();\n\t} else {\n\t\tsve_flush_all(); /* flush all the SVE bits in-place */\n\t}\n\n\tif (test_and_set_thread_flag(TIF_SVE))\n\t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n\tlocal_bh_enable();\n}\n\nwhere sve_flush_all() zeroes all the high Zn bits via a series of\nMOV Vn, Vn instructions, and also zeroes Pn and FFR.  sve_fplush_pregs()\njust does the latter.\n\nIn the common case the sve_alloc() does a memzero, but doesn't\nallocate memory because more often than not, it will already have\nbeen allocated.\n\nOn this path, the memzero is now pointless -- I'll need to double-check\nwhat else may be impacted by its removal (probably only ptrace, and I'm\nnot sure if the zeroing is strictly required there).\n\n\nThere is still a bit of room for improvement, but it's still a step up\nfrom v2.\n\nWhat do you think?\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-85504-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85504-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"XO1Xw23H\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y7qkS06TDz9t2f\n\tfor <incoming@patchwork.ozlabs.org>;\n\tSat,  7 Oct 2017 00:10:23 +1100 (AEDT)","(qmail 37979 invoked by alias); 6 Oct 2017 13:10:18 -0000","(qmail 37963 invoked by uid 89); 6 Oct 2017 13:10:17 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=qqsx\n\tAqMNHW8naLxNP4erisY9N47HMilf4oeSKspAMorvzI7Taz+oXPaQhjkuWqHP4wWP\n\tmCGWLbKuixVMD1G//3J+ALJ7PgmBM1i9zYGSiimIH3miauKMTR+927MCaan9jhtf\n\tWSe+O5296HTx7RqHb/4YpkwVGWuz46zvQ6b1LZo=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=ML6X/DSBB3\n\toR7+6DXfXQNdvCMZQ=; b=XO1Xw23HPP7aH/AT0fTcgxz37pNlKjYeGl0wA5bIak\n\t+CiCZwMUBYLELU4eyT00dGNKM850k5Md1mGS/PaOZXuM9/x++/Wu8cIN4TfQoZYz\n\tnLJhVIatgOI97ENYAWqFyskXqE0RbbRLAUmbTBGxkcftOfQ3v0bHTwoDAVhlxyrR\n\t8=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"foss.arm.com","Date":"Fri, 6 Oct 2017 14:10:09 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171006131005.GA3611@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20171003113302.GR3611@e103592.cambridge.arm.com>\n\t<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1781654,"web_url":"http://patchwork.ozlabs.org/comment/1781654/","msgid":"<20171006133639.ialgglabo7yca5bm@armageddon.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-06T13:36:40","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Fri, Oct 06, 2017 at 02:10:09PM +0100, Dave P Martin wrote:\n> On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:\n> > On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:\n> > > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise\n> > > unchanged:\n> > > \n> > >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU\n> > >    registers of the CPU the task is running on contain the authoritative\n> > >    FPSIMD/SVE state of the task.  The backing memory may be stale.\n> > > \n> > >  * Otherwise (i.e., task not running, or task running and\n> > >    TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is\n> > >    authoritative.  If additionally per_cpu(fpsimd_last_state,\n> > >    task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then\n> > >    task->fpsimd_state.cpu's registers are also up to date for task, but\n> > >    not authorititive: the current FPSIMD/SVE state may be read from\n> > >    them, but they must not be written.\n> > >  \n> > > The FPSIMD/SVE backing memory is selected by TIF_SVE:\n> > > \n> > >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are\n> > >    stored in task->thread.sve_state, formatted appropriately for vector\n> > >    length task->thread.sve_vl.  task->thread.sve_state must point to a\n> > >    valid buffer at least sve_state_size(task) bytes in size.\n\n\"Zn [...] stored in  task->thread.sve_state\" - is this still true with\nthe changes you proposed? I guess even without these changes, you have\nsituations where the hardware regs are out of sync with sve_state (see\nmore below).\n\n> > >  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are\n> > >    logically zero[*] but not stored anywhere; Pn, FFR are not stored and\n> > >    have unspecified values from userspace's point of view.\n> > >    task->thread.sve_state does not need to be non-null, valid or any\n> > >    particular size: it must not be dereferenced.\n> > > \n> > >    In practice I don't exploit the \"unspecifiedness\" much.  The Zn high\n> > >    bits, Pn and FFR are all zeroed when setting TIF_SVE again:\n> > >    sve_alloc() is the common path for this.\n> > > \n> > >  * FPSR and FPCR are always stored in task->fpsimd_state irrespctive of\n> > >    whether TIF_SVE is clear or set, since these are not vector length\n> > >    dependent.\n[...]\n> > Just wondering, as an optimisation for do_sve_acc() - instead of\n> > sve_alloc() and fpsimd_to_sve(), can we not force the loading of the\n> > FPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would\n> > ensure the zeroing of the top SVE bits and we only need to allocate the\n> > SVE state on the saving path. This means enabling SVE for user and\n> > setting TIF_SVE without having the backing storage allocated.\n> \n> Currently the set of places where the \"TIF_SVE implies sve_state valid\"\n> assumption is applied is not very constrained, so while your suggestion\n> is reasonable I'd rather not mess with it just now, if possible.\n> \n> \n> But we can do this (which is what my current fixup has):\n> \n> el0_sve_acc:\n> \tenable_dbg_and_irq\n> \t// ...\n> \tbl\tdo_sve_acc\n> \tb\tret_to_user\n> \n> void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> {\n> \t/* Even if we chose not to use SVE, the hardware could still trap: */\n> \tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> \t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> \t\treturn;\n> \t}\n> \n> \tsve_alloc(current);\n> \n> \tlocal_bh_disable();\n> \tif (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {\n> \t\ttask_fpsimd_load(); /* flushes high Zn bits as a side-effect */\n> \t\tsve_flush_pregs();\n> \t} else {\n> \t\tsve_flush_all(); /* flush all the SVE bits in-place */\n> \t}\n> \n> \tif (test_and_set_thread_flag(TIF_SVE))\n> \t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> \tlocal_bh_enable();\n> }\n> \n> where sve_flush_all() zeroes all the high Zn bits via a series of\n> MOV Vn, Vn instructions, and also zeroes Pn and FFR.  sve_fplush_pregs()\n> just does the latter.\n\nThis looks fine to me but I added a comment above. IIUC, we can now have\nTIF_SVE set while sve_state contains stale data. I don't see an issue\ngiven that every time you enter the kernel from user space you have\nTIF_SVE set and the sve_state storage out of sync. Maybe tweak the\nTIF_SVE description above slightly.","headers":{"Return-Path":"<libc-alpha-return-85505-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85505-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"SvoHohC5\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y7rK20rVZz9s7M\n\tfor <incoming@patchwork.ozlabs.org>;\n\tSat,  7 Oct 2017 00:36:53 +1100 (AEDT)","(qmail 94815 invoked by alias); 6 Oct 2017 13:36:48 -0000","(qmail 94802 invoked by uid 89); 6 Oct 2017 13:36:47 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=AlnT\n\tl+4bBoO8RVdXkx7IuxjqBmGJEb2V+VMrRIxptFxcCw9rGXWSMhEYhp3Qt6+3SXuP\n\t4Z1ozjA0W1Cg1ZcwnwA+0+ZEI7RTrDRBdM0UvKp5SR7fOcim3K3dWwM291tkHCwY\n\tOM24ibb39koFSvJf5u03yWOW/em1G0UeHtmXy9s=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=z+N/VHfGyV\n\tIolG5ngt74HQ9FzkA=; b=SvoHohC5JVFMzayqadBgc5lk/ty1pnHJqdY+RAmgPt\n\tiRH8iTdYnLee2kK8DTR37CW90TWjEDk2qpEeOqA4JN0NjF1dbUD3tjX7T1s4dXPf\n\tbFVs8Sgvx1Hd4ec/W7Jicl+wL2ZClIj2krDEgNco41/6gbtrDhdSd50J/spjGmlQ\n\t0=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"foss.arm.com","Date":"Fri, 6 Oct 2017 14:36:40 +0100","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171006133639.ialgglabo7yca5bm@armageddon.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20171003113302.GR3611@e103592.cambridge.arm.com>\n\t<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>\n\t<20171006131005.GA3611@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20171006131005.GA3611@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}},{"id":1781719,"web_url":"http://patchwork.ozlabs.org/comment/1781719/","msgid":"<20171006151528.GB3611@e103592.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-06T15:15:28","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":26612,"url":"http://patchwork.ozlabs.org/api/people/26612/","name":"Dave Martin","email":"Dave.Martin@arm.com"},"content":"On Fri, Oct 06, 2017 at 02:36:40PM +0100, Catalin Marinas wrote:\n> On Fri, Oct 06, 2017 at 02:10:09PM +0100, Dave P Martin wrote:\n> > On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:\n> > > On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:\n> > > > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise\n> > > > unchanged:\n> > > > \n> > > >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU\n> > > >    registers of the CPU the task is running on contain the authoritative\n> > > >    FPSIMD/SVE state of the task.  The backing memory may be stale.\n> > > > \n> > > >  * Otherwise (i.e., task not running, or task running and\n> > > >    TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is\n> > > >    authoritative.  If additionally per_cpu(fpsimd_last_state,\n> > > >    task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then\n> > > >    task->fpsimd_state.cpu's registers are also up to date for task, but\n> > > >    not authorititive: the current FPSIMD/SVE state may be read from\n> > > >    them, but they must not be written.\n> > > >  \n> > > > The FPSIMD/SVE backing memory is selected by TIF_SVE:\n> > > > \n> > > >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are\n> > > >    stored in task->thread.sve_state, formatted appropriately for vector\n> > > >    length task->thread.sve_vl.  task->thread.sve_state must point to a\n> > > >    valid buffer at least sve_state_size(task) bytes in size.\n> \n> \"Zn [...] stored in  task->thread.sve_state\" - is this still true with\n> the changes you proposed? I guess even without these changes, you have\n> situations where the hardware regs are out of sync with sve_state (see\n> more below).\n\nI guess I need to tweak the wording here.\n\nTIF_SVE says where the vector state should be loaded/stored from,\nbut does not say whether the data is up to date in memory, or when\nit should be loaded/stored.\n\nThe latter is described by a cocktail of different things including\nwhich bit of kernel code we are executing if any, whether the task\nis running/stopped etc., TIF_FOREIGN_FPSTATE,\ntask->thread.fpsimd_state.cpu and per_cpu(fpsimd_last_state).\n\n\nDoes this make any better sense of my code below?\n\n> \n> > > >  * TIF_SVE clear: Vn are stored in task->fpsimd_state; Zn[max : 128] are\n> > > >    logically zero[*] but not stored anywhere; Pn, FFR are not stored and\n> > > >    have unspecified values from userspace's point of view.\n> > > >    task->thread.sve_state does not need to be non-null, valid or any\n> > > >    particular size: it must not be dereferenced.\n> > > > \n> > > >    In practice I don't exploit the \"unspecifiedness\" much.  The Zn high\n> > > >    bits, Pn and FFR are all zeroed when setting TIF_SVE again:\n> > > >    sve_alloc() is the common path for this.\n> > > > \n> > > >  * FPSR and FPCR are always stored in task->fpsimd_state irrespctive of\n> > > >    whether TIF_SVE is clear or set, since these are not vector length\n> > > >    dependent.\n> [...]\n> > > Just wondering, as an optimisation for do_sve_acc() - instead of\n> > > sve_alloc() and fpsimd_to_sve(), can we not force the loading of the\n> > > FPSIMD regs on the return to user via TIF_FOREIGN_FPSTATE? This would\n> > > ensure the zeroing of the top SVE bits and we only need to allocate the\n> > > SVE state on the saving path. This means enabling SVE for user and\n> > > setting TIF_SVE without having the backing storage allocated.\n> > \n> > Currently the set of places where the \"TIF_SVE implies sve_state valid\"\n> > assumption is applied is not very constrained, so while your suggestion\n> > is reasonable I'd rather not mess with it just now, if possible.\n> > \n> > \n> > But we can do this (which is what my current fixup has):\n> > \n> > el0_sve_acc:\n> > \tenable_dbg_and_irq\n> > \t// ...\n> > \tbl\tdo_sve_acc\n> > \tb\tret_to_user\n> > \n> > void do_sve_acc(unsigned int esr, struct pt_regs *regs)\n> > {\n> > \t/* Even if we chose not to use SVE, the hardware could still trap: */\n> > \tif (unlikely(!system_supports_sve()) || WARN_ON(is_compat_task())) {\n> > \t\tforce_signal_inject(SIGILL, ILL_ILLOPC, regs, 0);\n> > \t\treturn;\n> > \t}\n> > \n> > \tsve_alloc(current);\n> > \n> > \tlocal_bh_disable();\n> > \tif (test_and_clear_thread_flag(TIF_FOREIGN_FPSTATE)) {\n> > \t\ttask_fpsimd_load(); /* flushes high Zn bits as a side-effect */\n> > \t\tsve_flush_pregs();\n> > \t} else {\n> > \t\tsve_flush_all(); /* flush all the SVE bits in-place */\n> > \t}\n> > \n> > \tif (test_and_set_thread_flag(TIF_SVE))\n> > \t\tWARN_ON(1); /* SVE access shouldn't have trapped */\n> > \tlocal_bh_enable();\n> > }\n> > \n> > where sve_flush_all() zeroes all the high Zn bits via a series of\n> > MOV Vn, Vn instructions, and also zeroes Pn and FFR.  sve_fplush_pregs()\n> > just does the latter.\n> \n> This looks fine to me but I added a comment above. IIUC, we can now have\n> TIF_SVE set while sve_state contains stale data. I don't see an issue\n> given that every time you enter the kernel from user space you have\n> TIF_SVE set and the sve_state storage out of sync. Maybe tweak the\n> TIF_SVE description above slightly.\n> \n\nSee my comment above ... any better?\n\nIf so, I'll paste some of that explanatory text into fpsimd.c (in lieu\nof a better place to put it).\n\nCheers\n---Dave","headers":{"Return-Path":"<libc-alpha-return-85511-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85511-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"cvlHDs8K\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y7tW458fSz9t3m\n\tfor <incoming@patchwork.ozlabs.org>;\n\tSat,  7 Oct 2017 02:15:44 +1100 (AEDT)","(qmail 120644 invoked by alias); 6 Oct 2017 15:15:36 -0000","(qmail 120630 invoked by uid 89); 6 Oct 2017 15:15:36 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=w62D\n\twxgNmpVKofuho45qdv/xRjneHXkzPzE4c5act5omqSnAh5+orq085KHs8u1yN3Xh\n\tQC+yWDnoCQAiqFZ2IygU1VdKCSA18YQpClWnaUwxMvqmLpGwswLSVZEoRs63uv3I\n\tnkBc5WfIYwzKVx/NcXyGdmMxpvr+mc1lnBl49Ok=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=11FpwpTc2t\n\th3xXy3ssN8gJcaTfw=; b=cvlHDs8KgfXYx+AFnGXLJn7Xyoytcy9IpPo6atgYSF\n\tQppjUQFbc0qctUQMHiSkhHFQAgajuQ5lZpNu6CuWC9sGfmRfDxUMX8KVRW1QvcsS\n\tAMU6LEtJ7mgb/lcdB7LrqqwZ5dPERZE9jxRAZvW4gC8kxOAn3qt/X1u/l//saJoB\n\tA=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=lieu","X-HELO":"foss.arm.com","Date":"Fri, 6 Oct 2017 16:15:28 +0100","From":"Dave Martin <Dave.Martin@arm.com>","To":"Catalin Marinas <catalin.marinas@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171006151528.GB3611@e103592.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20171003113302.GR3611@e103592.cambridge.arm.com>\n\t<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>\n\t<20171006131005.GA3611@e103592.cambridge.arm.com>\n\t<20171006133639.ialgglabo7yca5bm@armageddon.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20171006133639.ialgglabo7yca5bm@armageddon.cambridge.arm.com>","User-Agent":"Mutt/1.5.23 (2014-03-12)"}},{"id":1781732,"web_url":"http://patchwork.ozlabs.org/comment/1781732/","msgid":"<20171006153312.cushhcb4m2qc4kwd@armageddon.cambridge.arm.com>","list_archive_url":null,"date":"2017-10-06T15:33:13","subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","submitter":{"id":938,"url":"http://patchwork.ozlabs.org/api/people/938/","name":"Catalin Marinas","email":"catalin.marinas@arm.com"},"content":"On Fri, Oct 06, 2017 at 04:15:28PM +0100, Dave P Martin wrote:\n> On Fri, Oct 06, 2017 at 02:36:40PM +0100, Catalin Marinas wrote:\n> > On Fri, Oct 06, 2017 at 02:10:09PM +0100, Dave P Martin wrote:\n> > > On Thu, Oct 05, 2017 at 12:28:35PM +0100, Catalin Marinas wrote:\n> > > > On Tue, Oct 03, 2017 at 12:33:03PM +0100, Dave P Martin wrote:\n> > > > > TIF_FOREIGN_FPSTATE's meaning is expanded to cover SVE, but otherwise\n> > > > > unchanged:\n> > > > > \n> > > > >  * If a task is running and !TIF_FOREIGN_FPSTATE, then the the CPU\n> > > > >    registers of the CPU the task is running on contain the authoritative\n> > > > >    FPSIMD/SVE state of the task.  The backing memory may be stale.\n> > > > > \n> > > > >  * Otherwise (i.e., task not running, or task running and\n> > > > >    TIF_FOREIGN_FPSTATE), the task's FPSIMD/SVE backing memory is\n> > > > >    authoritative.  If additionally per_cpu(fpsimd_last_state,\n> > > > >    task->fpsimd_state.cpu) == &task->fpsimd_state.cpu, then\n> > > > >    task->fpsimd_state.cpu's registers are also up to date for task, but\n> > > > >    not authorititive: the current FPSIMD/SVE state may be read from\n> > > > >    them, but they must not be written.\n> > > > >  \n> > > > > The FPSIMD/SVE backing memory is selected by TIF_SVE:\n> > > > > \n> > > > >  * TIF_SVE set: Zn (incorporating Vn in bits[127:0]), Pn and FFR are\n> > > > >    stored in task->thread.sve_state, formatted appropriately for vector\n> > > > >    length task->thread.sve_vl.  task->thread.sve_state must point to a\n> > > > >    valid buffer at least sve_state_size(task) bytes in size.\n> > \n> > \"Zn [...] stored in  task->thread.sve_state\" - is this still true with\n> > the changes you proposed? I guess even without these changes, you have\n> > situations where the hardware regs are out of sync with sve_state (see\n> > more below).\n> \n> I guess I need to tweak the wording here.\n> \n> TIF_SVE says where the vector state should be loaded/stored from,\n> but does not say whether the data is up to date in memory, or when\n> it should be loaded/stored.\n> \n> The latter is described by a cocktail of different things including\n> which bit of kernel code we are executing if any, whether the task\n> is running/stopped etc., TIF_FOREIGN_FPSTATE,\n> task->thread.fpsimd_state.cpu and per_cpu(fpsimd_last_state).\n> \n> Does this make any better sense of my code below?\n\nYes, it looks fine.","headers":{"Return-Path":"<libc-alpha-return-85512-incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list libc-alpha@sourceware.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=sourceware.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=libc-alpha-return-85512-incoming=patchwork.ozlabs.org@sourceware.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tsecure) header.d=sourceware.org header.i=@sourceware.org\n\theader.b=\"O4Os6oju\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3y7tvX2lFTz9t3t\n\tfor <incoming@patchwork.ozlabs.org>;\n\tSat,  7 Oct 2017 02:33:28 +1100 (AEDT)","(qmail 85938 invoked by alias); 6 Oct 2017 15:33:20 -0000","(qmail 85924 invoked by uid 89); 6 Oct 2017 15:33:20 -0000"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; q=dns; s=default; b=yrk/\n\tEKyf909oO8p0ITVetoYe2xsDgKAPsnm0EuZ8DBkZhVVDB2WRhPY2i++zGtmFBcx8\n\t55oBbLWO2nfVXKfRCEveoJpArMf9lR7bipjPKNPWrJv3A5r8AkGdXc4hKXxtRd5X\n\th9pWHsJ/Ju8+nWzET2K41EyvIZWLJwA3A18ZEYA=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id\n\t:list-unsubscribe:list-subscribe:list-archive:list-post\n\t:list-help:sender:date:from:to:cc:subject:message-id:references\n\t:mime-version:content-type:in-reply-to; s=default; bh=hnqjoT61M7\n\tL6lypxPuGN80nzLxI=; b=O4Os6oju9zwBRHqTS1OGyPTbTRNYcP5JkRg1u2kT+2\n\tB6fI+chJBKpJOsDiDB0YGfjsve0L0A7nWq1V5xfwAMx2OeuBWRhPePIO88LSZhzU\n\ts1k1ksZrI4t7hZ1D4siZkgrshnNrHX3SP3njuwS7GU3PtCcSLLXYSigT+yxQj5ms\n\tY=","Mailing-List":"contact libc-alpha-help@sourceware.org; run by ezmlm","Precedence":"bulk","List-Id":"<libc-alpha.sourceware.org>","List-Unsubscribe":"<mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>","List-Subscribe":"<mailto:libc-alpha-subscribe@sourceware.org>","List-Archive":"<http://sourceware.org/ml/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-help@sourceware.org>,\n\t<http://sourceware.org/ml/#faqs>","Sender":"libc-alpha-owner@sourceware.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-1.9 required=5.0 tests=BAYES_00,\n\tRP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"foss.arm.com","Date":"Fri, 6 Oct 2017 16:33:13 +0100","From":"Catalin Marinas <catalin.marinas@arm.com>","To":"Dave Martin <Dave.Martin@arm.com>","Cc":"linux-arch@vger.kernel.org, libc-alpha@sourceware.org, Ard Biesheuvel\n\t<ard.biesheuvel@linaro.org>, \tSzabolcs Nagy <szabolcs.nagy@arm.com>,\n\tRichard Sandiford <richard.sandiford@arm.com>, \n\tWill Deacon <will.deacon@arm.com>, Alex =?iso-8859-1?q?Benn=E9e?=\n\t<alex.bennee@linaro.org>, kvmarm@lists.cs.columbia.edu,\n\tlinux-arm-kernel@lists.infradead.org","Subject":"Re: [PATCH v2 11/28] arm64/sve: Core task context handling","Message-ID":"<20171006153312.cushhcb4m2qc4kwd@armageddon.cambridge.arm.com>","References":"<1504198860-12951-1-git-send-email-Dave.Martin@arm.com>\n\t<1504198860-12951-12-git-send-email-Dave.Martin@arm.com>\n\t<20170913143325.hi4cfcajbts3bbao@localhost>\n\t<20171003113302.GR3611@e103592.cambridge.arm.com>\n\t<20171005112835.2rhaqx23bhe6tnwl@armageddon.cambridge.arm.com>\n\t<20171006131005.GA3611@e103592.cambridge.arm.com>\n\t<20171006133639.ialgglabo7yca5bm@armageddon.cambridge.arm.com>\n\t<20171006151528.GB3611@e103592.cambridge.arm.com>","MIME-Version":"1.0","Content-Type":"text/plain; charset=us-ascii","Content-Disposition":"inline","In-Reply-To":"<20171006151528.GB3611@e103592.cambridge.arm.com>","User-Agent":"NeoMutt/20170113 (1.7.2)"}}]