From patchwork Tue Jan 30 14:31:29 2024
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 1893009
From: Andre Vieira
To: gcc-patches@gcc.gnu.org
Cc: Richard.Sandiford@arm.com, rguenther@suse.de, Andre Vieira
Subject: [PATCH 0/3] vect, aarch64: Add SVE support for simdclones
Date: Tue, 30 Jan 2024 14:31:29 +0000
Message-Id: <20240130143132.9575-1-andre.simoesdiasvieira@arm.com>

Hi,

This patch series is a set of patches that I have sent up for review before. It enables initial support for SVE simd
clones with some caveats.

Caveat 1: we do not support SVE simd clones with function bodies. To enable support for this we need to change the way we 'simdify' a function body. For each argument that maps to a vector, an array of 'simdlen' elements is created. This, however, does not work for a VLA simdlen. We will need to come up with a way to support this such that the generated code is performant; there's little reason to 'simdify' a function by generating really slow code. I have some ideas on how we might be able to do this, though I'm not convinced it's even worth trying, but I think that's a bigger discussion. For now I've disabled generating SVE simdclones for functions with function bodies. This still fits our libmvec use case, as the simd clones are handwritten using intrinsics in glibc.

Caveat 2: we cannot generate ncopies calls to an SVE simd clone. When I first sent the second patch of this series upstream, Richi asked me to look at supporting ncopies calls to VLA simdlen simd clones. I have vectorizer code to do this; however, I found that we don't yet have enough backend support for indexing VLA vectors to make it work. I think that's something that will need to wait until gcc 15, so for now I'd simply reject vectorization where that is required.

Caveat 3: we don't yet support SVE simdclones for VLS codegen. We've disabled the use of SVE simdclones when the -msve-vector-bits option is used to request VLS codegen. We need this because the mangling is determined by the 'simdlen' of a simd clone, which will not be VLA when -msve-vector-bits is passed. We would like to support using VLA simd clones when generating VLS code, but for that to work right now we'd need to set the simdlen of the simd clone to the VLS value, and that messes up the mangling. In the future we will need to add a target hook to specify the mangling.
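To make Caveat 1 a bit more concrete, here is a rough C model of the shape a 'simdified' body takes for a fixed simdlen. This is only a sketch: SIMDLEN, vec_f32, scalar_body and simd_clone_fixed are hypothetical names made up for illustration, and the real transformation operates on GCC's internal representation rather than on C source.

```c
#include <stddef.h>

/* Hypothetical fixed simdlen; a VLA (SVE) simdlen has no such
   compile-time constant.  */
#define SIMDLEN 4

/* Stand-in for a vector argument or return value: SIMDLEN float lanes.  */
typedef struct { float lane[SIMDLEN]; } vec_f32;

/* The scalar function the clone is derived from (illustrative only).  */
static float scalar_body(float x)
{
    return 2.0f * x + 1.0f;
}

/* Conceptual shape of a 'simdified' body for a fixed simdlen.  */
static vec_f32 simd_clone_fixed(vec_f32 arg)
{
    float x_arr[SIMDLEN];    /* one array per vector argument, sized by simdlen */
    float ret_arr[SIMDLEN];  /* and one for the return value */
    vec_f32 ret;

    /* Spill the vector argument into its array.  */
    for (size_t i = 0; i < SIMDLEN; i++)
        x_arr[i] = arg.lane[i];

    /* Run the original scalar body once per lane.  */
    for (size_t i = 0; i < SIMDLEN; i++)
        ret_arr[i] = scalar_body(x_arr[i]);

    /* Gather the per-lane results back into the vector return value.  */
    for (size_t i = 0; i < SIMDLEN; i++)
        ret.lane[i] = ret_arr[i];

    return ret;
}
```

The point of the sketch is that both arrays are sized by a compile-time constant. With a VLA simdlen there is no such constant to size them with, which is why this scheme does not carry over to SVE as-is.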

Given that the target-agnostic changes are minimal, have been suggested before and have no impact on other targets, and that the target-specific parts have been reviewed before, would this still be acceptable for Stage 4? I would really like to make use of the work that was done to support this and of the SVE simdclones added to glibc.

Kind regards,
Andre

Andre Vieira (3):
  vect: Pass stmt_vec_info to TARGET_SIMD_CLONE_USABLE
  vect: disable multiple calls of poly simdclones
  aarch64: Add SVE support for simd clones [PR 96342]