From patchwork Tue Feb 7 00:16:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738533 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=D1Nq7xuv; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kH30cG2z23jB for ; Tue, 7 Feb 2023 11:17:51 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0AC613857B9B for ; Tue, 7 Feb 2023 00:17:49 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by sourceware.org (Postfix) with ESMTPS id 164923858D35 for ; Tue, 7 Feb 2023 00:16:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 164923858D35 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x32f.google.com with SMTP id bg26so9932492wmb.0 for ; Mon, 06 Feb 2023 16:16:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8iDBnWQl9q8DHc5u8j+HZPVJRKs/uVA6CFhCy0D+JC0=; b=D1Nq7xuvOFJQdZtlLUqqSEkLjh1T0BpdtneMDuuqBzrzJZd8ELENHUFgXMzu4CZeVq tp5LExVO7Cdg0kwm+/ZmLNcD7VFgqCHWYyfmiQyZzGGKD6WYo/y7kPSD5Cd2usdHTUUq cy1sn5VXp24VkDEr5eOsF44YoSVi/wTgQlna+uhWvcJiXofv/jkhsM1GNy6QOYTH4UKZ noo4uyabcQKVCZ2mARvZNKymPaaezW641SmSRoqK2Wup0WwFi6U0R0lsBGiZsxeElvAx RDmTSviRFjLgdIxCgrYyIOwfswvLfImkJgMsZTrly5yYFhD9378Ozt51ylrmOOVjqvyH KFmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8iDBnWQl9q8DHc5u8j+HZPVJRKs/uVA6CFhCy0D+JC0=; b=pSKHM02T1/R4nd6dsqoAd95c4hFY4aLLxUkWGtyPiryIRWzyvFc06Sli8g2O8f7vSd QUuCUjt+qto1fRpeLqkkjtmRJMr4SdccWBo6m9E473S0m1fOQtuWVNaQkV2B1ImbCnur CnIwzFEAeYExxrw+PXqELdfAvZcZCFXPl7UAroM9pSYxbnc0qrYgFhGPVM4X4a98aIcz h7ML9dNfm9HtGS8fmGezR9YhXnYyA69dEQZlHTJrXT2wlU81ODuKz3ciiosih12gBs4Q ULX860SBEZGfsY+mCt6oHXAP/Kf+dsb2IsVcYepCdU3EbmOTIJtyoldDT+kiyp5QzeLX /3pA== X-Gm-Message-State: AO0yUKUBRg5PQQ5fiZXSwii47xPCFQvCczcr7HqnaaCNL602U4Uj3B0u AUvAYF82ny4w7qf8U5Mvjn/3WtuLHrGSgA1w X-Google-Smtp-Source: AK7set8WBoW7WsPNi2OldKZOI8Y2EB2bdyxmagA1aKjImtIjdbuz+RNghqGi9BhZJ/zQs4SFLOjcrQ== X-Received: by 2002:a05:600c:3317:b0:3e0:111:28a9 with SMTP id q23-20020a05600c331700b003e0011128a9mr1300861wmp.22.1675728992272; Mon, 06 Feb 2023 16:16:32 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:31 -0800 (PST)
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: =?utf-8?q?Christoph_M=C3=BCllner?=
Subject: [RFC PATCH 01/19] Inhibit early libcalls before ifunc support is ready
Date: Tue, 7 Feb 2023 01:16:00 +0100
Message-Id: <20230207001618.458947-2-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>
MIME-Version: 1.0
X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

From: Christoph Müllner

One of the few tasks that __libc_start_main_impl performs before ifunc
support is ready on many architectures is processing the AUX vector.
GCC can detect loop patterns in this code and turn them into libcalls,
which results in calls through ifunc pointers that are not yet
initialized.  Let's set the proper attributes on these early functions
to avoid libcalls.

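For readers unfamiliar with the attribute, here is a minimal sketch of its
effect.  It assumes that inhibit_loop_to_libcall expands to GCC's optimize
attribute with -fno-tree-loop-distribute-patterns, as glibc's
include/libc-symbols.h does when the compiler supports it.  The helper
early_clear is hypothetical and not part of this patch.

```c
#include <stddef.h>

/* Assumption: same expansion as in glibc's include/libc-symbols.h.
   It is defined locally here so the sketch builds outside of glibc.  */
#if defined __GNUC__ && !defined __clang__
# define inhibit_loop_to_libcall \
    __attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
#else
# define inhibit_loop_to_libcall
#endif

/* Hypothetical early-startup helper.  Without the attribute, GCC may
   recognize this loop at -O2/-O3 and emit a call to memset instead,
   which would go through an ifunc pointer in an ifunc-enabled libc.  */
static void
inhibit_loop_to_libcall
early_clear (unsigned char *buf, size_t len)
{
  for (size_t i = 0; i < len; i++)
    buf[i] = 0;
}
```

Whether the loop is kept can be checked by compiling with -O3 -S and
looking for a memset call in the generated assembly.
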
This was observed to be an issue (endless loop) in combination with:
- GCC upstream/master
- glibc upstream/master
- glibc built with -O3
- target arch RISC-V (RV64)
- experimental RISC-V ifunc support patches

Other combinations/architectures might be affected as well.

Signed-off-by: Christoph Müllner
---
 csu/libc-start.c | 1 +
 elf/dl-support.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/csu/libc-start.c b/csu/libc-start.c
index c3bb6d09bc..8566a54df5 100644
--- a/csu/libc-start.c
+++ b/csu/libc-start.c
@@ -231,6 +231,7 @@ STATIC int LIBC_START_MAIN (int (*main) (int, char **, char **
    locate constructors and destructors.  For statically linked
    executables, the relevant symbols are access directly.  */
 STATIC int
+inhibit_loop_to_libcall
 LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
                  int argc, char **argv,
 #ifdef LIBC_START_MAIN_AUXVEC_ARG
diff --git a/elf/dl-support.c b/elf/dl-support.c
index 9714f75db0..b0e9e1636a 100644
--- a/elf/dl-support.c
+++ b/elf/dl-support.c
@@ -242,6 +242,7 @@ __rtld_lock_define_initialized_recursive (, _dl_load_tls_lock)
 int _dl_clktck;
 
 void
+inhibit_loop_to_libcall
 _dl_aux_init (ElfW(auxv_t) *av)
 {
 #ifdef NEED_DL_SYSINFO

From patchwork Tue Feb 7 00:16:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 1738532
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=UagEb76s; dkim-atps=neutral
Received: from sourceware.org
(ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kGp4rLPz23jB for ; Tue, 7 Feb 2023 11:17:38 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 51003385B530 for ; Tue, 7 Feb 2023 00:17:35 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x331.google.com (mail-wm1-x331.google.com [IPv6:2a00:1450:4864:20::331]) by sourceware.org (Postfix) with ESMTPS id 4B50D3858D37 for ; Tue, 7 Feb 2023 00:16:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4B50D3858D37 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x331.google.com with SMTP id c4-20020a1c3504000000b003d9e2f72093so12043045wma.1 for ; Mon, 06 Feb 2023 16:16:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=gYmctAQB3ZzYxVFHCKBucn0X/zwXMtqCGXXbdyEuwY0=; b=UagEb76suXcPOoOSV/zpSo4OY0zwi841DjiLjhYlj1jqadlF7F3XahHWWP8+R4Xhey asJT9Si4i3kFZ3In/DhZFv7sRGbiLA58OG0fE1Bf5X0fqzGqoxasrm94LkoI0fB50E1j qRTwnu5REXqLJjwyaqr3SJgjnBoyYi/MrpP3St4NR6pV/+kAgrf8h6GHaTUqDtFH1579 /hSdeJbD0e0f/LlXbluGfRxX/XyODb75+glBvO6583QUHco+W6hR7oNLN8zSeV/uupo2 xFeO1YQmLR30QvUxFwKVdD3H0Ymb/R4eli5ozRmBkPJom29mHw/WKi0WSX69bZufxTCp hurA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc 
:subject:date:message-id:reply-to; bh=gYmctAQB3ZzYxVFHCKBucn0X/zwXMtqCGXXbdyEuwY0=; b=Sdzg0pNZcXQ69Kq/TpaDSBio+YUdRvvX6PtsWBkuSax+w63SpSsvkpGXgUQTNPVy93 5vlBZMzbZsdCa45eyXbwwbbLQwDKL/IZoey5pGqCiPRPGG4qCeRI/TMrwUIkICvdddOm wLJWSd9gzF0NfXoEpUWGkVeiB7mEa4Bti2g+LrZVYsPFI/JKasLsFbEDFMDiIifFi+7J lKIwJENIin9OVLrG2zAoMrMsRI5OHIVB9vxOYpKF9ox5GEz2IEx75UNihe/LWmddebqD vCRVt/015HGG2IqoT2t6fJmn2hrm4siBU8NdMColN6yjK196Go25Dp8dIWEIwhkcZF2J 8mew== X-Gm-Message-State: AO0yUKVD9LRGnVFK55KVvlLUGtxh2YTG9ygZ3KfDd0IfvCRZUDhOsIK6 sxQwIzubnGPHYIdsxzRRxcPfKASkBvt+ttgA X-Google-Smtp-Source: AK7set8kBHF/CG0a0mV5x3SfgSAyDxooEcNvSm4VDaqQiLJUDuPF9JNahlUxhKmixd0MY7P+Nl868g== X-Received: by 2002:a05:600c:3c8a:b0:3df:1f48:3d01 with SMTP id bg10-20020a05600c3c8a00b003df1f483d01mr773919wmb.37.1675728993619; Mon, 06 Feb 2023 16:16:33 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:33 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 02/19] riscv: LEAF: Use C_LABEL() to construct the asm name for a C symbol Date: Tue, 7 Feb 2023 01:16:01 +0100 Message-Id: <20230207001618.458947-3-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

From: Christoph Müllner

It is common practice in glibc to use C_LABEL() to construct the asm
name for a C symbol.  Let's do this for RISC-V as well, even if this is
essentially a non-functional change.

Signed-off-by: Christoph Müllner
---
 sysdeps/riscv/sys/asm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/riscv/sys/asm.h b/sysdeps/riscv/sys/asm.h
index 5432f2d5d2..b782cfa2f2 100644
--- a/sysdeps/riscv/sys/asm.h
+++ b/sysdeps/riscv/sys/asm.h
@@ -51,7 +51,7 @@
                 .globl  symbol;                 \
                 .align  2;                      \
                 .type   symbol,@function;       \
-symbol:                                         \
+        C_LABEL(symbol)                         \
                 cfi_startproc;
 
 /* Mark end of function.  */

From patchwork Tue Feb 7 00:16:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 1738535
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=pTKcYnFW; dkim-atps=neutral
Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No
client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kHW4l9sz23jB for ; Tue, 7 Feb 2023 11:18:15 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 94D2438493ED for ; Tue, 7 Feb 2023 00:18:13 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [IPv6:2a00:1450:4864:20::329]) by sourceware.org (Postfix) with ESMTPS id 99D443858D39 for ; Tue, 7 Feb 2023 00:16:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 99D443858D39 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x329.google.com with SMTP id o36so9916473wms.1 for ; Mon, 06 Feb 2023 16:16:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4/XNYwmF6e2mdGU8TghSnwh46KipOWK5wqSwyFy//q4=; b=pTKcYnFWm1u5KzNoxX6B51Jtk3r5LfZ9Hs9MbrGT6Od4HOptZ93GBaf7mt6vPmaXXE rUQre+z89eBiX31xThu2TYfhngs2v0Yr7MuGsc/AvRDtH8Kp1BexrNO3KzjIydJe+qcn PdwWbtggZ2wyZXZ3rEp3H0QG7klZAjh2Hp3f1w+hdja0sNWxbldoxc81aZ88D+WPl52e oXs8zb85fbD91vqReTGQvK7xQkMvDH0xoFMUg4A3LRGQ4g1CKOcPoTmnBqf/YvYST+oG QyAfD+UVgTefan2Uf/9ER+OfkzjJxbRO0S+MkdXIX0zkXf/oiVhpxmc+y2cZSxRsucRG whdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4/XNYwmF6e2mdGU8TghSnwh46KipOWK5wqSwyFy//q4=; b=bgle6QGNMrNTFdU0qIjGWHkX5HlzD9KEai2g6rYXkvGUcTTm2qsdHLRwI4TboXa5dF QBzfrBiH755nnR9cw1F4cDYiOnJolpcVGSO+ZNpCgcd1rIa3WBMG8iAysSkP7e/R2uuh 
zyaBnyF4Mc5VaAiNDJR8tX1L/3QQsiuAOBLSPkwbIqhkYPjGm21KksKARSm1jM9Qmaqt GDs6abX0K/F9y2yWOw+a0U6GUJS7xUROtBTnAUi1VQkaG+EME08I5vnggthEf4pmw2eW fk/eLqVinmyPm9NaD3jbD+nI4LeGV240XEkgtp1Dnrwr2rZJGR++3D3mFO8F2kp2QaLd iyXA== X-Gm-Message-State: AO0yUKUabTbRm3GaTr+LNZFfMf7D00A8g88s8+Vu9D4GNSWBEKqXZlU3 Qo6b7lx3Z/2a/y9duUY6W79SssiKukMSoTHP X-Google-Smtp-Source: AK7set9gTYZQdaXaqZnismXT4CI0V3C2bT5bBSt2AP2xVIXoIo+mPCoro4AT+K5CFpKemzRAFsUhlw== X-Received: by 2002:a05:600c:2ac8:b0:3d9:fb59:c16b with SMTP id t8-20020a05600c2ac800b003d9fb59c16bmr1249209wme.36.1675728994903; Mon, 06 Feb 2023 16:16:34 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:34 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 03/19] riscv: Add ENTRY_ALIGN() macro Date: Tue, 7 Feb 2023 01:16:02 +0100 Message-Id: <20230207001618.458947-4-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , 
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

From: Christoph Müllner

This patch adds an ENTRY_ALIGN() macro to generate aligned function
symbols in assembly files.  Since the LEAF() macro is a special case of
that, we change LEAF() to reflect this.

Signed-off-by: Christoph Müllner
---
 sysdeps/riscv/sys/asm.h | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/sysdeps/riscv/sys/asm.h b/sysdeps/riscv/sys/asm.h
index b782cfa2f2..de6394b984 100644
--- a/sysdeps/riscv/sys/asm.h
+++ b/sysdeps/riscv/sys/asm.h
@@ -46,14 +46,18 @@
 # endif
 #endif
 
-/* Declare leaf routine.  */
-#define LEAF(symbol)                            \
-                .globl  symbol;                 \
-                .align  2;                      \
-                .type   symbol,@function;       \
+/* Define an entry point visible from C with custom p2-alignment.  */
+#define ENTRY_ALIGN(symbol, align)              \
+                .globl  symbol;                 \
+                .p2align align;                 \
+                .type   symbol,@function;       \
         C_LABEL(symbol)                         \
                 cfi_startproc;
 
+/* Declare leaf routine.  */
+#define LEAF(symbol)                            \
+        ENTRY_ALIGN (symbol, 1)
+
 /* Mark end of function.  */
 #undef END
 #define END(function)                           \

From patchwork Tue Feb 7 00:16:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 1738534
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=PM1C+fHQ; dkim-atps=neutral
Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kH34mkjz23y5 for ; Tue, 7 Feb 2023 11:17:51 +1100 (AEDT)
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 568B438493E5 for ; Tue, 7 Feb 2023 00:17:49 +0000 (GMT)
X-Original-To: libc-alpha@sourceware.org
Delivered-To: libc-alpha@sourceware.org
Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by sourceware.org (Postfix) with ESMTPS id DCC0C3858C50 for ; Tue, 7 Feb 2023 00:16:37 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org DCC0C3858C50
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu
Received: by mail-wm1-x32b.google.com with SMTP id j29-20020a05600c1c1d00b003dc52fed235so10240132wms.1 for ; Mon, 06 Feb 2023 16:16:37 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=n9dxq7kpd010HIi1OLQl6ApK22PyYnLk2lBE1dozjxw=; b=PM1C+fHQd/x1ptXfiv3U5ZBaN39Uz0ZtSenwYIexpAt47ycXmpdilKW0P1P191XAlo dcLc5AzN1SMcS1DhfvOo776qo6BYAlYMwDTx2CC7FH4x3n15j+RpEnMi3d3FPHXLpXPU DjoyfSZu81m2/B9BrZsXG0NtQ1/S9iH6IPtAIU2LMBbmj6qR9uJ6sHNSTyOEF28OwHOD oRZ6xhM0gBA3S5CX79oqIOUDLfsXSEBwNHf+EFtnupWLtnRnWsGL6XnRI19xOHXpxfEs nYzXIMMu616FRHhm6ztJcRhzdmCXeYrFQRszflh3y4rQmiV9c5Uou32HA3OWeGq16shr jNzA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=n9dxq7kpd010HIi1OLQl6ApK22PyYnLk2lBE1dozjxw=; b=Guou766CKlRQ6GrKGkmjKUEoTx02Ho9OwYE9FuMSRr9cy5PCm8krS2tD3BBJDja2Gg XmewwoyIFhi6dpX43R3Eym3iT0awQbBLoUu7v4K/F91Fe7+kK5+rtjpzrPBplUUwxYw9 kYwrxohsMrfgICX4a4BbgaVomZoSKePxYYaCeQSU514vF1h4Y7OoZXROt57OpGsAHldi ML5mwYCdlsiqVaMdKQQ06HeemyoSCbI9/FkkkGXi/gtU8Supd+2y2AAIUWLg9wjW95Q4 ycMXo8S8SB9Z1SMoFU979Ps/lgRHXTgH2Saw/dQH6Dx60QvKAloZXgYf82owb+BD8C/R 5YCQ== X-Gm-Message-State: AO0yUKUF1LEkTwDKN2qLAxi1JGHbgTPtOQXDEG5MkSImZ6RvRMAJAbuo hS2CWVInbkHuoddZlMGDomhKW1twMeL/Gu9X X-Google-Smtp-Source: AK7set9rmiqFH7Z8vCr4qUWnfeFpMoDS21e5kz2E5TDWA8ENumNeIjxSbtEToCr2AIYaHQJcqGMxuA== X-Received: by 2002:a05:600c:2b46:b0:3e0:185:44af with SMTP id e6-20020a05600c2b4600b003e0018544afmr805115wmf.20.1675728996018; Mon, 06 Feb 2023 16:16:36 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:35 -0800 (PST)
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: =?utf-8?q?Christoph_M=C3=BCllner?=
Subject: [RFC PATCH 04/19] riscv: Add hart feature run-time detection framework
Date: Tue, 7 Feb 2023 01:16:03 +0100
Message-Id: <20230207001618.458947-5-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>
MIME-Version: 1.0
X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org
Sender: "Libc-alpha"

From: Christoph Müllner

This patch introduces a framework to detect and store hart features
(e.g. ISA extensions and their parameters) for RISC-V.

This patch does not introduce a concrete mechanism for run-time
detection, but implements everything so that such a mechanism can be
introduced.

Most of the changes in this patch are inspired by similar code for
other architectures, so nothing surprising should be hidden here.

Signed-off-by: Christoph Müllner
---
 sysdeps/riscv/dl-machine.h                    | 13 ++++
 sysdeps/riscv/ldsodefs.h                      |  1 +
 sysdeps/unix/sysv/linux/riscv/dl-procinfo.c   | 62 +++++++++++++++++++
 sysdeps/unix/sysv/linux/riscv/dl-procinfo.h   | 46 ++++++++++++++
 sysdeps/unix/sysv/linux/riscv/hart-features.c | 43 +++++++++++++
 sysdeps/unix/sysv/linux/riscv/hart-features.h | 26 ++++++++
 sysdeps/unix/sysv/linux/riscv/libc-start.c    | 29 +++++++++
 7 files changed, 220 insertions(+)
 create mode 100644 sysdeps/unix/sysv/linux/riscv/dl-procinfo.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/dl-procinfo.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/hart-features.c
 create mode 100644 sysdeps/unix/sysv/linux/riscv/hart-features.h
 create mode 100644 sysdeps/unix/sysv/linux/riscv/libc-start.c

diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
index c0c9bd93ad..43f4f96c0e 100644
--- a/sysdeps/riscv/dl-machine.h
+++ b/sysdeps/riscv/dl-machine.h
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 
 #ifndef _RTLD_PROLOGUE
 # define _RTLD_PROLOGUE(entry) \
@@ -148,6 +149,18 @@ elf_machine_fixup_plt (struct link_map *map, lookup_t t,
   return *reloc_addr = value;
 }
 
+#define DL_PLATFORM_INIT dl_platform_init ()
+
+static inline void __attribute__ ((unused))
+dl_platform_init (void)
+{
+#ifdef SHARED
+  /* init_hart_features has been called early from __libc_start_main in
+     static executable.  */
+  init_hart_features (&GLRO(dl_riscv_hart_features));
+#endif /* SHARED */
+}
+
 #endif /* !dl_machine_h */
 
 #ifdef RESOLVE_MAP
diff --git a/sysdeps/riscv/ldsodefs.h b/sysdeps/riscv/ldsodefs.h
index 90e95e60c5..4b184de255 100644
--- a/sysdeps/riscv/ldsodefs.h
+++ b/sysdeps/riscv/ldsodefs.h
@@ -20,6 +20,7 @@
 #define _RISCV_LDSODEFS_H 1
 
 #include
+#include
 
 struct La_riscv_regs;
 struct La_riscv_retval;
diff --git a/sysdeps/unix/sysv/linux/riscv/dl-procinfo.c b/sysdeps/unix/sysv/linux/riscv/dl-procinfo.c
new file mode 100644
index 0000000000..ce137d10c4
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/dl-procinfo.c
@@ -0,0 +1,62 @@
+/* Data for RISC-V version of processor capability information.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   .  */
+
+/* This information must be kept in sync with the _DL_PLATFORM_COUNT
+   definitions in procinfo.h.
+
+   If anything should be added here check whether the size of each string
+   is still ok with the given array size.
+
+   All the #ifdefs in the definitions are quite irritating but
+   necessary if we want to avoid duplicating the information.  There
+   are three different modes:
+
+   - PROCINFO_DECL is defined.  This means we are only interested in
+     declarations.
+
+   - PROCINFO_DECL is not defined:
+
+     + if SHARED is defined the file is included in an array
+       initializer.  The .element = { ... } syntax is needed.
+
+     + if SHARED is not defined a normal array initialization is
+       needed.
+  */
+
+#ifndef PROCINFO_CLASS
+# define PROCINFO_CLASS
+#endif
+
+#if !IS_IN (ldconfig)
+# if !defined PROCINFO_DECL && defined SHARED
+  ._dl_riscv_hart_features
+# else
+PROCINFO_CLASS struct hart_features _dl_riscv_hart_features
+# endif
+# ifndef PROCINFO_DECL
+= { }
+# endif
+# if !defined SHARED || defined PROCINFO_DECL
+;
+# else
+,
+# endif
+#endif
+
+#undef PROCINFO_DECL
+#undef PROCINFO_CLASS
diff --git a/sysdeps/unix/sysv/linux/riscv/dl-procinfo.h b/sysdeps/unix/sysv/linux/riscv/dl-procinfo.h
new file mode 100644
index 0000000000..27aaebe02d
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/dl-procinfo.h
@@ -0,0 +1,46 @@
+/* RISC-V version of processor capability information handling macros.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   .  */
+
+#ifndef _DL_PROCINFO_H
+#define _DL_PROCINFO_H 1
+
+#include
+#include
+#include
+#include
+
+/* We cannot provide a general printing function.  */
+#define _dl_procinfo(word, val) -1
+
+/* There are no hardware capabilities defined.  */
+#define _dl_hwcap_string(idx) ""
+
+/* By default there is no important hardware capability.  */
+#define HWCAP_IMPORTANT (0)
+
+/* We don't have any hardware capabilities.  */
+#define _DL_HWCAP_COUNT 0
+
+#define _dl_string_hwcap(str) (-1)
+
+/* There're no platforms to filter out.  */
+#define _DL_HWCAP_PLATFORM 0
+
+#define _dl_string_platform(str) (-1)
+
+#endif /* dl-procinfo.h */
diff --git a/sysdeps/unix/sysv/linux/riscv/hart-features.c b/sysdeps/unix/sysv/linux/riscv/hart-features.c
new file mode 100644
index 0000000000..41111eff57
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/hart-features.c
@@ -0,0 +1,43 @@
+/* Initialize hart feature data.  RISC-V version.
+   This file is part of the GNU C Library.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+
+/* The code in this file is executed very early, so we cannot call
+   indirect functions because ifunc support is not initialized.
+   Therefore this file adds a few simple helper functions to avoid
+   dependencies to functions outside of this file.  */
+
+static inline void
+inhibit_loop_to_libcall
+simple_memset (void *s, int c, size_t n)
+{
+  char *p = (char*)s;
+  while (n != 0)
+    {
+      *p++ = c;
+      n--;
+    }
+}
+
+/* Discover hart features and store them.  */
+static inline void
+init_hart_features (struct hart_features *hart_features)
+{
+  simple_memset (hart_features, 0, sizeof (*hart_features));
+}
diff --git a/sysdeps/unix/sysv/linux/riscv/hart-features.h b/sysdeps/unix/sysv/linux/riscv/hart-features.h
new file mode 100644
index 0000000000..a417cbc326
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/hart-features.h
@@ -0,0 +1,26 @@
+/* Initialize CPU feature data.  RISC-V version.
+   This file is part of the GNU C Library.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef _CPU_FEATURES_RISCV_H
+#define _CPU_FEATURES_RISCV_H
+
+struct hart_features
+{
+};
+
+#endif /* _CPU_FEATURES_RISCV_H */
diff --git a/sysdeps/unix/sysv/linux/riscv/libc-start.c b/sysdeps/unix/sysv/linux/riscv/libc-start.c
new file mode 100644
index 0000000000..57c7c09223
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/riscv/libc-start.c
@@ -0,0 +1,29 @@
+/* Override csu/libc-start.c on RISC-V
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#ifndef SHARED
+
+# include
+# include
+
+extern struct hart_features _dl_riscv_hart_features;
+
+# define ARCH_INIT_CPU_FEATURES() init_hart_features (&_dl_riscv_hart_features)
+
+#endif
+#include

From patchwork Tue Feb 7 00:16:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 1738538
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=)
Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=YLVwij0Q; dkim-atps=neutral
Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kJ34vDgz23jB for ; Tue, 7 Feb 2023 11:18:43 +1100 (AEDT)
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 783EF388451E for ; Tue, 7 Feb 2023 00:18:41 +0000 (GMT)
X-Original-To: libc-alpha@sourceware.org
Delivered-To: libc-alpha@sourceware.org
Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com
[IPv6:2a00:1450:4864:20::32e]) by sourceware.org (Postfix) with ESMTPS id 3AF933858C60 for ; Tue, 7 Feb 2023 00:16:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 3AF933858C60 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x32e.google.com with SMTP id j29-20020a05600c1c1d00b003dc52fed235so10240156wms.1 for ; Mon, 06 Feb 2023 16:16:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=W4ismAtBWb/CwuQ6RLImTVydg+v9Y4rq5yN1JVec2Bk=; b=YLVwij0Q7JLIECsVQkUkko4BxeUNo92ymCZtwfQIpBcys3hGs0NrkU42hQ5Y+nk8m4 p8LhwDTJKiNLnd9yYtZb5jjcuL2gT4Qm1nEsSw+gyC3EGLfF6ZFcSooubM+/JyVRBJoQ lX0YTMp+AHtKq0zZ2Mzj/DcCmpPQHs63Br7Nl0R1AHCa/QBKGE/kfxTvKPeByVXPaER2 e6knrEUdlV+n3HVCMsPWD4SqU8gjq1H1mYQ3xdfMmpmrwinuzIep/N19upgINPdDMC/+ hWSal+14C3KX/wwFILacba42VLbhqNkTaePdsHfnlOG/e4sDiyJnbQ33tjsvMx0pMKmy Lcmg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=W4ismAtBWb/CwuQ6RLImTVydg+v9Y4rq5yN1JVec2Bk=; b=hZqOUir+/YonCb/Be9ktW93cZHHRLhiADuYmq7WRRRdfgbT5udxirK/pOUbj2CDFGf mOhl87hDeEZ6qZtpShTk19xNNA9Dsp/Ie+ZDEAVbA9SjZBk6J2igAhLAqObo+U3s4sfF t+8XBDG40s64HyRSDbWC25iIQtHsMqSWD1MGZXP6UD8TMGjvGasbUX2DqQWox+54chWZ v3Wd3JblmRA2yt7yHWDQvSAVjrH375EQfPXzC+48sWp+8RC64oAHaEc07e5PDVdQ82gY x1AzYSKwj7iMSe79T8d5/tFp+rQ5/augGSw+zhF5EiWJ05Gf/7Q+6K61fp0dUyrnxyDt xbAg== X-Gm-Message-State: AO0yUKUnFWN6zAL3V+D7HSuUiBuxmJXvN/ySrr1jouDOYB9Dy6Qv8J/H +5fYFi+szRk9DSVDApNLUy/U9SufnFtwMlcM X-Google-Smtp-Source: AK7set9n2tFF+tVMr/nsI4hlQPWZBXvXEiM5lGcaIIjsA9DluP4Obd9WKuyvQNTu+EHlhBVuJqDLXA== 
X-Received: by 2002:a05:600c:18a6:b0:3dd:1a8b:7374 with SMTP id x38-20020a05600c18a600b003dd1a8b7374mr1342444wmp.5.1675728997491; Mon, 06 Feb 2023 16:16:37 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:37 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 05/19] riscv: Introduction of ISA extensions Date: Tue, 7 Feb 2023 01:16:04 +0100 Message-Id: <20230207001618.458947-6-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" From: Christoph Müllner The RISC-V ISA consists of a base ISA and a multitude of optional ISA extensions. 
This patch introduces some of them, which are expected to be relevant in the near future for ifunc-based optimizations in glibc: * Base (i or e) * M * A * F * D * C * Zicsr * Zifencei * G * Zihintpause * zicbom * zicbop * zicboz * zawrs * zba * zbb * zbc * zbs Given the DSL-like definition it should be trivial to extend the list. Signed-off-by: Christoph Müllner --- sysdeps/unix/sysv/linux/riscv/hart-features.h | 27 +++++++ .../unix/sysv/linux/riscv/isa-extensions.def | 72 +++++++++++++++++++ 2 files changed, 99 insertions(+) create mode 100644 sysdeps/unix/sysv/linux/riscv/isa-extensions.def diff --git a/sysdeps/unix/sysv/linux/riscv/hart-features.h b/sysdeps/unix/sysv/linux/riscv/hart-features.h index a417cbc326..dd94685676 100644 --- a/sysdeps/unix/sysv/linux/riscv/hart-features.h +++ b/sysdeps/unix/sysv/linux/riscv/hart-features.h @@ -19,8 +19,35 @@ #ifndef _CPU_FEATURES_RISCV_H #define _CPU_FEATURES_RISCV_H +#define IS_RV32() \ + (GLRO (dl_riscv_hart_features).xlen == 32) + +#define IS_RV64() \ + (GLRO (dl_riscv_hart_features).xlen == 64) + +#define HAVE_RV(E) \ + (GLRO (dl_riscv_hart_features).have_ ## E == 1) + +#define HAVE_CBOM_BLOCKSIZE(n) \ + (GLRO (dl_riscv_hart_features).cbom_blocksize == n) + +#define HAVE_CBOZ_BLOCKSIZE(n) \ + (GLRO (dl_riscv_hart_features).cboz_blocksize == n) + struct hart_features { + const char* rt_march; + unsigned xlen; +#define ISA_EXT(e) \ + unsigned have_##e:1; +#define ISA_EXT_GROUP(g, ...) \ + unsigned have_##g:1; +#include "isa-extensions.def" + + const char* rt_cbom_blocksize; + unsigned cbom_blocksize; + const char* rt_cboz_blocksize; + unsigned cboz_blocksize; }; #endif /* _CPU_FEATURES_RISCV_H */ diff --git a/sysdeps/unix/sysv/linux/riscv/isa-extensions.def b/sysdeps/unix/sysv/linux/riscv/isa-extensions.def new file mode 100644 index 0000000000..eb05823998 --- /dev/null +++ b/sysdeps/unix/sysv/linux/riscv/isa-extensions.def @@ -0,0 +1,72 @@ +/* ISA extensions of RISC-V. 
+ Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define RISC-V ISA extension. */ +#ifndef ISA_EXT +# define ISA_EXT(e) +#endif + +/* Define RISC-V ISA extension group. */ +#ifndef ISA_EXT_GROUP +# define ISA_EXT_GROUP(...) +#endif + +/* + * Here are the ordering rules of extension naming defined by RISC-V + * specification : + * 1. All extensions should be separated from other multi-letter extensions + * by an underscore. + * 2. The first letter following the 'Z' conventionally indicates the most + * closely related alphabetical extension category, IMAFDQLCBKJTPVH. + * If multiple 'Z' extensions are named, they should be ordered first + * by category, then alphabetically within a category. + * 3. Standard supervisor-level extensions (starts with 'S') should be + * listed after standard unprivileged extensions. If multiple + * supervisor-level extensions are listed, they should be ordered + * alphabetically. + * 4. Non-standard extensions (starts with 'X') must be listed after all + * standard extensions. They must be separated from other multi-letter + * extensions by an underscore. 
+ */ + +ISA_EXT (i) +ISA_EXT (e) + +ISA_EXT (m) +ISA_EXT (a) +ISA_EXT (f) +ISA_EXT (d) +ISA_EXT (c) +ISA_EXT (zicsr) +ISA_EXT (zifencei) +ISA_EXT_GROUP (g, i, m, a, f, d, zicsr, zifencei) + +ISA_EXT (zicbom) +ISA_EXT (zicbop) +ISA_EXT (zicboz) +ISA_EXT (zihintpause) + +ISA_EXT (zawrs) + +ISA_EXT (zba) +ISA_EXT (zbb) +ISA_EXT (zbc) +ISA_EXT (zbs) + +#undef ISA_EXT +#undef ISA_EXT_GROUP From patchwork Tue Feb 7 00:16:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738541 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=MqABSmo8; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kK34RYBz23jB for ; Tue, 7 Feb 2023 11:19:35 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 911003887F79 for ; Tue, 7 Feb 2023 00:19:33 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by sourceware.org (Postfix) with ESMTPS id 0ABCC3858C62 for ; Tue, 7 Feb 2023 00:16:41 +0000 (GMT) DMARC-Filter: OpenDMARC Filter 
v1.4.2 sourceware.org 0ABCC3858C62 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x335.google.com with SMTP id z13so2433241wmp.2 for ; Mon, 06 Feb 2023 16:16:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=y0pe2Jlsqh24PBbKuqrb5puaOeETVYzfcrURNZiRsmY=; b=MqABSmo8hzx5QnT9kaVenCrKDTUIhXr94oyyIg8EW+Cej+XmNvZTl/lWHkipnPFioG pIyay/jQjtTTTFLqtPiyxuzglhpG6MfV2zcKY7joBFQxTCJ4B3jo+v03iWRTo10GPN2U HnsKhWus1IvwUVnAPWvcoAiFv4Puf7UM3XgwfZz9yEgVujEvmuGJiCtRqe1AP0Z3uvWo e0vKGscV74siopnVWOuRqRKk6/MjdziLnXb2jnXTA03AElAGUSIzLmIXCkzGSg1KTPGh ln+Zp6ttxH0JVrMjRzisyNn0VB2rT/D2yWixRnmMmyt3oGvtOMk7w22Qp9a4kRGw5jGb mwdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=y0pe2Jlsqh24PBbKuqrb5puaOeETVYzfcrURNZiRsmY=; b=dSWSPxHLVAPWfIMDSV2GRBybZeiUTQ18DT6GVzj6KDK0nqcfMY9jd8dFmvtWAJcWu1 pL3wCUTVW9aKgiO4vdZtnXhQnNLYxEiLEDaBsX+hYu4BsZG02RaV8MYPaHmlvvKWhIiH B/MmeNKdWTz2ttZUbHr9avH781f+5fqRnz85ohBIG7TtCndjyKOeP5WRKvaZGDYZHrwH 2+w/GIRS9quSE8oCd3Na9JBVtYLmC2fD2CYpTjMN5l4R7JV3NeLf6X7hoKJ85EPUSRY+ GrQQHD6W4OfISxI0CuQMZdvV0a+TqSh13djbvOIQkka63j3YC4qE2IPEo4Ti8kPx/jP+ Wrfg== X-Gm-Message-State: AO0yUKXLIE2m6YkBsr9+OideyT8UwM4V4qWJSRVgkJ/2SRBusEXKSabB dJfXHqAfSBTOZQvvAQVR3k+Tq3VKOrRkD7of X-Google-Smtp-Source: AK7set9YkBkSA8kdqe27yYKdK0nPdCacI6z3Hpa3bOf+Oml0xcMZxuGgtx4zLdlolNAbokQuoC9acg== X-Received: by 2002:a05:600c:502b:b0:3dc:c05:9db6 with SMTP id n43-20020a05600c502b00b003dc0c059db6mr1252975wmr.33.1675728999235; Mon, 06 Feb 2023 16:16:39 -0800 (PST) Received: from 
beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:38 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 06/19] riscv: Adding ISA string parser for environment variables Date: Tue, 7 Feb 2023 01:16:05 +0100 Message-Id: <20230207001618.458947-7-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" From: Christoph Müllner RISC-V currently has no reliable mechanism to detect hart features, such as supported ISA extensions or cache block sizes, at run time. Not knowing the hart features limits the optimization strategies available to glibc (e.g. ifunc support requires run-time hart feature knowledge). To circumvent this limitation, this patch introduces a mechanism to get the hart features via environment variables: * RISCV_RT_MARCH represents a lower-case ISA string (-march string) E.g.
RISCV_RT_MARCH=rv64gc_zicboz * RISCV_RT_CBOM_BLOCKSIZE represents the cbom instruction block size E.g. RISCV_RT_CBOM_BLOCKSIZE=64 * RISCV_RT_CBOZ_BLOCKSIZE represents the cboz instruction block size These environment variables are parsed during startup, and the ISA extensions found are stored in a struct (hart_features) for evaluation by the dynamic dispatching code. As the parser code is executed very early, we cannot call functions that have direct or indirect (via getenv()) dependencies on strlen() and strncmp(), as these functions cannot be called before the ifunc support is initialized. Therefore, this patch contains its own helper functions for strlen(), strncmp(), and getenv(). Signed-off-by: Christoph Müllner --- sysdeps/unix/sysv/linux/riscv/hart-features.c | 294 ++++++++++++++++++ .../unix/sysv/linux/riscv/macro-for-each.h | 24 ++ 2 files changed, 318 insertions(+) create mode 100644 sysdeps/unix/sysv/linux/riscv/macro-for-each.h diff --git a/sysdeps/unix/sysv/linux/riscv/hart-features.c b/sysdeps/unix/sysv/linux/riscv/hart-features.c index 41111eff57..6de41a26cc 100644 --- a/sysdeps/unix/sysv/linux/riscv/hart-features.c +++ b/sysdeps/unix/sysv/linux/riscv/hart-features.c @@ -17,12 +17,17 @@ . */ #include +#include +#include /* The code in this file is executed very early, so we cannot call indirect functions because ifunc support is not initialized. Therefore this file adds a few simple helper functions to avoid dependencies to functions outside of this file.
*/ +#define xstr(s) str(s) +#define str(s) #s + static inline void inhibit_loop_to_libcall simple_memset (void *s, int c, size_t n) @@ -35,9 +40,298 @@ simple_memset (void *s, int c, size_t n) } } +static inline size_t +inhibit_loop_to_libcall +simple_strlen (const char *s) +{ + size_t n = 0; + char c = *s; + while (c != 0) + { + s++; + n++; + c = *s; + } + return n; +} + +static inline int +inhibit_loop_to_libcall +simple_strncmp (const char *s1, const char *s2, size_t n) +{ + while (n != 0) + { + if (*s1 == 0 || *s1 != *s2) + return *((const unsigned char *)s1) - *((const unsigned char *)s2); + n--; + s1++; + s2++; + } + return 0; +} + +extern char **__environ; +static inline char* +simple_getenv (const char *name) +{ + char **ep; + uint16_t name_start; + + if (__environ == NULL || name[0] == 0 || name[1] == 0) + return NULL; + + size_t len = simple_strlen (name); +#if _STRING_ARCH_unaligned + name_start = *(const uint16_t *) name; +#else + name_start = (((const unsigned char *) name)[0] + | (((const unsigned char *) name)[1] << 8)); +#endif + len -= 2; + name += 2; + + for (ep = __environ; *ep != NULL; ++ep) + { +#if _STRING_ARCH_unaligned + uint16_t ep_start = *(uint16_t *) *ep; +#else + uint16_t ep_start = (((unsigned char *) *ep)[0] + | (((unsigned char *) *ep)[1] << 8)); +#endif + if (name_start == ep_start && !simple_strncmp (*ep + 2, name, len) + && (*ep)[len + 2] == '=') + return &(*ep)[len + 3]; + } + return NULL; +} + +/* Check if the given number is a power of 2. + Return true if so, or false otherwise. */ +static inline int +is_power_of_two (unsigned long v) +{ + return v != 0 && (v & (v - 1)) == 0; +} + +/* Check if the given string str starts with + the prefix pre. Return true if so, or false + otherwise. */ +static inline int +starts_with (const char *str, const char *pre) +{ + return simple_strncmp (pre, str, simple_strlen (pre)) == 0; +} + +/* Lower all characters of a string up to the + first NUL-character in the string.
*/ +static inline void +strtolower (char *s) +{ + char c = *s; + while (c != '\0') + { + if (c >= 'A' && c <= 'Z') + *s = c + 'a' - 'A'; + s++; + c = *s; + } +} + +/* Count the number of detected extensions. */ +static inline unsigned long +count_extensions (struct hart_features *hart_features) +{ + unsigned long n = 0; +#define ISA_EXT(e) \ + if (hart_features->have_##e == 1) \ + n++; +#define ISA_EXT_GROUP(g, ...) \ + if (hart_features->have_##g == 1) \ + n++; +#include "isa-extensions.def" + return n; +} + +/* Check if the given character is not '0'-'9'. */ +static inline int +notanumber (const char c) +{ + return (c < '0' || c > '9'); +} + +/* Parse RISCV_RT_MARCH and store found extensions. */ +static inline void +parse_rt_march (struct hart_features *hart_features) +{ + const char* s = simple_getenv ("RISCV_RT_MARCH"); + if (s == NULL) + goto end; + + hart_features->rt_march = s; + + /* "RISC-V ISA strings begin with either RV32I, RV32E, RV64I, or RV128I + indicating the supported address space size in bits for the base + integer ISA." */ + if (starts_with (s, "rv32") && notanumber (*(s+4))) + { + hart_features->xlen = 32; + s += 4; + } + else if (starts_with (s, "rv64") && notanumber (*(s+4))) + { + hart_features->xlen = 64; + s += 4; + } + else if (starts_with (s, "rv128") && notanumber (*(s+5))) + { + hart_features->xlen = 128; + s += 5; + } + else + { + goto fail; + } + + /* Parse the extensions. */ + while (*s != '\0') + { + const char *s_old = s; +#define ISA_EXT(e) \ + else if (starts_with (s, xstr (e))) \ + { \ + hart_features->have_##e = 1; \ + s += simple_strlen (xstr (e)); \ + } +#define ISA_EXT_GROUP(g, ...) \ + ISA_EXT (g) + if (0); +#include "isa-extensions.def" + + /* Consume optional version information. */ + while (*s >= '0' && *s <= '9') + s++; + while (*s == 'p') + s++; + while (*s >= '0' && *s <= '9') + s++; + + /* Consume optional '_'. */ + if (*s == '_') + s++; + + /* If we got stuck, bail out.
*/ + if (s == s_old) + goto fail; + } + + /* Propagate subsets (until we reach a fixpoint). */ + unsigned long n = count_extensions (hart_features); + while (1) + { + /* Forward-propagation. E.g.: + if (hart_features->have_g == 1) + { + hart_features->have_i = 1; + ... + hart_features->have_zifencei = 1; + } */ +#define ISA_EXT_GROUP_HEAD(y) \ + if (hart_features->have_##y) \ + { +#define ISA_EXT_GROUP_SUBSET(s) \ + hart_features->have_##s = 1; +#define ISA_EXT_GROUP_TAIL(z) \ + } +#define ISA_EXT_GROUP(x, ...) \ + ISA_EXT_GROUP_HEAD (x) \ + FOR_EACH (ISA_EXT_GROUP_SUBSET, __VA_ARGS__) \ + ISA_EXT_GROUP_TAIL (x) +#include "isa-extensions.def" +#undef ISA_EXT_GROUP_HEAD +#undef ISA_EXT_GROUP_SUBSET +#undef ISA_EXT_GROUP_TAIL + + /* Backward-propagation. E.g.: + if (1 + && hart_features->have_i == 1 + ... + && hart_features->have_zifencei == 1 + ) + hart_features->have_g = 1; */ +#define ISA_EXT_GROUP_HEAD(y) \ + if (1 +#define ISA_EXT_GROUP_SUBSET(s) \ + && hart_features->have_##s == 1 +#define ISA_EXT_GROUP_TAIL(z) \ + ) \ + hart_features->have_##z = 1; +#define ISA_EXT_GROUP(x, ...) \ + ISA_EXT_GROUP_HEAD (x) \ + FOR_EACH (ISA_EXT_GROUP_SUBSET, __VA_ARGS__) \ + ISA_EXT_GROUP_TAIL (x) +#include "isa-extensions.def" +#undef ISA_EXT_GROUP_HEAD +#undef ISA_EXT_GROUP_SUBSET +#undef ISA_EXT_GROUP_TAIL + + unsigned long n2 = count_extensions (hart_features); + /* Stop if fix-point reached. */ + if (n == n2) + break; + n = n2; + } + +end: + return; + +fail: + hart_features->rt_march = NULL; +} + +/* Parse RISCV_RT_CBOM_BLOCKSIZE and store value. 
*/ +static inline void +parse_rt_cbom_blocksize (struct hart_features *hart_features) +{ + hart_features->rt_cbom_blocksize = NULL; + hart_features->cbom_blocksize = 0; + + const char *s = simple_getenv ("RISCV_RT_CBOM_BLOCKSIZE"); + if (s == NULL) + return; + + uint64_t v = _dl_strtoul (s, NULL); + if (!is_power_of_two (v)) + return; + + hart_features->rt_cbom_blocksize = s; + hart_features->cbom_blocksize = v; +} + +/* Parse RISCV_RT_CBOZ_BLOCKSIZE and store value. */ +static inline void +parse_rt_cboz_blocksize (struct hart_features *hart_features) +{ + hart_features->rt_cboz_blocksize = NULL; + hart_features->cboz_blocksize = 0; + + const char *s = simple_getenv ("RISCV_RT_CBOZ_BLOCKSIZE"); + if (s == NULL) + return; + + uint64_t v = _dl_strtoul (s, NULL); + if (!is_power_of_two (v)) + return; + + hart_features->rt_cboz_blocksize = s; + hart_features->cboz_blocksize = v; +} + /* Discover hart features and store them. */ static inline void init_hart_features (struct hart_features *hart_features) { simple_memset (hart_features, 0, sizeof (*hart_features)); + parse_rt_march (hart_features); + parse_rt_cbom_blocksize (hart_features); + parse_rt_cboz_blocksize (hart_features); } diff --git a/sysdeps/unix/sysv/linux/riscv/macro-for-each.h b/sysdeps/unix/sysv/linux/riscv/macro-for-each.h new file mode 100644 index 0000000000..524bef3c0a --- /dev/null +++ b/sysdeps/unix/sysv/linux/riscv/macro-for-each.h @@ -0,0 +1,24 @@ +/* Recursive macros implementation by David Mazières + https://www.scs.stanford.edu/~dm/blog/va-opt.html */ + +#ifndef _MACRO_FOR_EACH_H +#define _MACRO_FOR_EACH_H + +#define EXPAND1(...) __VA_ARGS__ +#define EXPAND2(...) EXPAND1 (EXPAND1 (EXPAND1 (EXPAND1 (__VA_ARGS__)))) +#define EXPAND3(...) EXPAND2 (EXPAND2 (EXPAND2 (EXPAND2 (__VA_ARGS__)))) +#define EXPAND4(...) EXPAND3 (EXPAND3 (EXPAND3 (EXPAND3 (__VA_ARGS__)))) +#define EXPAND(...) EXPAND4 (EXPAND4 (EXPAND4 (EXPAND4 (__VA_ARGS__)))) + +#define FOR_EACH(macro, ...) 
\ + __VA_OPT__ (EXPAND (FOR_EACH_HELPER (macro, __VA_ARGS__))) + +#define PARENS () + +#define FOR_EACH_HELPER(macro, a1, ...) \ + macro (a1) \ + __VA_OPT__ (FOR_EACH_AGAIN PARENS (macro, __VA_ARGS__)) + +#define FOR_EACH_AGAIN() FOR_EACH_HELPER + +#endif /* _MACRO_FOR_EACH_H */ From patchwork Tue Feb 7 00:16:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738536 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=glP8Zqku; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kHx0HR2z23y5 for ; Tue, 7 Feb 2023 11:18:37 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D684E3858031 for ; Tue, 7 Feb 2023 00:18:34 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wr1-x42c.google.com (mail-wr1-x42c.google.com [IPv6:2a00:1450:4864:20::42c]) by sourceware.org (Postfix) with ESMTPS id 36D8F3858C66 for ; Tue, 7 Feb 2023 00:16:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 36D8F3858C66 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: 
sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wr1-x42c.google.com with SMTP id g6so4068929wrv.1 for ; Mon, 06 Feb 2023 16:16:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YJYMF8Hd3zoU/fHdlx9uwPS6z7TdUxv0RapDL83AFlw=; b=glP8ZqkuGdEwUAtPEM0nyEcXOlMM22uONiQrXV8Fq/OXD7JIs2IC7b56GyaR3aV9WQ 42nSGKU3ORv/LTrDh3UZhpjvhTL1JD2FBs078xgBgVziITtgQvOKg6a46yT9f9vCFBeE leqvlLeUxJNCtW5SlWlFrY92A0BZV3Za2mTMmDhMa9PDwisaP9QFuzoDgFONaw5eIA5O MbxNIulfpWDHA/nRUHn6GnDgvcKA/P8hlpX77/nMvVabfQwNBNywiEyNCKCib+SxU2Ia QwnjTPbNU0qnWbwUpT+D80DzbWrXAyLYq/bI4L1qXTdRzbXcLrBfwcOpnyqFW21hW8Oo sCNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YJYMF8Hd3zoU/fHdlx9uwPS6z7TdUxv0RapDL83AFlw=; b=MsVpaYFHBOyxUTBmOn8ohKynArSjJ26R+q+YjN1/t6rRIf4WIiJWEV95f43U+PyCOP pOXrm44XxJBPm7000eRH1ijlrKi7iL2vsh5fWVcNv0g9XVEYaiKwKeAFo9xB6TM/QwBr q2ge3ViQD82ZTiloecIsTBIlh1d+Q0kqeOfetKm+a2RKB4WILtyb2RyEXPsbBHoIdSp/ ZxNFTHQQ80WQKaHAEE35MrOCsRFkz4aBsnfKFJRSdPIU0tat74DLqYjnolerXY1bHFXP m39cUcI6BYYCfhckWyZ1W3LkV05GG5hDcs4eS+akhPvXNp0rE4HLxQ2P0IeWw0olnKMx VrKw== X-Gm-Message-State: AO0yUKXLm7AzCzIGmBVYrH3PFjjzf6/fN51fun8InrwLNxD2kzAcH47c 232m31buOxGkDywopaarDWVxFmIysZBtK0zq X-Google-Smtp-Source: AK7set8E+JEmcWcPBbyz+HGe9LgnMqb50GnkrzEFpwOKS9+r8KeMQf6MyVZk6xhbRnLIUTUAWgugkQ== X-Received: by 2002:a05:6000:110:b0:2c3:ea52:7d0e with SMTP id o16-20020a056000011000b002c3ea527d0emr557164wrx.69.1675729000598; Mon, 06 Feb 2023 16:16:40 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:40 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 07/19] riscv: hart-features: Add fast_unaligned property Date: Tue, 7 Feb 2023 01:16:06 +0100 Message-Id: <20230207001618.458947-8-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" From: Christoph Müllner Having fast unaligned accesses opens the door to performance optimizations. Let's add this property to the hart features so that it can be provided via the environment variable RISCV_RT_FAST_UNALIGNED (e.g. by setting it to "1").
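As a rough illustration of how such a tuning property might be consumed, the sketch below parses a textual value and picks one of two routines based on it. It is not the code from this series: struct tuning, parse_fast_unaligned, select_copy and the two copy_* routines are hypothetical names, and the standard strtoul stands in for the glibc-internal _dl_strtoul so that the sketch builds outside of glibc.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the fast_unaligned field of
   struct hart_features.  */
struct tuning
{
  unsigned fast_unaligned;
};

/* Parse the textual value of the property.  A NULL value means the
   variable is not set and leaves the property at 0.  strtoul is an
   assumption here; the patch itself uses _dl_strtoul.  */
static void
parse_fast_unaligned (struct tuning *t, const char *value)
{
  t->fast_unaligned = 0;
  if (value == NULL)
    return;
  t->fast_unaligned = (unsigned) strtoul (value, NULL, 10);
}

/* Two hypothetical routine variants a selector could choose between.  */
static const char *
copy_bytewise (void)
{
  return "bytewise";
}

static const char *
copy_wordwise (void)
{
  return "wordwise";
}

typedef const char *(*copy_fn) (void);

/* Pick the word-wise variant only if the property is non-zero, which
   mirrors the != 0 comparison of the HAVE_FAST_UNALIGNED() macro.  */
static copy_fn
select_copy (const struct tuning *t)
{
  return t->fast_unaligned != 0 ? copy_wordwise : copy_bytewise;
}
```

In a program, the value would typically come from getenv ("RISCV_RT_FAST_UNALIGNED"); the early-startup code in this series cannot do that and uses its own simple_getenv helper instead.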
Signed-off-by: Christoph Müllner --- sysdeps/unix/sysv/linux/riscv/hart-features.c | 19 +++++++++++++++++++ sysdeps/unix/sysv/linux/riscv/hart-features.h | 5 +++++ 2 files changed, 24 insertions(+) diff --git a/sysdeps/unix/sysv/linux/riscv/hart-features.c b/sysdeps/unix/sysv/linux/riscv/hart-features.c index 6de41a26cc..b3b7955534 100644 --- a/sysdeps/unix/sysv/linux/riscv/hart-features.c +++ b/sysdeps/unix/sysv/linux/riscv/hart-features.c @@ -326,6 +326,22 @@ parse_rt_cboz_blocksize (struct hart_features *hart_features) hart_features->cboz_blocksize = v; } +/* Parse RISCV_RT_FAST_UNALIGNED and store value. */ +static inline void +parse_rt_fast_unaligned (struct hart_features *hart_features) +{ + hart_features->rt_fast_unaligned = NULL; + hart_features->fast_unaligned = 0; + + const char *s = simple_getenv ("RISCV_RT_FAST_UNALIGNED"); + if (s == NULL) + return; + + uint64_t v = _dl_strtoul (s, NULL); + hart_features->rt_fast_unaligned = s; + hart_features->fast_unaligned = v; +} + /* Discover hart features and store them. */ static inline void init_hart_features (struct hart_features *hart_features) @@ -334,4 +350,7 @@ init_hart_features (struct hart_features *hart_features) parse_rt_march (hart_features); parse_rt_cbom_blocksize (hart_features); parse_rt_cboz_blocksize (hart_features); + + /* Parse tuning properties. 
*/ + parse_rt_fast_unaligned (hart_features); } diff --git a/sysdeps/unix/sysv/linux/riscv/hart-features.h b/sysdeps/unix/sysv/linux/riscv/hart-features.h index dd94685676..b2cefd5748 100644 --- a/sysdeps/unix/sysv/linux/riscv/hart-features.h +++ b/sysdeps/unix/sysv/linux/riscv/hart-features.h @@ -34,6 +34,9 @@ #define HAVE_CBOZ_BLOCKSIZE(n) \ (GLRO (dl_riscv_hart_features).cboz_blocksize == n) +#define HAVE_FAST_UNALIGNED() \ + (GLRO (dl_riscv_hart_features).fast_unaligned != 0) + struct hart_features { const char* rt_march; @@ -48,6 +51,8 @@ struct hart_features unsigned cbom_blocksize; const char* rt_cboz_blocksize; unsigned cboz_blocksize; + const char* rt_fast_unaligned; + unsigned fast_unaligned; }; #endif /* _CPU_FEATURES_RISCV_H */ From patchwork Tue Feb 7 00:16:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738545 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=s4oiNbAH; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kKf1C1pz23j0 for ; Tue, 7 Feb 2023 11:20:06 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 
1C4D1389EC61 for ; Tue, 7 Feb 2023 00:20:03 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x32a.google.com (mail-wm1-x32a.google.com [IPv6:2a00:1450:4864:20::32a]) by sourceware.org (Postfix) with ESMTPS id 93E783858C20 for ; Tue, 7 Feb 2023 00:16:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 93E783858C20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x32a.google.com with SMTP id hn2-20020a05600ca38200b003dc5cb96d46so12012912wmb.4 for ; Mon, 06 Feb 2023 16:16:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=DlRBv+g2/GvW8WIJFIk6smVkrd3o3vNOVA1dOGbuqTo=; b=s4oiNbAH0ybZpdZNDNJ1gkuZCQOHI2GFH8OlmbzO6OnWJghLZT2+D9KlvuaJzcXwvV P5zopKz5dDcQf5Lin+j/4r9v5E+7LZcX+KOMl4cqUTduQPb9cN+ehyo4/EVHeqCWps+v yjFs2jcmw+E9T/Z8YCFnPKGiyLxYEq9Wu28h5TTfsBlyLTWnlTsKXRCCZ5DmrOF3Mqc1 WCfMWS1KeVWN5GeLOfSD5pxWlBWShrguSjN9OasC6AM2wDRfbnHrHr5irWWkpQbM37Ok Inc4WCBA7OazawlWJpJnkPVooj2u7WAgI0za0KesbJ+qJuYVHLJAmZQ+pAckxgTyXNZf gujg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DlRBv+g2/GvW8WIJFIk6smVkrd3o3vNOVA1dOGbuqTo=; b=nWDFQlVjNT0NC6QclqDNeF9sUp+mEoNlWf01sGOjoY278MVTw0pD0Ff4G79u2odw4w rFHghTOqd1tnWB49fEJijjCslurEjLVJ+VzfgVCxjR+IDCsVJ7CWaMX33dxzHSi3e42Z Py2LdoFrtVLBvHpJ7RxBPC70+9GgXQycRwKXiFY4QOgtM2QxfZ5FQADB5wZWDY58DiUt UFf+KpE/7onpF8qFWe03hHwkGpELnbELdO4bv2vSTNnVoeQiNAx/5ANXjHYZS4mkz9hj QtKT/KyiCvXr0erR5T0XDtKj4RytfhG5jZGLaLNh1uyjRCgcgXEaz7ntiAj8J7kcwFm7 4QaA== 
X-Gm-Message-State: AO0yUKWCYUeW4ifLWwxUW7+Dj2YGGZdS/sAPiNVGtFnc4lnnL6AYvGyk F2/sC6Na4pDXDkgkr7lA0nSpBuS24y66Dhta X-Google-Smtp-Source: AK7set9lHqkGtkWr9X/Eak0nRTPNnyKGvA/GzKT+xzih57s/iWIXIW8XST4W+yQbwC/ocabB5s+8Yg== X-Received: by 2002:a05:600c:755:b0:3e0:6c4:6a3a with SMTP id j21-20020a05600c075500b003e006c46a3amr1276186wmn.22.1675729002079; Mon, 06 Feb 2023 16:16:42 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:41 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 08/19] riscv: Add (empty) ifunc framework Date: Tue, 7 Feb 2023 01:16:07 +0100 Message-Id: <20230207001618.458947-9-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" From: Christoph Müllner This patch adds the missing pieces to add ifunc implementations of routines. 
No optimized code is added as part of this patch. Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 4 +++ sysdeps/riscv/multiarch/ifunc-impl-list.c | 39 +++++++++++++++++++++++ sysdeps/riscv/multiarch/init-arch.h | 24 ++++++++++++++ 3 files changed, 67 insertions(+) create mode 100644 sysdeps/riscv/multiarch/Makefile create mode 100644 sysdeps/riscv/multiarch/ifunc-impl-list.c create mode 100644 sysdeps/riscv/multiarch/init-arch.h diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile new file mode 100644 index 0000000000..68d3f5192f --- /dev/null +++ b/sysdeps/riscv/multiarch/Makefile @@ -0,0 +1,4 @@ +ifeq ($(subdir),string) +sysdep_routines += \ + +endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c new file mode 100644 index 0000000000..c0cdca45fd --- /dev/null +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -0,0 +1,39 @@ +/* Enumerate available IFUNC implementations of a function. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include +#include +#include +#include +#include +#include +#include + +/* Maximum number of IFUNC implementations. 
*/ +#define MAX_IFUNC 7 + +size_t +__libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, + size_t max) +{ + assert (max >= MAX_IFUNC); + + size_t i = 0; + + return i; +} diff --git a/sysdeps/riscv/multiarch/init-arch.h b/sysdeps/riscv/multiarch/init-arch.h new file mode 100644 index 0000000000..c9afeec07b --- /dev/null +++ b/sysdeps/riscv/multiarch/init-arch.h @@ -0,0 +1,24 @@ +/* Define INIT_ARCH for RISC-V. + This file is part of the GNU C Library. + Copyright (C) 2022 Free Software Foundation, Inc. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . 
*/ + +#ifndef _INIT_ARCH_RISCV +#define _INIT_ARCH_RISCV + +#define INIT_ARCH() + +#endif /* _INIT_ARCH_RISCV */ From patchwork Tue Feb 7 00:16:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738537 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=TCUSYMkL; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kJ14r08z23jB for ; Tue, 7 Feb 2023 11:18:41 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9F209382E6A7 for ; Tue, 7 Feb 2023 00:18:39 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x330.google.com (mail-wm1-x330.google.com [IPv6:2a00:1450:4864:20::330]) by sourceware.org (Postfix) with ESMTPS id 0FD1F385840A for ; Tue, 7 Feb 2023 00:16:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0FD1F385840A Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x330.google.com with SMTP id o36so9916622wms.1 for ; Mon, 
06 Feb 2023 16:16:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=f8F8/LKA8ApCKbih+IBLa4IoTM1LoL3MkN+4weX429k=; b=TCUSYMkLGviuqNi8d839d8FwaeAZ1RMzpWdvLuTWHU7VytTMdRWuQo9ntulej7YeXm 8gw6MeyMKEkIiERdVfsXuVOyI9GQK/8g77Dw4lBhH4HMYEPviMCDZa7APJkxegRTVful p451GPZNFpEAoSZnT5YFZFtwBwqkhx44L2+bsqvBfTobq98G1t2xHJXSIuUufiwOy6Zq b1vvT3SDhKDFXaMvGUaaJxdi9MQbmKolcNort5HDPw8AixaMPP0Gia6UVXQ4H0v7xIun Gc+yX7LfW1hY8irYfsjujPWfqSRKW4naBBtOrmf0MaT+HsopAQ/JDiWZG1c5egwi0aci ykOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=f8F8/LKA8ApCKbih+IBLa4IoTM1LoL3MkN+4weX429k=; b=BtCNP7OlqUHeXyUb/4qL1U2NV91nObJaPqSlC0qNe/klqkfbNuuvUAJB/EstXt2EUp 2aApNwjn8Op5Lropf1JWZKNUnfAMwDo7v+A9N2XmmcfbZ1P9H+e5XX6pb2v9yeC9wnuf V08XeEJ/SsjWtRxPIpwXWQWqgxovH6wDkWqWh/iQaxn88YbSjV11lUlPXgjhUYuKPcWN 2HHBfkSSaTmO634kdpKostKD5EnQR/LILagrA7uIjFXrKb4StEePQIBSoglLRs5gvm8z sATgn4GeEIF4RLseKVBvZ99NGUgQLGJce7PCcgX1yG+LSEcjDjwrjNDaiJfGoynaebMA HyMA== X-Gm-Message-State: AO0yUKWqB4tx+pNdFPFLwbgK9aC931iHt5l5r8hQCQYfCK9caO+xcQgC OeZ/gopx5qDZMCXhodl02l8R6W3OJlB05mEH X-Google-Smtp-Source: AK7set8Dojc18bonySy7RnjVsdTsV2BQrKsPazOcsubpFDj/Z0hxzoxZKExAuX5b7gMzJL0zPJEAjg== X-Received: by 2002:a05:600c:4aa0:b0:3d9:69fd:7707 with SMTP id b32-20020a05600c4aa000b003d969fd7707mr817023wmp.2.1675729003285; Mon, 06 Feb 2023 16:16:43 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:42 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 09/19] riscv: Add ifunc support for memset Date: Tue, 7 Feb 2023 01:16:08 +0100 Message-Id: <20230207001618.458947-10-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" From: Christoph Müllner This patch adds ifunc support for calls to memset to the RISC-V code. No optimized code is added as part of this patch. 
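For readers new to the mechanism, the sketch below models in portable C what an IFUNC provides: a resolver runs once and returns the implementation that all later calls use. The names memset_sketch, resolve_memset_sketch and memset_generic_sketch are made up for illustration. The patch itself relies on glibc's libc_ifunc macro and __memset_generic, and in real IFUNC the dynamic linker performs the one-time resolution rather than the NULL check used here.

```c
#include <stddef.h>

typedef void *(*memset_fn) (void *, int, size_t);

/* Hypothetical stand-in for __memset_generic: a plain byte loop,
   not glibc's implementation.  */
static void *
memset_generic_sketch (void *s, int c, size_t n)
{
  unsigned char *p = s;
  for (size_t i = 0; i < n; i++)
    p[i] = (unsigned char) c;
  return s;
}

/* Counts resolver invocations so the one-time behaviour is observable.  */
static int resolver_calls;

/* Plays the role of the IFUNC resolver.  With this patch only the
   generic version exists, so there is nothing to choose between yet.  */
static memset_fn
resolve_memset_sketch (void)
{
  resolver_calls++;
  return memset_generic_sketch;
}

/* Cached result of the resolver (assumption: single-threaded use).  */
static memset_fn memset_impl;

/* Public entry point: resolve on first use, then call through the
   cached pointer.  */
void *
memset_sketch (void *s, int c, size_t n)
{
  if (memset_impl == NULL)
    memset_impl = resolve_memset_sketch ();
  return memset_impl (s, c, n);
}
```

Later patches in the series only need to extend the resolver with feature checks; callers of the public symbol stay unchanged.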
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 2 +- sysdeps/riscv/multiarch/ifunc-impl-list.c | 3 ++ sysdeps/riscv/multiarch/memset.c | 40 +++++++++++++++++++++++ sysdeps/riscv/multiarch/memset_generic.c | 32 ++++++++++++++++++ 4 files changed, 76 insertions(+), 1 deletion(-) create mode 100644 sysdeps/riscv/multiarch/memset.c create mode 100644 sysdeps/riscv/multiarch/memset_generic.c diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 68d3f5192f..453f0f4e4c 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -1,4 +1,4 @@ ifeq ($(subdir),string) sysdep_routines += \ - + memset_generic endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index c0cdca45fd..fd1752bc46 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -35,5 +35,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, size_t i = 0; + IFUNC_IMPL (i, name, memset, + IFUNC_IMPL_ADD (array, i, memset, 1, __memset_generic)) + return i; } diff --git a/sysdeps/riscv/multiarch/memset.c b/sysdeps/riscv/multiarch/memset.c new file mode 100644 index 0000000000..ae4289ab03 --- /dev/null +++ b/sysdeps/riscv/multiarch/memset.c @@ -0,0 +1,40 @@ +/* Multiple versions of memset. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +/* Redefine memset so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. */ +# undef memset +# define memset __redirect_memset +# include +# include +# include +# include + +extern __typeof (__redirect_memset) __libc_memset; +extern __typeof (__redirect_memset) __memset_generic attribute_hidden; + +libc_ifunc (__libc_memset, __memset_generic); + +# undef memset +strong_alias (__libc_memset, memset); +#else +# include +#endif diff --git a/sysdeps/riscv/multiarch/memset_generic.c b/sysdeps/riscv/multiarch/memset_generic.c new file mode 100644 index 0000000000..37acb398d4 --- /dev/null +++ b/sysdeps/riscv/multiarch/memset_generic.c @@ -0,0 +1,32 @@ +/* Memset for RISC-V, default version for internal use. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . 
*/ + +#include + +#define MEMSET __memset_generic + +#ifdef SHARED +# undef libc_hidden_builtin_def +# define libc_hidden_builtin_def(name) \ + __hidden_ver1(__memset_generic, __GI_memset, __memset_generic); +#endif + +extern void *__memset_generic(void *s, int c, size_t n); + +#include From patchwork Tue Feb 7 00:16:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738542 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=QgwARrkY; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kK8321Tz23jB for ; Tue, 7 Feb 2023 11:19:40 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 6576D38493D4 for ; Tue, 7 Feb 2023 00:19:38 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x32e.google.com (mail-wm1-x32e.google.com [IPv6:2a00:1450:4864:20::32e]) by sourceware.org (Postfix) with ESMTPS id 873A13858404 for ; Tue, 7 Feb 2023 00:16:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 873A13858404 Authentication-Results: sourceware.org; dmarc=none (p=none 
dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-wm1-x32e.google.com with SMTP id hn2-20020a05600ca38200b003dc5cb96d46so12012954wmb.4 for ; Mon, 06 Feb 2023 16:16:46 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=akMDhCyjoko+kBn9o1em0IPfdDjHcTURB9a5m+zkGVo=; b=QgwARrkYblP2JaqMXhFazCxa5FrhfxL0JV70qf4fUyuC02BXTrB81hzIxa85zYgLoh 29J2vYSARwE+mi47+PUYULNwrDAiFr5Y81h9O6nyLF4Xt9VW+jnLBrQPHPxWKm+U5deS YWkS9YGgG36IN8j1PcFMK8sw3jo++uPn//+LZ9uQoS3nDmlz0Ib1DMKgml7bLFBeyMJH bnuQAZ86xJLVPaNO2QxX8hix2ZYnqfWwJVpBK4IPTQTMmivg/mDQNBJ3ipA6g2C2onYQ IYo7yNJN3n2jhllhcjhNrCjAfm3JoC9Sda8IvvvqT0BikMurJcmYMBL3GLwNvyoJGySd 4J/A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=akMDhCyjoko+kBn9o1em0IPfdDjHcTURB9a5m+zkGVo=; b=eJyC+7vKhlsLF1ILUGP7p7qVaQgUT10exa0rAnUryUSs/qnYfilug9kRvSkKjDfaBG xEPgW+Ywxq7dehYxhaxxsznRlNs9qXFHkB3AJkaajyj3b+IYA+V+kiSTppwQ7i81Cg08 DLi5LKTZZNxprNKvtyLQ4+3G06FcmKzKJUeOvbTrpZCFUpAO2p+d7GMSoHzWVCRiD90y ERhzP2FO3UxAy0eK0ijcgynn585fR6UgGs4IXOIGdfcjK1j6rpa2REyZ7aMWElzhnpK0 15KivENgAKGAsS7JV8/mMb6Q4oYijwjwQK3tQlI8o2Yu7CUbgxnVaPJ3u4Re2IKVP74C 1+NQ== X-Gm-Message-State: AO0yUKVc6j2ZhkzV4H3A5PMOpb9KiJb9eQaVcw2/zhyNmynayWpA0wBD JItFmYlXZHJbkHhUBXVZghx1BIJY9x6q/4Ib X-Google-Smtp-Source: AK7set9VZqhJ/OLYwmf7vDGUlvIwwrJpGbRqgLXIE6ruQHckp0CKpjs2E3ioKUJ46l5WYT8J4PWYAQ== X-Received: by 2002:a05:600c:43d5:b0:3d9:e5d3:bf with SMTP id f21-20020a05600c43d500b003d9e5d300bfmr1280846wmn.32.1675729004827; Mon, 06 Feb 2023 16:16:44 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
[62.178.148.172]) by smtp.gmail.com with ESMTPSA id f1-20020a1cc901000000b003df14531724sm16862050wmb.21.2023.02.06.16.16.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 06 Feb 2023 16:16:44 -0800 (PST) From: Christoph Muellner To: libc-alpha@sourceware.org, Palmer Dabbelt , Darius Rad , Andrew Waterman , DJ Delorie , Vineet Gupta , Kito Cheng , Jeff Law , Philipp Tomsich , Heiko Stuebner Cc: =?utf-8?q?Christoph_M=C3=BCllner?= Subject: [RFC PATCH 10/19] riscv: Add accelerated memset routines for RV64 Date: Tue, 7 Feb 2023 01:16:09 +0100 Message-Id: <20230207001618.458947-11-christoph.muellner@vrull.eu> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu> References: <20230207001618.458947-1-christoph.muellner@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_NUMSUBJECT, KAM_SHORT, RCVD_IN_DNSWL_NONE, SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org Sender: "Libc-alpha" From: Christoph Müllner The implementation of memset() can be accelerated by loop unrolling, fast unaligned accesses and cbo.zero. Let's provide an implementation that uses all three, with cbo.zero being optional and only used for a block size of 64 bytes.
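The small-size path can be sketched in C as follows. It assumes cheap unaligned stores, replicates the fill byte into a 64-bit word, and covers 0..16 bytes with at most two possibly overlapping stores (one at the start, one at the end) instead of a byte loop. The function names are hypothetical, and the multiplication used for replication is a substitute for the shift-and-or sequence in the assembly, not a copy of it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replicate the low byte of C into all eight bytes of a 64-bit word.  */
static uint64_t
repeat_byte (int c)
{
  return (uint64_t) (unsigned char) c * UINT64_C (0x0101010101010101);
}

/* Unaligned store helpers.  memcpy with a constant size is the portable
   way to express an unaligned store in C.  */
static void
store64 (unsigned char *p, uint64_t v)
{
  memcpy (p, &v, 8);
}

static void
store32 (unsigned char *p, uint64_t v)
{
  uint32_t w = (uint32_t) v;
  memcpy (p, &w, 4);
}

/* Set N bytes at S to C, for N in 0..16 only (caller's responsibility).
   The two stores of each case may overlap, which is harmless because
   both write the same value.  */
void *
memset_upto16_sketch (void *s, int c, size_t n)
{
  unsigned char *dst = s;
  unsigned char *end = dst + n;
  uint64_t v = repeat_byte (c);

  if (n >= 8)
    {
      /* 8..16 bytes: first eight and last eight.  */
      store64 (dst, v);
      store64 (end - 8, v);
    }
  else if (n >= 4)
    {
      /* 4..7 bytes: first four and last four.  */
      store32 (dst, v);
      store32 (end - 4, v);
    }
  else if (n > 0)
    {
      /* 1..3 bytes: first byte, then the last two if present.  */
      dst[0] = (unsigned char) v;
      if (n >= 2)
        {
          end[-2] = (unsigned char) v;
          end[-1] = (unsigned char) v;
        }
    }
  return s;
}
```

The same start-plus-end idea is what lets the medium and long paths finish with a fixed block of stores relative to the end pointer rather than a tail loop.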
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 4 +- sysdeps/riscv/multiarch/ifunc-impl-list.c | 4 + sysdeps/riscv/multiarch/memset.c | 12 + .../riscv/multiarch/memset_rv64_unaligned.S | 31 +++ .../multiarch/memset_rv64_unaligned_cboz64.S | 217 ++++++++++++++++++ 5 files changed, 267 insertions(+), 1 deletion(-) create mode 100644 sysdeps/riscv/multiarch/memset_rv64_unaligned.S create mode 100644 sysdeps/riscv/multiarch/memset_rv64_unaligned_cboz64.S diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 453f0f4e4c..6e8ebb42d8 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -1,4 +1,6 @@ ifeq ($(subdir),string) sysdep_routines += \ - memset_generic + memset_generic \ + memset_rv64_unaligned \ + memset_rv64_unaligned_cboz64 endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index fd1752bc46..e878977b73 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -36,6 +36,10 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, size_t i = 0; IFUNC_IMPL (i, name, memset, +#if __riscv_xlen == 64 + IFUNC_IMPL_ADD (array, i, memset, 1, __memset_rv64_unaligned_cboz64) + IFUNC_IMPL_ADD (array, i, memset, 1, __memset_rv64_unaligned) +#endif IFUNC_IMPL_ADD (array, i, memset, 1, __memset_generic)) return i; diff --git a/sysdeps/riscv/multiarch/memset.c b/sysdeps/riscv/multiarch/memset.c index ae4289ab03..7ba10dd3da 100644 --- a/sysdeps/riscv/multiarch/memset.c +++ b/sysdeps/riscv/multiarch/memset.c @@ -31,7 +31,19 @@ extern __typeof (__redirect_memset) __libc_memset; extern __typeof (__redirect_memset) __memset_generic attribute_hidden; +#if __riscv_xlen == 64 +extern __typeof (__redirect_memset) __memset_rv64_unaligned_cboz64 attribute_hidden; +extern __typeof (__redirect_memset) __memset_rv64_unaligned attribute_hidden; + +libc_ifunc (__libc_memset, + 
(IS_RV64() && HAVE_FAST_UNALIGNED() && HAVE_RV(zicboz) && HAVE_CBOZ_BLOCKSIZE(64) + ? __memset_rv64_unaligned_cboz64 + : (IS_RV64() && HAVE_FAST_UNALIGNED() + ? __memset_rv64_unaligned + : __memset_generic))); +#else libc_ifunc (__libc_memset, __memset_generic); +#endif # undef memset strong_alias (__libc_memset, memset); diff --git a/sysdeps/riscv/multiarch/memset_rv64_unaligned.S b/sysdeps/riscv/multiarch/memset_rv64_unaligned.S new file mode 100644 index 0000000000..561e564b42 --- /dev/null +++ b/sysdeps/riscv/multiarch/memset_rv64_unaligned.S @@ -0,0 +1,31 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include + +#ifndef MEMSET +# define MEMSET __memset_rv64_unaligned +#endif + +#undef CBO_ZERO_THRESHOLD +#define CBO_ZERO_THRESHOLD 0 + +/* Assumptions: rv64i unaligned accesses. */ + +#include "./memset_rv64_unaligned_cboz64.S" diff --git a/sysdeps/riscv/multiarch/memset_rv64_unaligned_cboz64.S b/sysdeps/riscv/multiarch/memset_rv64_unaligned_cboz64.S new file mode 100644 index 0000000000..710bb41e44 --- /dev/null +++ b/sysdeps/riscv/multiarch/memset_rv64_unaligned_cboz64.S @@ -0,0 +1,217 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. 
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#if __riscv_xlen == 64 + +#include +#include + +#define dstin a0 +#define val a1 +#define count a2 +#define dst a3 +#define dstend a4 +#define tmp1 a5 + +#ifndef MEMSET +# define MEMSET __memset_rv64_unaligned_cboz64 +#endif + +/* cbo.zero can be used to improve the performance of memset-zero. + * However, the performance gain depends on the amount of data + * to be cleared. This threshold sets the minimum number + * of bytes for which the cbo.zero loop is used. + * To disable cbo.zero, set this threshold to 0. */ +#ifndef CBO_ZERO_THRESHOLD +# define CBO_ZERO_THRESHOLD 128 +#endif + +/* Assumptions: + * rv64i_zicboz, 64 byte cbo.zero block size, unaligned accesses. */ + +ENTRY_ALIGN (MEMSET, 6) + + /* Repeat the byte. */ + slli tmp1, val, 8 + or val, tmp1, val + slli tmp1, val, 16 + or val, tmp1, val + slli tmp1, val, 32 + or val, tmp1, val + + /* Calculate the end position. */ + add dstend, dstin, count + + /* Decide how to process. */ + li tmp1, 96 + bgtu count, tmp1, L(set_long) + li tmp1, 16 + bgtu count, tmp1, L(set_medium) + + /* Set 0..16 bytes. */ + li tmp1, 8 + bltu count, tmp1, 1f + /* Set 8..16 bytes. */ + sd val, 0(dstin) + sd val, -8(dstend) + ret + + .p2align 3 + /* Set 0..7 bytes. */ +1: li tmp1, 4 + bltu count, tmp1, 2f + /* Set 4..7 bytes. */ + sw val, 0(dstin) + sw val, -4(dstend) + ret + + /* Set 0..3 bytes.
*/ +2: beqz count, 3f + sb val, 0(dstin) + li tmp1, 2 + bltu count, tmp1, 3f + sh val, -2(dstend) +3: ret + + .p2align 3 + /* Set 17..96 bytes. */ +L(set_medium): + sd val, 0(dstin) + sd val, 8(dstin) + li tmp1, 64 + bgtu count, tmp1, L(set96) + sd val, -16(dstend) + sd val, -8(dstend) + li tmp1, 32 + bleu count, tmp1, 1f + sd val, 16(dstin) + sd val, 24(dstin) + sd val, -32(dstend) + sd val, -24(dstend) +1: ret + + .p2align 4 + /* Set 65..96 bytes. Write 64 bytes from the start and + 32 bytes from the end. */ +L(set96): + sd val, 16(dstin) + sd val, 24(dstin) + sd val, 32(dstin) + sd val, 40(dstin) + sd val, 48(dstin) + sd val, 56(dstin) + sd val, -32(dstend) + sd val, -24(dstend) + sd val, -16(dstend) + sd val, -8(dstend) + ret + + .p2align 4 + /* Set 97+ bytes. */ +L(set_long): + /* Store 16 bytes unaligned. */ + sd val, 0(dstin) + sd val, 8(dstin) + +#if CBO_ZERO_THRESHOLD + li tmp1, CBO_ZERO_THRESHOLD + blt count, tmp1, 1f + beqz val, L(cbo_zero_64) +1: +#endif + + /* Round down to the previous 16 byte boundary (keep offset of 16). */ + andi dst, dstin, -16 + + /* Calculate loop termination position. */ + addi tmp1, dstend, -(16+64) + + /* Store 64 bytes in a loop. */ + .p2align 4 +1: sd val, 16(dst) + sd val, 24(dst) + sd val, 32(dst) + sd val, 40(dst) + sd val, 48(dst) + sd val, 56(dst) + sd val, 64(dst) + sd val, 72(dst) + addi dst, dst, 64 + bltu dst, tmp1, 1b + + /* Calculate remainder (dst is 16 bytes behind the next byte to set). */ + sub count, dstend, dst + + /* Check if we have more than 32 bytes left to set. */ + li tmp1, (32+16) + ble count, tmp1, 1f + sd val, 16(dst) + sd val, 24(dst) + sd val, 32(dst) + sd val, 40(dst) +1: sd val, -32(dstend) + sd val, -24(dstend) + sd val, -16(dstend) + sd val, -8(dstend) + ret + +#if CBO_ZERO_THRESHOLD + .option push + .option arch,+zicboz + .p2align 3 +L(cbo_zero_64): + /* Set the remaining 48 of the first 64 bytes (16 are already set).
*/
+        sd val, 16(dstin)
+        sd val, 24(dstin)
+        sd val, 32(dstin)
+        sd val, 40(dstin)
+        sd val, 48(dstin)
+        sd val, 56(dstin)
+
+        /* Round up to the next 64 byte boundary. */
+        andi dst, dstin, -64
+        addi dst, dst, 64
+
+        /* Calculate loop termination position. */
+        addi tmp1, dstend, -64
+
+        /* cbo.zero sets 64 bytes each time. */
+        .p2align 4
+1:      cbo.zero (dst)
+        addi dst, dst, 64
+        bltu dst, tmp1, 1b
+
+        sub count, dstend, dst
+        li tmp1, 32
+        ble count, tmp1, 1f
+        sd val, 0(dst)
+        sd val, 8(dst)
+        sd val, 16(dst)
+        sd val, 24(dst)
+1:      sd val, -32(dstend)
+        sd val, -24(dstend)
+        sd val, -16(dstend)
+        sd val, -8(dstend)
+        ret
+        .option pop
+#endif /* CBO_ZERO_THRESHOLD */
+
+END (MEMSET)
+libc_hidden_builtin_def (MEMSET)
+
+#endif /* __riscv_xlen == 64 */

From patchwork Tue Feb 7 00:16:10 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 1738540
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 11/19] riscv: Add ifunc support for memcpy/memmove
Date: Tue, 7 Feb 2023 01:16:10 +0100
Message-Id: <20230207001618.458947-12-christoph.muellner@vrull.eu>
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This patch adds ifunc support for calls to memcpy() and memmove() to the RISC-V code.
No optimized code is added as part of this patch. Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 2 ++ sysdeps/riscv/multiarch/ifunc-impl-list.c | 6 ++++ sysdeps/riscv/multiarch/memcpy.c | 40 +++++++++++++++++++++++ sysdeps/riscv/multiarch/memcpy_generic.c | 32 ++++++++++++++++++ sysdeps/riscv/multiarch/memmove.c | 40 +++++++++++++++++++++++ sysdeps/riscv/multiarch/memmove_generic.c | 32 ++++++++++++++++++ 6 files changed, 152 insertions(+) create mode 100644 sysdeps/riscv/multiarch/memcpy.c create mode 100644 sysdeps/riscv/multiarch/memcpy_generic.c create mode 100644 sysdeps/riscv/multiarch/memmove.c create mode 100644 sysdeps/riscv/multiarch/memmove_generic.c diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 6e8ebb42d8..6bc20c4fe0 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -1,5 +1,7 @@ ifeq ($(subdir),string) sysdep_routines += \ + memcpy_generic \ + memmove_generic \ memset_generic \ memset_rv64_unaligned \ memset_rv64_unaligned_cboz64 diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index e878977b73..16e4d7137f 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -35,6 +35,12 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, size_t i = 0; + IFUNC_IMPL (i, name, memcpy, + IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic)) + + IFUNC_IMPL (i, name, memmove, + IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_generic)) + IFUNC_IMPL (i, name, memset, #if __riscv_xlen == 64 IFUNC_IMPL_ADD (array, i, memset, 1, __memset_rv64_unaligned_cboz64) diff --git a/sysdeps/riscv/multiarch/memcpy.c b/sysdeps/riscv/multiarch/memcpy.c new file mode 100644 index 0000000000..cc9185912a --- /dev/null +++ b/sysdeps/riscv/multiarch/memcpy.c @@ -0,0 +1,40 @@ +/* Multiple versions of memcpy. RISC-V version. 
+ Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +/* Redefine memcpy so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. */ +# undef memcpy +# define memcpy __redirect_memcpy +# include +# include +# include +# include + +extern __typeof (__redirect_memcpy) __libc_memcpy; +extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden; + +libc_ifunc (__libc_memcpy, __memcpy_generic); + +# undef memcpy +strong_alias (__libc_memcpy, memcpy); +#else +# include +#endif diff --git a/sysdeps/riscv/multiarch/memcpy_generic.c b/sysdeps/riscv/multiarch/memcpy_generic.c new file mode 100644 index 0000000000..fb46fe7622 --- /dev/null +++ b/sysdeps/riscv/multiarch/memcpy_generic.c @@ -0,0 +1,32 @@ +/* Memcpy for RISC-V, default version for internal use. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include + +#define MEMCPY __memcpy_generic + +#ifdef SHARED +# undef libc_hidden_builtin_def +# define libc_hidden_builtin_def(name) \ + __hidden_ver1(__memcpy_generic, __GI_memcpy, __memcpy_generic); +#endif + +extern void *__memcpy_generic(void *dest, const void *src, size_t n); + +#include diff --git a/sysdeps/riscv/multiarch/memmove.c b/sysdeps/riscv/multiarch/memmove.c new file mode 100644 index 0000000000..581a8327d6 --- /dev/null +++ b/sysdeps/riscv/multiarch/memmove.c @@ -0,0 +1,40 @@ +/* Multiple versions of memmove. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +/* Redefine memmove so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. 
*/ +# undef memmove +# define memmove __redirect_memmove +# include +# include +# include +# include + +extern __typeof (__redirect_memmove) __libc_memmove; +extern __typeof (__redirect_memmove) __memmove_generic attribute_hidden; + +libc_ifunc (__libc_memmove, __memmove_generic); + +# undef memmove +strong_alias (__libc_memmove, memmove); +#else +# include +#endif diff --git a/sysdeps/riscv/multiarch/memmove_generic.c b/sysdeps/riscv/multiarch/memmove_generic.c new file mode 100644 index 0000000000..4a9e83c13c --- /dev/null +++ b/sysdeps/riscv/multiarch/memmove_generic.c @@ -0,0 +1,32 @@ +/* Memmove for RISC-V, default version for internal use. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . 
*/
+
+#include
+
+#define MEMMOVE __memmove_generic
+
+#ifdef SHARED
+# undef libc_hidden_builtin_def
+# define libc_hidden_builtin_def(name) \
+  __hidden_ver1(__memmove_generic, __GI_memmove, __memmove_generic);
+#endif
+
+extern void *__memmove_generic(void *dest, const void *src, size_t n);
+
+#include

From patchwork Tue Feb 7 00:16:11 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 1738539
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 12/19] riscv: Add accelerated memcpy/memmove routines for RV64
Date: Tue, 7 Feb 2023 01:16:11 +0100
Message-Id: <20230207001618.458947-13-christoph.muellner@vrull.eu>
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The implementation of memcpy()/memmove() can be accelerated by loop unrolling and fast unaligned accesses. Let's provide an implementation that is optimized accordingly.
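As a reading aid for the assembly in this patch, here is a minimal C sketch of two ideas it relies on: covering a 9-16 byte copy with two 8-byte accesses that may overlap, and picking the memmove() copy direction with a single unsigned comparison. The helper names (copy9_16, must_copy_backward, self_check) are illustrative only and do not appear in the patch; memcpy() with a fixed size of 8 stands in for the unaligned ld/sd instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy 9..16 bytes with two 8-byte accesses that may overlap.
   Both loads are done before either store, so the result is also
   correct when source and destination overlap.  */
static void copy9_16(unsigned char *dst, const unsigned char *src, size_t count)
{
    uint64_t head, tail;
    memcpy(&head, src, 8);               /* ld A_l, 0(src)     */
    memcpy(&tail, src + count - 8, 8);   /* ld A_h, -8(srcend) */
    memcpy(dst, &head, 8);               /* sd A_l, 0(dst)     */
    memcpy(dst + count - 8, &tail, 8);   /* sd A_h, -8(dstend) */
}

/* Return 1 if a move of COUNT bytes has to run from back to front.
   dst - src is taken as an unsigned value: for dst < src it wraps
   around to a very large number, so "diff >= count" selects the
   forward copy both for dst < src and for non-overlapping buffers.
   A difference of 0 needs no copy at all.  */
static int must_copy_backward(const void *dst, const void *src, size_t count)
{
    uintptr_t diff = (uintptr_t)dst - (uintptr_t)src;
    return diff != 0 && diff < count;
}

/* Small self-check: overlapping 12-byte move from buf+0 to buf+4.  */
static void self_check(void)
{
    unsigned char buf[32];
    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] = (unsigned char)i;

    assert(must_copy_backward(buf + 4, buf, 12));
    assert(!must_copy_backward(buf, buf + 4, 12));

    copy9_16(buf + 4, buf, 12);
    for (size_t i = 0; i < 12; i++)
        assert(buf[4 + i] == i);
}
```

The same load-everything-then-store pattern is what lets the short paths (up to 96 bytes) be shared between memcpy() and memmove() in the assembly: no direction decision is needed there.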
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 2 + sysdeps/riscv/multiarch/ifunc-impl-list.c | 6 + sysdeps/riscv/multiarch/memcpy.c | 9 + .../riscv/multiarch/memcpy_rv64_unaligned.S | 475 ++++++++++++++++++ sysdeps/riscv/multiarch/memmove.c | 9 + 5 files changed, 501 insertions(+) create mode 100644 sysdeps/riscv/multiarch/memcpy_rv64_unaligned.S diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 6bc20c4fe0..b08d7d1c8b 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -2,6 +2,8 @@ ifeq ($(subdir),string) sysdep_routines += \ memcpy_generic \ memmove_generic \ + memcpy_rv64_unaligned \ + \ memset_generic \ memset_rv64_unaligned \ memset_rv64_unaligned_cboz64 diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index 16e4d7137f..84b3eb25a4 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -36,9 +36,15 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, size_t i = 0; IFUNC_IMPL (i, name, memcpy, +#if __riscv_xlen == 64 + IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_rv64_unaligned) +#endif IFUNC_IMPL_ADD (array, i, memcpy, 1, __memcpy_generic)) IFUNC_IMPL (i, name, memmove, +#if __riscv_xlen == 64 + IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_rv64_unaligned) +#endif IFUNC_IMPL_ADD (array, i, memmove, 1, __memmove_generic)) IFUNC_IMPL (i, name, memset, diff --git a/sysdeps/riscv/multiarch/memcpy.c b/sysdeps/riscv/multiarch/memcpy.c index cc9185912a..68ac9bbe35 100644 --- a/sysdeps/riscv/multiarch/memcpy.c +++ b/sysdeps/riscv/multiarch/memcpy.c @@ -31,7 +31,16 @@ extern __typeof (__redirect_memcpy) __libc_memcpy; extern __typeof (__redirect_memcpy) __memcpy_generic attribute_hidden; +#if __riscv_xlen == 64 +extern __typeof (__redirect_memcpy) __memcpy_rv64_unaligned attribute_hidden; + +libc_ifunc (__libc_memcpy, + (IS_RV64() && 
HAVE_FAST_UNALIGNED() + ? __memcpy_rv64_unaligned + : __memcpy_generic)); +#else libc_ifunc (__libc_memcpy, __memcpy_generic); +#endif # undef memcpy strong_alias (__libc_memcpy, memcpy); diff --git a/sysdeps/riscv/multiarch/memcpy_rv64_unaligned.S b/sysdeps/riscv/multiarch/memcpy_rv64_unaligned.S new file mode 100644 index 0000000000..372cd0baea --- /dev/null +++ b/sysdeps/riscv/multiarch/memcpy_rv64_unaligned.S @@ -0,0 +1,475 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#if __riscv_xlen == 64 + +#include +#include + +#define dst a0 +#define src a1 +#define count a2 +#define srcend a3 +#define dstend a4 +#define tmp1 a5 +#define dst2 t6 + +#define A_l a6 +#define A_h a7 +#define B_l t0 +#define B_h t1 +#define C_l t2 +#define C_h t3 +#define D_l t4 +#define D_h t5 +#define E_l tmp1 +#define E_h count +#define F_l dst2 +#define F_h srcend + +#ifndef MEMCPY +# define MEMCPY __memcpy_rv64_unaligned +#endif + +#ifndef MEMMOVE +# define MEMMOVE __memmove_rv64_unaligned +#endif + +#ifndef COPY97_128 +# define COPY97_128 1 +#endif + +/* Assumptions: rv64i, unaligned accesses. */ + +/* memcpy/memmove is implemented by unrolling copy loops. 
+ We have two strategies: + 1) copy from front/start to back/end ("forward") + 2) copy from back/end to front/start ("backward") + In case of memcpy(), the strategy does not matter for correctness. + For memmove() and overlapping buffers we need to use the following strategy: + if dst < src && src-dst < count -> copy from front to back + if src < dst && dst-src < count -> copy from back to front */ + +ENTRY_ALIGN (MEMCPY, 6) + /* Calculate the end position. */ + add srcend, src, count + add dstend, dst, count + + /* Decide how to process. */ + li tmp1, 96 + bgtu count, tmp1, L(copy_long_forward) + li tmp1, 32 + bgtu count, tmp1, L(copy33_96) + li tmp1, 16 + bleu count, tmp1, L(copy0_16) + + /* Copy 17-32 bytes. */ + ld A_l, 0(src) + ld A_h, 8(src) + ld B_l, -16(srcend) + ld B_h, -8(srcend) + sd A_l, 0(dst) + sd A_h, 8(dst) + sd B_l, -16(dstend) + sd B_h, -8(dstend) + ret + +L(copy0_16): + li tmp1, 8 + bleu count, tmp1, L(copy0_8) + /* Copy 9-16 bytes. */ + ld A_l, 0(src) + ld A_h, -8(srcend) + sd A_l, 0(dst) + sd A_h, -8(dstend) + ret + + .p2align 3 +L(copy0_8): + li tmp1, 4 + bleu count, tmp1, L(copy0_4) + /* Copy 5-8 bytes. */ + lw A_l, 0(src) + lw B_l, -4(srcend) + sw A_l, 0(dst) + sw B_l, -4(dstend) + ret + +L(copy0_4): + li tmp1, 2 + bleu count, tmp1, L(copy0_2) + /* Copy 3-4 bytes. */ + lh A_l, 0(src) + lh B_l, -2(srcend) + sh A_l, 0(dst) + sh B_l, -2(dstend) + ret + +L(copy0_2): + li tmp1, 1 + bleu count, tmp1, L(copy0_1) + /* Copy 2 bytes. */ + lh A_l, 0(src) + sh A_l, 0(dst) + ret + +L(copy0_1): + beqz count, L(copy0) + /* Copy 1 byte. */ + lb A_l, 0(src) + sb A_l, 0(dst) +L(copy0): + ret + + .p2align 4 +L(copy33_96): + /* Copy 33-96 bytes. 
*/ + ld A_l, 0(src) + ld A_h, 8(src) + ld B_l, 16(src) + ld B_h, 24(src) + ld C_l, -32(srcend) + ld C_h, -24(srcend) + ld D_l, -16(srcend) + ld D_h, -8(srcend) + + li tmp1, 64 + bgtu count, tmp1, L(copy65_96_preloaded) + + sd A_l, 0(dst) + sd A_h, 8(dst) + sd B_l, 16(dst) + sd B_h, 24(dst) + sd C_l, -32(dstend) + sd C_h, -24(dstend) + sd D_l, -16(dstend) + sd D_h, -8(dstend) + ret + + .p2align 4 +L(copy65_96_preloaded): + /* Copy 65-96 bytes with pre-loaded A, B, C and D. */ + ld E_l, 32(src) + ld E_h, 40(src) + ld F_l, 48(src) /* dst2 will be overwritten. */ + ld F_h, 56(src) /* srcend will be overwritten. */ + + sd A_l, 0(dst) + sd A_h, 8(dst) + sd B_l, 16(dst) + sd B_h, 24(dst) + sd E_l, 32(dst) + sd E_h, 40(dst) + sd F_l, 48(dst) + sd F_h, 56(dst) + sd C_l, -32(dstend) + sd C_h, -24(dstend) + sd D_l, -16(dstend) + sd D_h, -8(dstend) + ret + +#ifdef COPY97_128 + .p2align 4 +L(copy97_128_forward): + /* Copy 97-128 bytes from front to back. */ + ld A_l, 0(src) + ld A_h, 8(src) + ld B_l, 16(src) + ld B_h, 24(src) + ld C_l, -16(srcend) + ld C_h, -8(srcend) + ld D_l, -32(srcend) + ld D_h, -24(srcend) + ld E_l, -48(srcend) + ld E_h, -40(srcend) + ld F_l, -64(srcend) /* dst2 will be overwritten. */ + ld F_h, -56(srcend) /* srcend will be overwritten. */ + + sd A_l, 0(dst) + sd A_h, 8(dst) + ld A_l, 32(src) + ld A_h, 40(src) + sd B_l, 16(dst) + sd B_h, 24(dst) + ld B_l, 48(src) + ld B_h, 56(src) + + sd C_l, -16(dstend) + sd C_h, -8(dstend) + sd D_l, -32(dstend) + sd D_h, -24(dstend) + sd E_l, -48(dstend) + sd E_h, -40(dstend) + sd F_l, -64(dstend) + sd F_h, -56(dstend) + + sd A_l, 32(dst) + sd A_h, 40(dst) + sd B_l, 48(dst) + sd B_h, 56(dst) + ret +#endif + + .p2align 4 + /* Copy 97+ bytes from front to back. */ +L(copy_long_forward): +#ifdef COPY97_128 + /* Avoid loop if possible. */ + li tmp1, 128 + ble count, tmp1, L(copy97_128_forward) +#endif + + /* Copy 16 bytes and then align dst to 16-byte alignment. 
*/ + ld D_l, 0(src) + ld D_h, 8(src) + + /* Round down to the previous 16 byte boundary (keep offset of 16). */ + andi tmp1, dst, 15 + andi dst2, dst, -16 + sub src, src, tmp1 + + ld A_l, 16(src) + ld A_h, 24(src) + sd D_l, 0(dst) + sd D_h, 8(dst) + ld B_l, 32(src) + ld B_h, 40(src) + ld C_l, 48(src) + ld C_h, 56(src) + ld D_l, 64(src) + ld D_h, 72(src) + addi src, src, 64 + + /* Calculate loop termination position. */ + addi tmp1, dstend, -(16+128) + bgeu dst2, tmp1, L(copy64_from_end) + + /* Store 64 bytes in a loop. */ + .p2align 4 +L(loop64_forward): + addi src, src, 64 + sd A_l, 16(dst2) + sd A_h, 24(dst2) + ld A_l, -48(src) + ld A_h, -40(src) + sd B_l, 32(dst2) + sd B_h, 40(dst2) + ld B_l, -32(src) + ld B_h, -24(src) + sd C_l, 48(dst2) + sd C_h, 56(dst2) + ld C_l, -16(src) + ld C_h, -8(src) + sd D_l, 64(dst2) + sd D_h, 72(dst2) + ld D_l, 0(src) + ld D_h, 8(src) + addi dst2, dst2, 64 + bltu dst2, tmp1, L(loop64_forward) + +L(copy64_from_end): + ld E_l, -64(srcend) + ld E_h, -56(srcend) + sd A_l, 16(dst2) + sd A_h, 24(dst2) + ld A_l, -48(srcend) + ld A_h, -40(srcend) + sd B_l, 32(dst2) + sd B_h, 40(dst2) + ld B_l, -32(srcend) + ld B_h, -24(srcend) + sd C_l, 48(dst2) + sd C_h, 56(dst2) + ld C_l, -16(srcend) + ld C_h, -8(srcend) + sd D_l, 64(dst2) + sd D_h, 72(dst2) + sd E_l, -64(dstend) + sd E_h, -56(dstend) + sd A_l, -48(dstend) + sd A_h, -40(dstend) + sd B_l, -32(dstend) + sd B_h, -24(dstend) + sd C_l, -16(dstend) + sd C_h, -8(dstend) + ret + +END (MEMCPY) +libc_hidden_builtin_def (MEMCPY) + +ENTRY_ALIGN (MEMMOVE, 6) + /* Calculate the end position. */ + add srcend, src, count + add dstend, dst, count + + /* Decide how to process. */ + li tmp1, 96 + bgtu count, tmp1, L(move_long) + li tmp1, 32 + bgtu count, tmp1, L(copy33_96) + li tmp1, 16 + bleu count, tmp1, L(copy0_16) + + /* Copy 17-32 bytes. 
*/ + ld A_l, 0(src) + ld A_h, 8(src) + ld B_l, -16(srcend) + ld B_h, -8(srcend) + sd A_l, 0(dst) + sd A_h, 8(dst) + sd B_l, -16(dstend) + sd B_h, -8(dstend) + ret + +#ifdef COPY97_128 + .p2align 4 +L(copy97_128_backward): + /* Copy 97-128 bytes from back to front. */ + ld A_l, -16(srcend) + ld A_h, -8(srcend) + ld B_l, -32(srcend) + ld B_h, -24(srcend) + ld C_l, -48(srcend) + ld C_h, -40(srcend) + ld D_l, -64(srcend) + ld D_h, -56(srcend) + ld E_l, -80(srcend) + ld E_h, -72(srcend) + ld F_l, -96(srcend) /* dst2 will be overwritten. */ + ld F_h, -88(srcend) /* srcend will be overwritten. */ + + sd A_l, -16(dstend) + sd A_h, -8(dstend) + ld A_l, 16(src) + ld A_h, 24(src) + sd B_l, -32(dstend) + sd B_h, -24(dstend) + ld B_l, 0(src) + ld B_h, 8(src) + + sd C_l, -48(dstend) + sd C_h, -40(dstend) + sd D_l, -64(dstend) + sd D_h, -56(dstend) + sd E_l, -80(dstend) + sd E_h, -72(dstend) + sd F_l, -96(dstend) + sd F_h, -88(dstend) + + sd A_l, 16(dst) + sd A_h, 24(dst) + sd B_l, 0(dst) + sd B_h, 8(dst) + ret +#endif + + .p2align 4 + /* Copy 97+ bytes. */ +L(move_long): + /* dst-src is positive if src < dst. + In this case we must copy forward if dst-src >= count. + If dst-src is negative, then we can interpret the difference + as unsigned value to enforce dst-src >= count as well. */ + sub tmp1, dst, src + beqz tmp1, L(copy0) + bgeu tmp1, count, L(copy_long_forward) + +#ifdef COPY97_128 + /* Avoid loop if possible. */ + li tmp1, 128 + ble count, tmp1, L(copy97_128_backward) +#endif + + /* Copy 16 bytes and then align dst to 16-byte alignment. */ + ld D_l, -16(srcend) + ld D_h, -8(srcend) + + /* Round down to the previous 16 byte boundary (keep offset of 16). 
*/ + andi tmp1, dstend, 15 + sub srcend, srcend, tmp1 + + ld A_l, -16(srcend) + ld A_h, -8(srcend) + ld B_l, -32(srcend) + ld B_h, -24(srcend) + ld C_l, -48(srcend) + ld C_h, -40(srcend) + sd D_l, -16(dstend) + sd D_h, -8(dstend) + ld D_l, -64(srcend) + ld D_h, -56(srcend) + andi dstend, dstend, -16 + + /* Calculate loop termination position. */ + addi tmp1, dst, 128 + bleu dstend, tmp1, L(copy64_from_start) + + /* Store 64 bytes in a loop. */ + .p2align 4 +L(loop64_backward): + addi srcend, srcend, -64 + sd A_l, -16(dstend) + sd A_h, -8(dstend) + ld A_l, -16(srcend) + ld A_h, -8(srcend) + sd B_l, -32(dstend) + sd B_h, -24(dstend) + ld B_l, -32(srcend) + ld B_h, -24(srcend) + sd C_l, -48(dstend) + sd C_h, -40(dstend) + ld C_l, -48(srcend) + ld C_h, -40(srcend) + sd D_l, -64(dstend) + sd D_h, -56(dstend) + ld D_l, -64(srcend) + ld D_h, -56(srcend) + addi dstend, dstend, -64 + bgtu dstend, tmp1, L(loop64_backward) + +L(copy64_from_start): + ld E_l, 48(src) + ld E_h, 56(src) + sd A_l, -16(dstend) + sd A_h, -8(dstend) + ld A_l, 32(src) + ld A_h, 40(src) + sd B_l, -32(dstend) + sd B_h, -24(dstend) + ld B_l, 16(src) + ld B_h, 24(src) + sd C_l, -48(dstend) + sd C_h, -40(dstend) + ld C_l, 0(src) + ld C_h, 8(src) + sd D_l, -64(dstend) + sd D_h, -56(dstend) + sd E_l, 48(dst) + sd E_h, 56(dst) + sd A_l, 32(dst) + sd A_h, 40(dst) + sd B_l, 16(dst) + sd B_h, 24(dst) + sd C_l, 0(dst) + sd C_h, 8(dst) + ret + +END (MEMMOVE) +libc_hidden_builtin_def (MEMMOVE) + +#endif /* __riscv_xlen == 64 */ diff --git a/sysdeps/riscv/multiarch/memmove.c b/sysdeps/riscv/multiarch/memmove.c index 581a8327d6..b446a9e036 100644 --- a/sysdeps/riscv/multiarch/memmove.c +++ b/sysdeps/riscv/multiarch/memmove.c @@ -31,7 +31,16 @@ extern __typeof (__redirect_memmove) __libc_memmove; extern __typeof (__redirect_memmove) __memmove_generic attribute_hidden; +#if __riscv_xlen == 64 +extern __typeof (__redirect_memmove) __memmove_rv64_unaligned attribute_hidden; + +libc_ifunc (__libc_memmove, + (IS_RV64() && 
HAVE_FAST_UNALIGNED()
+              ? __memmove_rv64_unaligned
+              : __memmove_generic));
+#else
 libc_ifunc (__libc_memmove, __memmove_generic);
+#endif

 # undef memmove
 strong_alias (__libc_memmove, memmove);

From patchwork Tue Feb 7 00:16:12 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 1738543
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 13/19] riscv: Add ifunc support for strlen
Date: Tue, 7 Feb 2023 01:16:12 +0100
Message-Id: <20230207001618.458947-14-christoph.muellner@vrull.eu>
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This patch adds ifunc support for calls to strlen to the RISC-V code. No optimized code is added as part of this patch.
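For readers unfamiliar with the mechanism, the selection idea can be sketched in portable C with an ordinary function pointer. This is only an analogy under stated assumptions: all names below (strlen_generic_sketch, resolve_strlen, my_strlen, check_dispatch) are hypothetical, and the real libc_ifunc machinery has the dynamic linker call the selector when it resolves the symbol, rather than caching a pointer on first use as done here.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*strlen_fn)(const char *);

/* Stand-in for the baseline implementation; in the patch this role
   is played by __strlen_generic.  */
static size_t strlen_generic_sketch(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Hypothetical selector.  With a single candidate it always returns
   the generic version; an optimized variant could be chosen here
   after checking CPU features.  */
static strlen_fn resolve_strlen(void)
{
    return strlen_generic_sketch;
}

/* Dispatching wrapper: resolve once, then reuse the cached pointer.  */
static size_t my_strlen(const char *s)
{
    static strlen_fn impl;
    if (impl == NULL)
        impl = resolve_strlen();
    return impl(s);
}

/* Small self-check of the dispatch path.  */
static void check_dispatch(void)
{
    assert(my_strlen("") == 0);
    assert(my_strlen("glibc") == 5);
}
```

With only the generic candidate registered, the dispatch changes nothing observable; its value is that a later optimized routine can be added in the selector without touching callers.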
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 4 ++- sysdeps/riscv/multiarch/ifunc-impl-list.c | 4 +++ sysdeps/riscv/multiarch/strlen.c | 40 +++++++++++++++++++++++ sysdeps/riscv/multiarch/strlen_generic.c | 32 ++++++++++++++++++ 4 files changed, 79 insertions(+), 1 deletion(-) create mode 100644 sysdeps/riscv/multiarch/strlen.c create mode 100644 sysdeps/riscv/multiarch/strlen_generic.c diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index b08d7d1c8b..8e2b020233 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -6,5 +6,7 @@ sysdep_routines += \ \ memset_generic \ memset_rv64_unaligned \ - memset_rv64_unaligned_cboz64 + memset_rv64_unaligned_cboz64 \ + \ + strlen_generic endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index 84b3eb25a4..f848fc8401 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -54,5 +54,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, #endif IFUNC_IMPL_ADD (array, i, memset, 1, __memset_generic)) + IFUNC_IMPL (i, name, strlen, + IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_generic)) + + return i; } diff --git a/sysdeps/riscv/multiarch/strlen.c b/sysdeps/riscv/multiarch/strlen.c new file mode 100644 index 0000000000..85f7a91c9f --- /dev/null +++ b/sysdeps/riscv/multiarch/strlen.c @@ -0,0 +1,40 @@ +/* Multiple versions of strlen. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +/* Redefine strlen so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. */ +# undef strlen +# define strlen __redirect_strlen +# include +# include +# include +# include + +extern __typeof (__redirect_strlen) __libc_strlen; +extern __typeof (__redirect_strlen) __strlen_generic attribute_hidden; + +libc_ifunc (__libc_strlen, __strlen_generic); + +# undef strlen +strong_alias (__libc_strlen, strlen); +#else +# include +#endif diff --git a/sysdeps/riscv/multiarch/strlen_generic.c b/sysdeps/riscv/multiarch/strlen_generic.c new file mode 100644 index 0000000000..10aa05e699 --- /dev/null +++ b/sysdeps/riscv/multiarch/strlen_generic.c @@ -0,0 +1,32 @@ +/* strlen for RISC-V, default version for internal use. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . 
*/ + +#include + +#define STRLEN __strlen_generic + +#ifdef SHARED +# undef libc_hidden_builtin_def +# define libc_hidden_builtin_def(name) \ + __hidden_ver1(__strlen_generic, __GI_strlen, __strlen_generic); +#endif + +extern size_t __strlen_generic(const char *str); + +#include From patchwork Tue Feb 7 00:16:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738548 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=jtfz4UJ5; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kLg4sdDz23y5 for ; Tue, 7 Feb 2023 11:20:59 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A34423881D35 for ; Tue, 7 Feb 2023 00:20:57 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x32b.google.com (mail-wm1-x32b.google.com [IPv6:2a00:1450:4864:20::32b]) by sourceware.org (Postfix) with ESMTPS id 634A43858431 for ; Tue, 7 Feb 2023 00:16:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 634A43858431 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) 
header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 14/19] riscv: Add accelerated strlen routine
Date: Tue, 7 Feb 2023 01:16:13 +0100
Message-Id: <20230207001618.458947-15-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The implementation of strlen() can be accelerated using Zbb's orc.b instruction. Let's add an implementation that uses it. The implementation is taken from the Bitmanip specification.
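The core trick can be modelled in portable C. The sketch below emulates orc.b in software, inverts the result so that NUL bytes become 0xff, and counts trailing zero bits to locate the first NUL in a little-endian word. The names orc_b, first_nul_le and strlen_words_sketch are hypothetical, and the sketch assumes a buffer whose size is a multiple of 8 and which contains a NUL; the assembly routine instead loads aligned words directly and handles a partial first word with a shift.

```c
#include <stddef.h>
#include <stdint.h>

/* Software model of Zbb's orc.b: every non-zero byte becomes 0xff,
   every zero byte stays 0x00.  */
static uint64_t
orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Index of the first NUL byte in a little-endian word, or 8 if the
   word contains no NUL byte.  */
static unsigned
first_nul_le (uint64_t word)
{
  uint64_t nul_mask = ~orc_b (word);   /* 0xff exactly where a NUL was.  */
  if (nul_mask == 0)
    return 8;
  unsigned bit = 0;                    /* Portable count-trailing-zeros.  */
  while (((nul_mask >> bit) & 1) == 0)
    bit++;
  return bit >> 3;                     /* Bits to bytes.  */
}

/* Hypothetical helper: length of a string held in a buffer whose size
   is a multiple of 8.  Words are assembled byte by byte so the sketch
   does not depend on host endianness or alignment.  */
static size_t
strlen_words_sketch (const char *buf, size_t buf_size)
{
  size_t len = 0;
  for (size_t off = 0; off + 8 <= buf_size; off += 8)
    {
      uint64_t word = 0;
      for (int i = 0; i < 8; i++)
        word |= (uint64_t) (unsigned char) buf[off + i] << (8 * i);
      unsigned n = first_nul_le (word);
      len += n;
      if (n < 8)
        return len;
    }
  return len;
}
```

On hardware with Zbb, orc_b and the bit scan each collapse to a single instruction (orc.b and ctz), which is what keeps the per-word work so small.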
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 3 +- sysdeps/riscv/multiarch/ifunc-impl-list.c | 1 + sysdeps/riscv/multiarch/strlen.c | 6 +- sysdeps/riscv/multiarch/strlen_zbb.S | 105 ++++++++++++++++++++++ 4 files changed, 113 insertions(+), 2 deletions(-) create mode 100644 sysdeps/riscv/multiarch/strlen_zbb.S diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 8e2b020233..b2247b7326 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -8,5 +8,6 @@ sysdep_routines += \ memset_rv64_unaligned \ memset_rv64_unaligned_cboz64 \ \ - strlen_generic + strlen_generic \ + strlen_zbb endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index f848fc8401..2b4d2e1c17 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -55,6 +55,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, memset, 1, __memset_generic)) IFUNC_IMPL (i, name, strlen, + IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_zbb) IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_generic)) diff --git a/sysdeps/riscv/multiarch/strlen.c b/sysdeps/riscv/multiarch/strlen.c index 85f7a91c9f..8b2f4d94b2 100644 --- a/sysdeps/riscv/multiarch/strlen.c +++ b/sysdeps/riscv/multiarch/strlen.c @@ -30,8 +30,12 @@ extern __typeof (__redirect_strlen) __libc_strlen; extern __typeof (__redirect_strlen) __strlen_generic attribute_hidden; +extern __typeof (__redirect_strlen) __strlen_zbb attribute_hidden; -libc_ifunc (__libc_strlen, __strlen_generic); +libc_ifunc (__libc_strlen, + HAVE_RV(zbb) + ? 
__strlen_zbb + : __strlen_generic); # undef strlen strong_alias (__libc_strlen, strlen); diff --git a/sysdeps/riscv/multiarch/strlen_zbb.S b/sysdeps/riscv/multiarch/strlen_zbb.S new file mode 100644 index 0000000000..a0ca599c8e --- /dev/null +++ b/sysdeps/riscv/multiarch/strlen_zbb.S @@ -0,0 +1,105 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include + +/* Assumptions: rvi_zbb. */ +/* Implementation from the Bitmanip specification. 
*/ + +#define src a0 +#define result a0 +#define addr a1 +#define data a2 +#define offset a3 +#define offset_bits a3 +#define valid_bytes a4 +#define m1 a4 + +#if __riscv_xlen == 64 +# define REG_L ld +# define SZREG 8 +#else +# define REG_L lw +# define SZREG 4 +#endif + +#define BITSPERBYTELOG 3 + +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ +# define CZ clz +# define SHIFT sll +#else +# define CZ ctz +# define SHIFT srl +#endif + +#ifndef STRLEN +# define STRLEN __strlen_zbb +#endif + +.option push +.option arch,+zbb + +ENTRY_ALIGN (STRLEN, 6) + andi offset, src, SZREG-1 + andi addr, src, -SZREG + + li valid_bytes, SZREG + sub valid_bytes, valid_bytes, offset + slli offset_bits, offset, BITSPERBYTELOG + REG_L data, 0(addr) + /* Shift the partial/unaligned chunk we loaded to remove the bytes + * from before the start of the string, adding NUL bytes at the end. */ + SHIFT data, data, offset_bits + orc.b data, data + not data, data + /* Non-NUL bytes in the string have been expanded to 0x00, while + * NUL bytes have become 0xff. Search for the first set bit + * (corresponding to a NUL byte in the original chunk). */ + CZ data, data + /* The first chunk is special: compare against the number of valid + * bytes in this chunk. */ + srli result, data, 3 + bgtu valid_bytes, result, L(done) + addi offset, addr, SZREG + li m1, -1 + + /* Our critical loop is 4 instructions and processes data in 4 byte + * or 8 byte chunks. 
*/ + .p2align 2 +L(loop): + REG_L data, SZREG(addr) + addi addr, addr, SZREG + orc.b data, data + beq data, m1, L(loop) + +L(epilogue): + not data, data + CZ data, data + sub offset, addr, offset + add result, result, offset + srli data, data, 3 + add result, result, data +L(done): + ret + +.option pop + +END (STRLEN) +libc_hidden_builtin_def (STRLEN) From patchwork Tue Feb 7 00:16:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738544 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=WnR4CWXR; dkim-atps=neutral Received: from sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kKY50MVz23jB for ; Tue, 7 Feb 2023 11:20:01 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F02BE3888C47 for ; Tue, 7 Feb 2023 00:19:58 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x329.google.com (mail-wm1-x329.google.com [IPv6:2a00:1450:4864:20::329]) by sourceware.org (Postfix) with ESMTPS id B3165385840D for ; Tue, 7 Feb 2023 00:16:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B3165385840D Authentication-Results: 
sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 15/19] riscv: Add ifunc support for strcmp
Date: Tue, 7 Feb 2023 01:16:14 +0100
Message-Id: <20230207001618.458947-16-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This patch adds ifunc support for calls to strcmp to the RISC-V code. No optimized code is added as part of this patch.
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 4 ++- sysdeps/riscv/multiarch/ifunc-impl-list.c | 2 ++ sysdeps/riscv/multiarch/strcmp.c | 40 +++++++++++++++++++++++ sysdeps/riscv/multiarch/strcmp_generic.c | 32 ++++++++++++++++++ 4 files changed, 77 insertions(+), 1 deletion(-) create mode 100644 sysdeps/riscv/multiarch/strcmp.c create mode 100644 sysdeps/riscv/multiarch/strcmp_generic.c diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index b2247b7326..3017bde75a 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -9,5 +9,7 @@ sysdep_routines += \ memset_rv64_unaligned_cboz64 \ \ strlen_generic \ - strlen_zbb + strlen_zbb \ + \ + strcmp_generic endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index 2b4d2e1c17..64331a4c7f 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -58,6 +58,8 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_zbb) IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_generic)) + IFUNC_IMPL (i, name, strcmp, + IFUNC_IMPL_ADD (array, i, strcpy, 1, __strcmp_generic)) return i; } diff --git a/sysdeps/riscv/multiarch/strcmp.c b/sysdeps/riscv/multiarch/strcmp.c new file mode 100644 index 0000000000..8c21a90afd --- /dev/null +++ b/sysdeps/riscv/multiarch/strcmp.c @@ -0,0 +1,40 @@ +/* Multiple versions of strcmp. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +/* Redefine strcmp so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. */ +# undef strcmp +# define strcmp __redirect_strcmp +# include +# include +# include +# include + +extern __typeof (__redirect_strcmp) __libc_strcmp; +extern __typeof (__redirect_strcmp) __strcmp_generic attribute_hidden; + +libc_ifunc (__libc_strcmp, __strcmp_generic); + +# undef strcmp +strong_alias (__libc_strcmp, strcmp); +#else +# include +#endif diff --git a/sysdeps/riscv/multiarch/strcmp_generic.c b/sysdeps/riscv/multiarch/strcmp_generic.c new file mode 100644 index 0000000000..d85cf3940f --- /dev/null +++ b/sysdeps/riscv/multiarch/strcmp_generic.c @@ -0,0 +1,32 @@ +/* strcmp for RISC-V, default version for internal use. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . 
*/ + +#include + +#define STRCMP __strcmp_generic + +#ifdef SHARED +# undef libc_hidden_builtin_def +# define libc_hidden_builtin_def(name) \ + __hidden_ver1(__strcmp_generic, __GI_strcmp, __strcmp_generic); +#endif + +extern int __strcmp_generic(const char *s1, const char *s2); + +#include From patchwork Tue Feb 7 00:16:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 1738549 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=libc-alpha-bounces+incoming=patchwork.ozlabs.org@sourceware.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=JfLREQG7; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4P9kM81St6z23r8 for ; Tue, 7 Feb 2023 11:21:24 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2C824388B6AA for ; Tue, 7 Feb 2023 00:21:22 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mail-wm1-x333.google.com (mail-wm1-x333.google.com [IPv6:2a00:1450:4864:20::333]) by sourceware.org (Postfix) with ESMTPS id 5DD2B3858C2D for ; Tue, 7 Feb 2023 00:16:55 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5DD2B3858C2D Authentication-Results: sourceware.org; dmarc=none (p=none 
dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 16/19] riscv: Add accelerated strcmp routines
Date: Tue, 7 Feb 2023 01:16:15 +0100
Message-Id: <20230207001618.458947-17-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.39.1
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The implementation of strcmp() can be accelerated using Zbb's orc.b instruction and fast unaligned accesses. However, strcmp can use unaligned accesses only if such an access does not change the exception behaviour (compared to a single-byte compare loop). Let's add an implementation that does all that. Additionally, let's add the strcmp implementation from the Bitmanip specification, which does not do any unaligned accesses.
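Both routines resolve a mismatch between two NUL-free words the same way: on little-endian they byte-reverse each word with rev8 so that the first string byte becomes the most significant, then derive the sign from one unsigned comparison. A portable C model of that step follows; load_le, rev8 and compare_words_le are hypothetical helper names, and the sketch assumes the two words differ and contain no NUL byte.

```c
#include <stdint.h>

/* Assemble eight bytes into a word the way a little-endian load would.  */
static uint64_t
load_le (const char *p)
{
  uint64_t w = 0;
  for (int i = 0; i < 8; i++)
    w |= (uint64_t) (unsigned char) p[i] << (8 * i);
  return w;
}

/* Software model of Zbb's rev8 on a 64-bit register: reverse the
   byte order.  */
static uint64_t
rev8 (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    r = (r << 8) | ((x >> (8 * i)) & 0xff);
  return r;
}

/* Precondition (assumed, not checked): data1 != data2 and neither
   word contains a NUL byte.  Returns -1 or 1.  */
static int
compare_words_le (uint64_t data1, uint64_t data2)
{
  uint64_t a = rev8 (data1);   /* First string byte is now most significant.  */
  uint64_t b = rev8 (data2);
  int less = a < b;            /* Models: sltu result, data1, data2.  */
  return -less | 1;            /* Models: neg, then ori with 1.  */
}
```

Without the byte reversal, a little-endian integer comparison would weight later string bytes more heavily than earlier ones and could report the wrong ordering.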
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 4 +- sysdeps/riscv/multiarch/ifunc-impl-list.c | 4 +- sysdeps/riscv/multiarch/strcmp.c | 11 +- sysdeps/riscv/multiarch/strcmp_zbb.S | 104 +++++++++ .../riscv/multiarch/strcmp_zbb_unaligned.S | 213 ++++++++++++++++++ 5 files changed, 332 insertions(+), 4 deletions(-) create mode 100644 sysdeps/riscv/multiarch/strcmp_zbb.S create mode 100644 sysdeps/riscv/multiarch/strcmp_zbb_unaligned.S diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 3017bde75a..73a62be85d 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -11,5 +11,7 @@ sysdep_routines += \ strlen_generic \ strlen_zbb \ \ - strcmp_generic + strcmp_generic \ + strcmp_zbb \ + strcmp_zbb_unaligned endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index 64331a4c7f..d354aa1178 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -59,7 +59,9 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, strlen, 1, __strlen_generic)) IFUNC_IMPL (i, name, strcmp, - IFUNC_IMPL_ADD (array, i, strcpy, 1, __strcmp_generic)) + IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_zbb_unaligned) + IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_zbb) + IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_generic)) return i; } diff --git a/sysdeps/riscv/multiarch/strcmp.c b/sysdeps/riscv/multiarch/strcmp.c index 8c21a90afd..d3f2fe19ae 100644 --- a/sysdeps/riscv/multiarch/strcmp.c +++ b/sysdeps/riscv/multiarch/strcmp.c @@ -30,8 +30,15 @@ extern __typeof (__redirect_strcmp) __libc_strcmp; extern __typeof (__redirect_strcmp) __strcmp_generic attribute_hidden; - -libc_ifunc (__libc_strcmp, __strcmp_generic); +extern __typeof (__redirect_strcmp) __strcmp_zbb attribute_hidden; +extern __typeof (__redirect_strcmp) __strcmp_zbb_unaligned attribute_hidden; + 
+libc_ifunc (__libc_strcmp, + HAVE_RV(zbb) && HAVE_FAST_UNALIGNED() + ? __strcmp_zbb_unaligned + : HAVE_RV(zbb) + ? __strcmp_zbb + : __strcmp_generic); # undef strcmp strong_alias (__libc_strcmp, strcmp); diff --git a/sysdeps/riscv/multiarch/strcmp_zbb.S b/sysdeps/riscv/multiarch/strcmp_zbb.S new file mode 100644 index 0000000000..1c265d6107 --- /dev/null +++ b/sysdeps/riscv/multiarch/strcmp_zbb.S @@ -0,0 +1,104 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include + +/* Assumptions: rvi_zbb. */ +/* Implementation from the Bitmanip specification. */ + +#define src1 a0 +#define result a0 +#define src2 a1 +#define data1 a2 +#define data2 a3 +#define align a4 +#define data1_orcb t0 +#define m1 t2 + +#if __riscv_xlen == 64 +# define REG_L ld +# define SZREG 8 +#else +# define REG_L lw +# define SZREG 4 +#endif + +#ifndef STRCMP +# define STRCMP __strcmp_zbb +#endif + +.option push +.option arch,+zbb + +ENTRY_ALIGN (STRCMP, 6) + or align, src1, src2 + and align, align, SZREG-1 + bnez align, L(simpleloop) + li m1, -1 + + /* Main loop for aligned strings. 
*/ + .p2align 2 +L(loop): + REG_L data1, 0(src1) + REG_L data2, 0(src2) + orc.b data1_orcb, data1 + bne data1_orcb, m1, L(foundnull) + addi src1, src1, SZREG + addi src2, src2, SZREG + beq data1, data2, L(loop) + + /* Words don't match, and no null byte in the first word. + * Get bytes in big-endian order and compare. */ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + rev8 data1, data1 + rev8 data2, data2 +#endif + /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */ + sltu result, data1, data2 + neg result, result + ori result, result, 1 + ret + +L(foundnull): + /* Found a null byte. + * If words don't match, fall back to simple loop. */ + bne data1, data2, L(simpleloop) + + /* Otherwise, strings are equal. */ + li result, 0 + ret + + /* Simple loop for misaligned strings. */ + .p2align 3 +L(simpleloop): + lbu data1, 0(src1) + lbu data2, 0(src2) + addi src1, src1, 1 + addi src2, src2, 1 + bne data1, data2, L(sub) + bnez data1, L(simpleloop) + +L(sub): + sub result, data1, data2 + ret + +.option pop + +END (STRCMP) +libc_hidden_builtin_def (STRCMP) diff --git a/sysdeps/riscv/multiarch/strcmp_zbb_unaligned.S b/sysdeps/riscv/multiarch/strcmp_zbb_unaligned.S new file mode 100644 index 0000000000..ec21982b65 --- /dev/null +++ b/sysdeps/riscv/multiarch/strcmp_zbb_unaligned.S @@ -0,0 +1,213 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include + +/* Assumptions: rvi_zbb with fast unaligned access. */ +/* Implementation inspired by aarch64/strcmp.S. */ + +#define src1 a0 +#define result a0 +#define src2 a1 +#define off a3 +#define m1 a4 +#define align1 a5 +#define src3 a6 +#define tmp a7 + +#define data1 t0 +#define data2 t1 +#define b1 t0 +#define b2 t1 +#define data3 t2 +#define data1_orcb t3 +#define data3_orcb t4 +#define shift t5 + +#if __riscv_xlen == 64 +# define REG_L ld +# define SZREG 8 +# define PTRLOG 3 +#else +# define REG_L lw +# define SZREG 4 +# define PTRLOG 2 +#endif + +#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__ +# error big endian is untested! +# define CZ ctz +# define SHIFT srl +# define SHIFT2 sll +#else +# define CZ ctz +# define SHIFT sll +# define SHIFT2 srl +#endif + +#ifndef STRCMP +# define STRCMP __strcmp_zbb_unaligned +#endif + +.option push +.option arch,+zbb + +ENTRY_ALIGN (STRCMP, 6) + /* off...delta from src1 to src2. */ + sub off, src2, src1 + li m1, -1 + andi tmp, off, SZREG-1 + andi align1, src1, SZREG-1 + bnez tmp, L(misaligned8) + bnez align1, L(mutual_align) + + .p2align 4 +L(loop_aligned): + REG_L data1, 0(src1) + add tmp, src1, off + addi src1, src1, SZREG + REG_L data2, 0(tmp) + +L(start_realigned): + orc.b data1_orcb, data1 + bne data1_orcb, m1, L(end) + beq data1, data2, L(loop_aligned) + +L(fast_end): + /* Words don't match, and no NUL byte in one word. + Get bytes in big-endian order and compare as words. */ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + rev8 data1, data1 + rev8 data2, data2 +#endif + /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */ + sltu result, data1, data2 + neg result, result + ori result, result, 1 + ret + +L(end_orc): + orc.b data1_orcb, data1 +L(end): + /* Words don't match or NUL byte in at least one word. + data1_orcb holds orc.b value of data1. 
*/ + xor tmp, data1, data2 + orc.b tmp, tmp + + orn tmp, tmp, data1_orcb + CZ shift, tmp + +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + rev8 data1, data1 + rev8 data2, data2 +#endif + sll data1, data1, shift + sll data2, data2, shift + srl b1, data1, SZREG*8-8 + srl b2, data2, SZREG*8-8 + +L(end_singlebyte): + sub result, b1, b2 + ret + + .p2align 4 +L(mutual_align): + /* Sources are mutually aligned, but are not currently at an + alignment boundary. Round down the addresses and then mask off + the bytes that precede the start point. */ + andi src1, src1, -SZREG + add tmp, src1, off + REG_L data1, 0(src1) + addi src1, src1, SZREG + REG_L data2, 0(tmp) + /* Get number of bits to mask. */ + sll shift, src2, 3 + /* Bits to mask are now 0, others are 1. */ + SHIFT tmp, m1, shift + /* Or with inverted value -> masked bits become 1. */ + orn data1, data1, tmp + orn data2, data2, tmp + j L(start_realigned) + +L(misaligned8): + /* Skip slow loop if SRC1 is aligned. */ + beqz align1, L(src1_aligned) +L(do_misaligned): + /* Align SRC1 to 8 bytes. */ + lbu b1, 0(src1) + lbu b2, 0(src2) + beqz b1, L(end_singlebyte) + bne b1, b2, L(end_singlebyte) + addi src1, src1, 1 + addi src2, src2, 1 + andi align1, src1, SZREG-1 + bnez align1, L(do_misaligned) + +L(src1_aligned): + /* SRC1 is aligned. Align SRC2 down and check for NUL there. + * If there is no NUL, we may read the next word from SRC2. + * If there is a NUL, we must not read a complete word from SRC2 + * because we might cross a page boundary. */ + /* Get number of bits to mask (upper bits are ignored by shifts). */ + sll shift, src2, 3 + /* src3 := align_down (src2) */ + andi src3, src2, -SZREG + REG_L data3, 0(src3) + addi src3, src3, SZREG + + /* Bits to mask are now 0, others are 1. */ + SHIFT tmp, m1, shift + /* Or with inverted value -> masked bits become 1. */ + orn data3_orcb, data3, tmp + /* Check for NUL in next aligned word. 
*/ + orc.b data3_orcb, data3_orcb + bne data3_orcb, m1, L(unaligned_nul) + + .p2align 4 +L(loop_unaligned): + /* Read the (aligned) data1 and the unaligned data2. */ + REG_L data1, 0(src1) + addi src1, src1, SZREG + REG_L data2, 0(src2) + addi src2, src2, SZREG + orc.b data1_orcb, data1 + bne data1_orcb, m1, L(end) + bne data1, data2, L(end) + + /* Read the next aligned-down word. */ + REG_L data3, 0(src3) + addi src3, src3, SZREG + orc.b data3_orcb, data3 + beq data3_orcb, m1, L(loop_unaligned) + +L(unaligned_nul): + /* src1 points to unread word (only first bytes relevant). + * data3 holds next aligned-down word with NUL. + * Compare the first bytes of data1 with the last bytes of data3. */ + REG_L data1, 0(src1) + /* Shift NUL bytes into data3 to become data2. */ + SHIFT2 data2, data3, shift + bne data1, data2, L(end_orc) + li result, 0 + ret + +.option pop + +END (STRCMP) +libc_hidden_builtin_def (STRCMP)

From patchwork Tue Feb 7 00:16:16 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 1738550
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 17/19] riscv: Add ifunc support for strncmp
Date: Tue, 7 Feb 2023 01:16:16 +0100
Message-Id: <20230207001618.458947-18-christoph.muellner@vrull.eu>
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This patch adds ifunc support for calls to strncmp to the RISC-V code. No optimized code is added as part of this patch.

Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 3 +- sysdeps/riscv/multiarch/ifunc-impl-list.c | 2 ++ sysdeps/riscv/multiarch/strncmp.c | 40 +++++++++++++++++++++++ sysdeps/riscv/multiarch/strncmp_generic.c | 32 ++++++++++++++++++ 4 files changed, 76 insertions(+), 1 deletion(-) create mode 100644 sysdeps/riscv/multiarch/strncmp.c create mode 100644 sysdeps/riscv/multiarch/strncmp_generic.c diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 73a62be85d..056ce2ffc0 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -13,5 +13,6 @@ sysdep_routines += \ \ strcmp_generic \ strcmp_zbb \ - strcmp_zbb_unaligned + strcmp_zbb_unaligned \ + strncmp_generic endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index d354aa1178..eb37ed6017 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -63,5 +63,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_zbb) IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_generic)) + IFUNC_IMPL (i, name, strncmp, + IFUNC_IMPL_ADD (array, i, strncmp, 1, __strncmp_generic)) return i; } diff --git a/sysdeps/riscv/multiarch/strncmp.c b/sysdeps/riscv/multiarch/strncmp.c new file mode 100644 index 0000000000..970aeb8b85 --- /dev/null +++ b/sysdeps/riscv/multiarch/strncmp.c @@ -0,0 +1,40 @@ +/* Multiple versions of strncmp. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library.
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +/* Define multiple versions only for the definition in libc. */ + +#if IS_IN (libc) +/* Redefine strncmp so that the compiler won't complain about the type + mismatch with the IFUNC selector in strong_alias, below. */ +# undef strncmp +# define strncmp __redirect_strncmp +# include +# include +# include +# include + +extern __typeof (__redirect_strncmp) __libc_strncmp; +extern __typeof (__redirect_strncmp) __strncmp_generic attribute_hidden; + +libc_ifunc (__libc_strncmp, __strncmp_generic); + +# undef strncmp +strong_alias (__libc_strncmp, strncmp); +#else +# include +#endif diff --git a/sysdeps/riscv/multiarch/strncmp_generic.c b/sysdeps/riscv/multiarch/strncmp_generic.c new file mode 100644 index 0000000000..9d8cdf2f1a --- /dev/null +++ b/sysdeps/riscv/multiarch/strncmp_generic.c @@ -0,0 +1,32 @@ +/* strncmp for RISC-V, default version for internal use. + Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include + +#define STRNCMP __strncmp_generic + +#ifdef SHARED +# undef libc_hidden_builtin_def +# define libc_hidden_builtin_def(name) \ + __hidden_ver1(__strncmp_generic, __GI_strncmp, __strncmp_generic); +#endif + +extern int __strncmp_generic(const char *s1, const char *s2, size_t n); + +#include

From patchwork Tue Feb 7 00:16:17 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 1738547
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 18/19] riscv: Add an optimized strncmp routine
Date: Tue, 7 Feb 2023 01:16:17 +0100
Message-Id: <20230207001618.458947-19-christoph.muellner@vrull.eu>
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The implementation of strncmp() can be accelerated using Zbb's orc.b instruction. Let's add an optimized implementation that makes use of this instruction.
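
For readers unfamiliar with orc.b, the following is a minimal C sketch of how the instruction is used for NUL detection in the word loop. It is an illustrative software model only, assuming RV64 (64-bit registers); the helper names orc_b and word_has_nul are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Software model of Zbb's orc.b on a 64-bit register (assumes RV64):
   every non-zero byte becomes 0xff, every zero byte stays 0x00.  */
static uint64_t
orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* A word contains a NUL byte exactly when orc.b does not yield all
   ones.  The assembly performs the same test by comparing the orc.b
   result against a register that holds -1 (m1).  */
static int
word_has_nul (uint64_t x)
{
  return orc_b (x) != UINT64_MAX;
}
```

In the assembly this check costs one orc.b and one bne per word, which is where the speed-up over a byte-wise loop would be expected to come from.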
Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 3 +- sysdeps/riscv/multiarch/ifunc-impl-list.c | 1 + sysdeps/riscv/multiarch/strncmp.c | 6 +- sysdeps/riscv/multiarch/strncmp_zbb.S | 119 ++++++++++++++++++++++ 4 files changed, 127 insertions(+), 2 deletions(-) create mode 100644 sysdeps/riscv/multiarch/strncmp_zbb.S diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 056ce2ffc0..9f22e31b99 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -14,5 +14,6 @@ sysdep_routines += \ strcmp_generic \ strcmp_zbb \ strcmp_zbb_unaligned \ - strncmp_generic + strncmp_generic \ + strncmp_zbb endif diff --git a/sysdeps/riscv/multiarch/ifunc-impl-list.c b/sysdeps/riscv/multiarch/ifunc-impl-list.c index eb37ed6017..82fd34d010 100644 --- a/sysdeps/riscv/multiarch/ifunc-impl-list.c +++ b/sysdeps/riscv/multiarch/ifunc-impl-list.c @@ -64,6 +64,7 @@ __libc_ifunc_impl_list (const char *name, struct libc_ifunc_impl *array, IFUNC_IMPL_ADD (array, i, strcmp, 1, __strcmp_generic)) IFUNC_IMPL (i, name, strncmp, + IFUNC_IMPL_ADD (array, i, strncmp, 1, __strncmp_zbb) IFUNC_IMPL_ADD (array, i, strncmp, 1, __strncmp_generic)) return i; } diff --git a/sysdeps/riscv/multiarch/strncmp.c b/sysdeps/riscv/multiarch/strncmp.c index 970aeb8b85..5b0fe08e98 100644 --- a/sysdeps/riscv/multiarch/strncmp.c +++ b/sysdeps/riscv/multiarch/strncmp.c @@ -30,8 +30,12 @@ extern __typeof (__redirect_strncmp) __libc_strncmp; extern __typeof (__redirect_strncmp) __strncmp_generic attribute_hidden; +extern __typeof (__redirect_strncmp) __strncmp_zbb attribute_hidden; -libc_ifunc (__libc_strncmp, __strncmp_generic); +libc_ifunc (__libc_strncmp, + HAVE_RV(zbb) + ? 
__strncmp_zbb + : __strncmp_generic); # undef strncmp strong_alias (__libc_strncmp, strncmp); diff --git a/sysdeps/riscv/multiarch/strncmp_zbb.S b/sysdeps/riscv/multiarch/strncmp_zbb.S new file mode 100644 index 0000000000..29cff30def --- /dev/null +++ b/sysdeps/riscv/multiarch/strncmp_zbb.S @@ -0,0 +1,119 @@ +/* Copyright (C) 2022 Free Software Foundation, Inc. + + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library. If not, see + . */ + +#include +#include + +/* Assumptions: rvi_zbb. */ + +#define src1 a0 +#define result a0 +#define src2 a1 +#define len a2 +#define data1 a2 +#define data2 a3 +#define align a4 +#define data1_orcb t0 +#define limit t1 +#define fast_limit t2 +#define m1 t3 + +#if __riscv_xlen == 64 +# define REG_L ld +# define SZREG 8 +# define PTRLOG 3 +#else +# define REG_L lw +# define SZREG 4 +# define PTRLOG 2 +#endif + +#ifndef STRNCMP +# define STRNCMP __strncmp_zbb +#endif + +.option push +.option arch,+zbb + +ENTRY_ALIGN (STRNCMP, 6) + beqz len, L(equal) + or align, src1, src2 + and align, align, SZREG-1 + add limit, src1, len + bnez align, L(simpleloop) + li m1, -1 + + /* Adjust limit for fast-path. */ + andi fast_limit, limit, -SZREG + + /* Main loop for aligned string. 
*/ + .p2align 3 +L(loop): + bge src1, fast_limit, L(simpleloop) + REG_L data1, 0(src1) + REG_L data2, 0(src2) + orc.b data1_orcb, data1 + bne data1_orcb, m1, L(foundnull) + addi src1, src1, SZREG + addi src2, src2, SZREG + beq data1, data2, L(loop) + + /* Words don't match, and no null byte in the first + * word. Get bytes in big-endian order and compare. */ +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__ + rev8 data1, data1 + rev8 data2, data2 +#endif + /* Synthesize (data1 >= data2) ? 1 : -1 in a branchless sequence. */ + sltu result, data1, data2 + neg result, result + ori result, result, 1 + ret + +L(foundnull): + /* Found a null byte. + * If words don't match, fall back to simple loop. */ + bne data1, data2, L(simpleloop) + + /* Otherwise, strings are equal. */ + li result, 0 + ret + + /* Simple loop for misaligned strings. */ + .p2align 3 +L(simpleloop): + bge src1, limit, L(equal) + lbu data1, 0(src1) + addi src1, src1, 1 + lbu data2, 0(src2) + addi src2, src2, 1 + bne data1, data2, L(sub) + bnez data1, L(simpleloop) + +L(sub): + sub result, data1, data2 + ret + +L(equal): + li result, 0 + ret + +.option pop + +END (STRNCMP) +libc_hidden_builtin_def (STRNCMP)

From patchwork Tue Feb 7 00:16:18 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 1738546
From: Christoph Muellner
To: libc-alpha@sourceware.org, Palmer Dabbelt, Darius Rad, Andrew Waterman, DJ Delorie, Vineet Gupta, Kito Cheng, Jeff Law, Philipp Tomsich, Heiko Stuebner
Cc: Christoph Müllner
Subject: [RFC PATCH 19/19] riscv: Add __riscv_cpu_relax() to allow yielding in busy loops
Date: Tue, 7 Feb 2023 01:16:18 +0100
Message-Id: <20230207001618.458947-20-christoph.muellner@vrull.eu>
In-Reply-To: <20230207001618.458947-1-christoph.muellner@vrull.eu>
References: <20230207001618.458947-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The spinning loop of PTHREAD_MUTEX_ADAPTIVE_NP provides the hook atomic_spin_nop() that can be used by architectures. On RISC-V we have two instructions that can be used here:
* WRS.STO from the Zawrs extension
* PAUSE from the Zihintpause extension

Let's use these instructions and prefer WRS.STO over PAUSE (based on availability of the corresponding ISA extension at runtime).

Signed-off-by: Christoph Müllner --- sysdeps/riscv/multiarch/Makefile | 5 +++ sysdeps/riscv/multiarch/cpu_relax.c | 36 +++++++++++++++++ sysdeps/riscv/multiarch/cpu_relax_impl.S | 40 +++++++++++++++++++ .../unix/sysv/linux/riscv/atomic-machine.h | 3 ++ 4 files changed, 84 insertions(+) create mode 100644 sysdeps/riscv/multiarch/cpu_relax.c create mode 100644 sysdeps/riscv/multiarch/cpu_relax_impl.S diff --git a/sysdeps/riscv/multiarch/Makefile b/sysdeps/riscv/multiarch/Makefile index 9f22e31b99..b5b9fcf986 100644 --- a/sysdeps/riscv/multiarch/Makefile +++ b/sysdeps/riscv/multiarch/Makefile @@ -17,3 +17,8 @@ sysdep_routines += \ strncmp_generic \ strncmp_zbb endif + +# nscd uses atomic_spin_nop which in turn requires cpu_relax +ifeq ($(subdir),nscd) +routines += cpu_relax cpu_relax_impl +endif diff --git a/sysdeps/riscv/multiarch/cpu_relax.c b/sysdeps/riscv/multiarch/cpu_relax.c new file mode 100644 index 0000000000..4e6825ca50 --- /dev/null +++ b/sysdeps/riscv/multiarch/cpu_relax.c @@ -0,0 +1,36 @@ +/* CPU strand yielding for busy loops. RISC-V version. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+
+void __cpu_relax (void);
+extern void __cpu_relax_zawrs (void);
+extern void __cpu_relax_zihintpause (void);
+
+static void
+__cpu_relax_generic (void)
+{
+}
+
+libc_ifunc (__cpu_relax,
+            HAVE_RV(zawrs)
+            ? __cpu_relax_zawrs
+            : HAVE_RV(zihintpause)
+              ? __cpu_relax_zihintpause
+              : __cpu_relax_generic);
diff --git a/sysdeps/riscv/multiarch/cpu_relax_impl.S b/sysdeps/riscv/multiarch/cpu_relax_impl.S
new file mode 100644
index 0000000000..5d349c351f
--- /dev/null
+++ b/sysdeps/riscv/multiarch/cpu_relax_impl.S
@@ -0,0 +1,40 @@
+/* Copyright (C) 2022 Free Software Foundation, Inc.
+
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   .  */
+
+#include
+#include
+
+.option push
+.option arch,+zawrs
+
+ENTRY_ALIGN (__cpu_relax_zawrs, 4)
+	wrs.sto
+	ret
+END (__cpu_relax_zawrs)
+
+.option pop
+
+.option push
+.option arch,+zihintpause
+
+ENTRY_ALIGN (__cpu_relax_zihintpause, 4)
+	pause
+	ret
+END (__cpu_relax_zihintpause)
+
+.option pop
diff --git a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
index dbf70d8d57..88aa58ef95 100644
--- a/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
+++ b/sysdeps/unix/sysv/linux/riscv/atomic-machine.h
@@ -178,4 +178,7 @@
 # error "ISAs that do not subsume the A extension are not supported"
 #endif /* !__riscv_atomic */
 
+extern void __cpu_relax (void);
+#define atomic_spin_nop() __cpu_relax()
+
 #endif /* bits/atomic.h */
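
For readers who want to see the selection order and the spin-loop hook in isolation, here is a minimal portable sketch in plain C. It is not glibc code. The names relax_fn, select_relax, spin_acquire and the three relax_* stand-ins are hypothetical. The stand-ins only count calls instead of executing WRS.STO or PAUSE, and the two boolean arguments stand in for whatever runtime extension check HAVE_RV() performs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef void (*relax_fn) (void);

/* Hypothetical stand-ins for the three variants.  The real ones would
   execute WRS.STO, execute PAUSE, or do nothing; these only count
   calls so that the sketch runs on any host.  */
static unsigned int zawrs_calls, pause_calls, generic_calls;
static void relax_zawrs (void) { zawrs_calls++; }
static void relax_zihintpause (void) { pause_calls++; }
static void relax_generic (void) { generic_calls++; }

/* Same preference order as the selector in the patch:
   Zawrs first, then Zihintpause, then the generic fallback.  */
static relax_fn
select_relax (bool have_zawrs, bool have_zihintpause)
{
  return have_zawrs ? relax_zawrs
         : have_zihintpause ? relax_zihintpause
         : relax_generic;
}

/* Bounded spin: try to take the lock and call the relax hook after
   every failed attempt.  Returns true once the lock was acquired,
   false after max_spins failed attempts.  */
static bool
spin_acquire (atomic_flag *lock, int max_spins, relax_fn relax)
{
  assert (max_spins >= 0);
  for (int i = 0; i < max_spins; i++)
    {
      if (!atomic_flag_test_and_set_explicit (lock, memory_order_acquire))
        return true;
      relax ();
    }
  return false;
}
```

A caller would pick the hook once, for example relax_fn relax = select_relax (false, true);, and pass it to every spin_acquire () call. The patch reaches the same effect without a visible function pointer, because the ifunc resolver binds __cpu_relax to one variant at load time.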