From patchwork Mon Apr 18 12:01:55 2022
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 1618344
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Subject: [PATCH v2 0/8] Add arc4random support
Date: Mon, 18 Apr 2022 09:01:55 -0300
Message-Id: <20220418120203.3185943-1-adhemerval.zanella@linaro.org>

This patchset adds arc4random, arc4random_buf, and arc4random_uniform,
along with optimized ChaCha20 implementations for x86_64, aarch64,
powerpc64, and s390x.

The generic implementation is based on scalar ChaCha20, with a global
cache and locking. It uses getrandom, with /dev/urandom as a fallback,
to get the initial entropy, and reseeds the internal state after every
16MB of consumed entropy.

It maintains an internal buffer which consumes at most one page on
most systems (assuming 4k pages). The internal buffer amortizes the
cost of the cipher calls across arc4random calls (where both the
function call and the lock costs are the dominating factors).

Fork detection is done by checking whether MADV_WIPEONFORK is
supported. If not, the fork callback resets the state on the fork call.
It does not handle direct clone calls, nor vfork or _Fork (arc4random
is not async-signal-safe due to the internal lock usage, although the
implementation does try to handle fork cases).

The generic ChaCha20 implementation is based on RFC8439 [1] and is a
simple memcpy-with-xor implementation. The optimized ones for x86_64,
aarch64, and powerpc64 use vector instructions and are based on
libgcrypt code.

This patchset differs from the previous ones by using a much simpler
scheme of fork detection (there is no attempt to use a global shared
counter to detect direct clone usage), and by using ChaCha20 instead
of AES. ChaCha20 is used because it is the standard cipher in other
arc4random implementations (BSDs, MacOSX), and more recently in the
Linux random subsystem. It is also a much simpler implementation than
AES and shows better performance when no specialized instructions are
present.

One possible improvement, not implemented in this patchset, is to use
a per-thread cache, since on some architectures the lock cost is
somewhat high. Ideally it would reside in the TCB, to avoid requiring
tuning of the static TLS size, and it would work similarly to the
malloc tcache: arc4random would first consume any thread-local entropy
and thus avoid any locking.

[1] https://sourceware.org/pipermail/libc-alpha/2018-June/094879.html

v2:
  * Removed the last XOR operation on the ChaCha20 implementation (it
    does not add much for arc4random usage).
  * Added tst-arc4random-chacha20.c and refactored to check against
    the expected implementation.
  * Fixed the aarch64 implementation (a last change to move symbols to
    hidden did not change the relocation to use it as well).
  * Refactored x86 SSSE3 to SSE2.
  * Fixed the powerpc64 implementation on BE (use the correct macro to
    check for endianness instead of the ones from libgcrypt).
  * Added an s390x optimized ChaCha20 implementation.
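
For readers unfamiliar with the cipher, the quarter round at the core of
the RFC8439 ChaCha20 block function mentioned above can be sketched as
follows. This is only an illustration and not the code in
stdlib/chacha20.c; the names rotl32, quarter_round, and
quarter_round_selftest are hypothetical. The self-test uses the test
vector from RFC8439 section 2.1.1.

```c
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (valid for 0 < n < 32).  */
static inline uint32_t
rotl32 (uint32_t x, unsigned int n)
{
  return (x << n) | (x >> (32 - n));
}

/* ChaCha quarter round as described in RFC8439 section 2.1, applied
   in place to four words of the state.  Hypothetical helper, not the
   patchset's implementation.  */
static void
quarter_round (uint32_t *a, uint32_t *b, uint32_t *c, uint32_t *d)
{
  *a += *b; *d ^= *a; *d = rotl32 (*d, 16);
  *c += *d; *b ^= *c; *b = rotl32 (*b, 12);
  *a += *b; *d ^= *a; *d = rotl32 (*d, 8);
  *c += *d; *b ^= *c; *b = rotl32 (*b, 7);
}

/* Return 1 if quarter_round reproduces the RFC8439 section 2.1.1
   test vector, 0 otherwise.  */
int
quarter_round_selftest (void)
{
  uint32_t a = 0x11111111, b = 0x01020304;
  uint32_t c = 0x9b8d6f43, d = 0x01234567;

  quarter_round (&a, &b, &c, &d);

  return a == 0xea2a92f4 && b == 0xcb1cf8ce
	 && c == 0x4581472e && d == 0x5881c4bb;
}
```

A full block applies this quarter round to the columns and diagonals of
the 16-word state for 20 rounds; only add, xor, and rotate are needed,
which is one reason the scalar version stays small.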

Adhemerval Zanella (8):
  stdlib: Add arc4random, arc4random_buf, and arc4random_uniform (BZ
    #4417)
  stdlib: Add arc4random tests
  benchtests: Add arc4random benchtest
  aarch64: Add optimized chacha20
  x86: Add SSE2 optimized chacha20
  x86: Add AVX2 optimized chacha20
  powerpc64: Add optimized chacha20
  s390x: Add optimized chacha20

 LICENSES                                      |  22 +
 NEWS                                          |   4 +-
 benchtests/Makefile                           |   6 +-
 benchtests/bench-arc4random.c                 | 191 ++++++
 include/stdlib.h                              |  13 +
 posix/fork.c                                  |   2 +
 stdlib/Makefile                               |   7 +
 stdlib/Versions                               |   5 +
 stdlib/arc4random.c                           | 245 ++++++++
 stdlib/arc4random_uniform.c                   | 152 +++++
 stdlib/chacha20.c                             | 167 ++++++
 stdlib/stdlib.h                               |  14 +
 stdlib/tst-arc4random-chacha20.c              | 262 ++++++++
 stdlib/tst-arc4random-fork.c                  | 174 ++++++
 stdlib/tst-arc4random-stats.c                 | 146 +++++
 stdlib/tst-arc4random-thread.c                | 278 +++++++++
 sysdeps/aarch64/Makefile                      |   4 +
 sysdeps/aarch64/chacha20-neon.S               | 323 ++++++++++
 sysdeps/aarch64/chacha20_arch.h               |  40 ++
 sysdeps/generic/chacha20_arch.h               |  24 +
 sysdeps/generic/not-cancel.h                  |   2 +
 sysdeps/mach/hurd/i386/libc.abilist           |   3 +
 sysdeps/mach/hurd/not-cancel.h                |   3 +
 sysdeps/powerpc/powerpc64/Makefile            |   3 +
 sysdeps/powerpc/powerpc64/chacha20-ppc.c      | 236 ++++++++
 sysdeps/powerpc/powerpc64/chacha20_arch.h     |  47 ++
 sysdeps/s390/s390-64/Makefile                 |   4 +
 sysdeps/s390/s390-64/chacha20-vx.S            | 564 ++++++++++++++++++
 sysdeps/s390/s390-64/chacha20_arch.h          |  45 ++
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |   3 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |   3 +
 sysdeps/unix/sysv/linux/arc/libc.abilist      |   3 +
 sysdeps/unix/sysv/linux/arm/be/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/arm/le/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/csky/libc.abilist     |   3 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |   3 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |   3 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |   3 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |   3 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |   3 +
 .../sysv/linux/microblaze/be/libc.abilist     |   3 +
 .../sysv/linux/microblaze/le/libc.abilist     |   3 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |   3 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |   3 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |   3 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |   3 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |   3 +
 sysdeps/unix/sysv/linux/not-cancel.h          |   7 +
 sysdeps/unix/sysv/linux/or1k/libc.abilist     |   3 +
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |   3 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |   3 +
 .../linux/powerpc/powerpc64/be/libc.abilist   |   3 +
 .../linux/powerpc/powerpc64/le/libc.abilist   |   3 +
 .../unix/sysv/linux/riscv/rv32/libc.abilist   |   3 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |   3 +
 .../unix/sysv/linux/s390/s390-32/libc.abilist |   3 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |   3 +
 sysdeps/unix/sysv/linux/sh/be/libc.abilist    |   3 +
 sysdeps/unix/sysv/linux/sh/le/libc.abilist    |   3 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |   3 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |   3 +
 .../unix/sysv/linux/x86_64/64/libc.abilist    |   3 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |   3 +
 sysdeps/x86_64/Makefile                       |   7 +
 sysdeps/x86_64/chacha20-avx2.S                | 313 ++++++++++
 sysdeps/x86_64/chacha20-sse2.S                | 314 ++++++++++
 sysdeps/x86_64/chacha20_arch.h                |  48 ++
 67 files changed, 3772 insertions(+), 2 deletions(-)
 create mode 100644 benchtests/bench-arc4random.c
 create mode 100644 stdlib/arc4random.c
 create mode 100644 stdlib/arc4random_uniform.c
 create mode 100644 stdlib/chacha20.c
 create mode 100644 stdlib/tst-arc4random-chacha20.c
 create mode 100644 stdlib/tst-arc4random-fork.c
 create mode 100644 stdlib/tst-arc4random-stats.c
 create mode 100644 stdlib/tst-arc4random-thread.c
 create mode 100644 sysdeps/aarch64/chacha20-neon.S
 create mode 100644 sysdeps/aarch64/chacha20_arch.h
 create mode 100644 sysdeps/generic/chacha20_arch.h
 create mode 100644 sysdeps/powerpc/powerpc64/chacha20-ppc.c
 create mode 100644 sysdeps/powerpc/powerpc64/chacha20_arch.h
 create mode 100644 sysdeps/s390/s390-64/chacha20-vx.S
 create mode 100644 sysdeps/s390/s390-64/chacha20_arch.h
 create mode 100644 sysdeps/x86_64/chacha20-avx2.S
 create mode 100644 sysdeps/x86_64/chacha20-sse2.S
 create mode 100644 sysdeps/x86_64/chacha20_arch.h