From patchwork Thu Mar 21 08:13:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 1914309 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=PY2+wevt; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4V0dWy56zZz1yWy for ; Thu, 21 Mar 2024 19:13:48 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 951FD3858426 for ; Thu, 21 Mar 2024 08:13:46 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by sourceware.org (Postfix) with ESMTPS id 17A643858D28 for ; Thu, 21 Mar 2024 08:13:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 17A643858D28 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 17A643858D28 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1711008806; cv=none; b=OHmIFlnUsfuOIdj51QegCn7kJopCkcRVIy+vhpy/VQwq1eeWwC6WJuCyIrAhI+ipWD0YanO45r8jbOJxMVne7y5xlCBcoVuIzBItHImLqccZ6SQlOXgKDwyM2H/KhB359Q+XYr8jEVphDij9mis8PY+t4W29mqISRjF9KyMN2+k= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1711008806; c=relaxed/simple; bh=lltYiBOTEf9nggHm8dKjoEGrbyEI0Ozq3MDHPOECKkc=; h=DKIM-Signature:Date:From:To:Subject:Message-ID:MIME-Version; b=FXtXiwe4M1aHRIn36CWL+U2ukdC3PfDKumVTHvtyb2NG6c9AhzvQ24wMPxBrhf3/K82UFPwkPOmSt3vWlgu6yWu5qqptL4nmcq6voTSJ2Fc34deE50PD8C6og9z0TmCUgKvQqrZ9xCgGNuZ3pKKutuR0FUonyU/887jKRJK7HdI= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1711008804; h=from:from:reply-to:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type; bh=rq1XF2ebFiPCEQ05JTRwb2rdsIK32fk0G0NEGuKwnRQ=; b=PY2+wevt1Zr2qyXmzbkmOI/lgxAy//dXVSkU4daEJscNXg4jS3VuOExXblPqAcJ+tSuPbB M0c8OzEXeeyG/i/+tfag7DrA8flx/RQhASim7M7D5gY5MP1QPvEVDLmcddQ9dYm1SUhAQD jzX9hM/BpjyUp3kcj+KDE0HkS3t5fMQ= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-168-GEmW-57JMmmb2oG1nYSFrA-1; Thu, 21 Mar 2024 04:13:19 -0400 X-MC-Unique: GEmW-57JMmmb2oG1nYSFrA-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 8877A855300; Thu, 21 Mar 2024 08:13:18 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.57]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 4C60510189; Thu, 21 Mar 2024 08:13:18 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 42L8DBIH4069463 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 21 Mar 2024 09:13:12 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 42L8DAkB4069462; Thu, 21 Mar 2024 09:13:10 +0100 Date: Thu, 21 Mar 2024 09:13:10 +0100 From: Jakub Jelinek To: Richard Biener Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] libgcc: Fix up bitint division [PR114397] Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.5 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Disposition: inline X-Spam-Status: No, score=-3.5 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, KAM_LOTSOFHASH, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Reply-To: Jakub Jelinek Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Hi! The Knuth's division algorithm relies on the number of dividend limbs to be greater ore equal to number of divisor limbs, which is why I've added a special case for un < vn at the start of __divmodbitint4. Unfortunately, my assumption that it then implies abs(v) > abs(u) and so quotient must be 0 and remainder same as dividend is incorrect. This is because this check is done before negation of the operands. While bitint_reduce_prec reduces precision from clearly useless limbs, the problematic case is when the dividend is unsigned or non-negative and divisor is negative. We can have limbs (from MS to LS): dividend: 0 M ?... divisor: -1 -N ?... where M has most significant bit set and M >= N (if M == N then it also the following limbs matter) and the most significant limbs can be even partial. In this case, the quotient should be -1 rather than 0. bitint_reduce_prec will reduce the precision of the dividend so that M is the most significant limb, but can't reduce precision of the divisor to more than having the -1 as most significant limb, because -N doesn't have the most significant bit set. The following patch fixes it by detecting this problematic case in the un < vn handling, and instead of assuming q is 0 and r is u will decrease vn by 1 because it knows the later code will negate the divisor and it can be then expressed after negation in one fewer limbs. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2024-03-21 Jakub Jelinek PR libgcc/114397 * libgcc2.c (__divmodbitint4): Don't assume un < vn always means abs(v) > abs(u), check for a special case of un + 1 == vn where u is non-negative and v negative and after v's negation vn could be reduced by 1. * gcc.dg/torture/bitint-65.c: New test. Jakub --- libgcc/libgcc2.c.jj 2024-03-15 19:04:27.000000000 +0100 +++ libgcc/libgcc2.c 2024-03-20 18:23:51.956879521 +0100 @@ -1707,44 +1707,67 @@ __divmodbitint4 (UBILtype *q, SItype qpr USItype vp = avprec % W_TYPE_SIZE; if (__builtin_expect (un < vn, 0)) { - /* If abs(v) > abs(u), then q is 0 and r is u. */ - if (q) - __builtin_memset (q, 0, qn * sizeof (UWtype)); - if (r == NULL) - return; -#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__ - r += rn - 1; - u += un - 1; -#endif - if (up) - --un; - if (rn < un) - un = rn; - for (rn -= un; un; --un) + /* If abs(v) > abs(u), then q is 0 and r is u. + Unfortunately un < vn doesn't always mean abs(v) > abs(u). + If uprec > 0 and vprec < 0 and vn == un + 1, if the + top limb of v is all ones and the second most significant + limb has most significant bit clear, then just decrease + vn/avprec/vp and continue, after negation both numbers + will have the same number of limbs. */ + if (un + 1 == vn + && uprec >= 0 + && vprec < 0 + && ((v[BITINT_END (0, vn - 1)] | (vp ? ((UWtype) -1 << vp) : 0)) + == (UWtype) -1) + && (Wtype) v[BITINT_END (1, vn - 2)] >= 0) { - *r = *u; - r += BITINT_INC; - u += BITINT_INC; + vp = 0; + --vn; +#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__ + ++v; +#endif } - if (!rn) - return; - if (up) + else { - if (uprec > 0) - *r = *u & (((UWtype) 1 << up) - 1); - else - *r = *u | ((UWtype) -1 << up); - r += BITINT_INC; - if (!--rn) + /* q is 0 and r is u. */ + if (q) + __builtin_memset (q, 0, qn * sizeof (UWtype)); + if (r == NULL) return; +#if __LIBGCC_BITINT_ORDER__ == __ORDER_BIG_ENDIAN__ + r += rn - 1; + u += un - 1; +#endif + if (up) + --un; + if (rn < un) + un = rn; + for (rn -= un; un; --un) + { + *r = *u; + r += BITINT_INC; + u += BITINT_INC; + } + if (!rn) + return; + if (up) + { + if (uprec > 0) + *r = *u & (((UWtype) 1 << up) - 1); + else + *r = *u | ((UWtype) -1 << up); + r += BITINT_INC; + if (!--rn) + return; + } + UWtype c = uprec < 0 ? (UWtype) -1 : (UWtype) 0; + for (; rn; --rn) + { + *r = c; + r += BITINT_INC; + } + return; } - UWtype c = uprec < 0 ? (UWtype) -1 : (UWtype) 0; - for (; rn; --rn) - { - *r = c; - r += BITINT_INC; - } - return; } USItype qn2 = un - vn + 1; if (qn >= qn2) --- gcc/testsuite/gcc.dg/torture/bitint-65.c.jj 2024-03-20 18:41:38.026311007 +0100 +++ gcc/testsuite/gcc.dg/torture/bitint-65.c 2024-03-20 18:40:18.604397871 +0100 @@ -0,0 +1,44 @@ +/* PR libgcc/114397 */ +/* { dg-do run { target bitint } } */ +/* { dg-options "-std=c23" } */ +/* { dg-skip-if "" { ! run_expensive_tests } { "*" } { "-O0" "-O2" } } */ +/* { dg-skip-if "" { ! run_expensive_tests } { "-flto" } { "" } } */ + +#if __BITINT_MAXWIDTH__ >= 129 +int +foo (unsigned _BitInt (128) a, _BitInt (129) b) +{ + return a / b; +} +#endif + +#if __BITINT_MAXWIDTH__ >= 192 +int +bar (unsigned _BitInt (128) a, _BitInt (192) b) +{ + return a / b; +} +#endif + +int +main () +{ +#if __BITINT_MAXWIDTH__ >= 129 + if (foo (336225022742818342628768636932743029911uwb, + -336225022742818342628768636932743029911wb) != -1 + || foo (336225022742818342628768636932743029912uwb, + -336225022742818342628768636932743029911wb) != -1 + || foo (336225022742818342628768636932743029911uwb, + -336225022742818342628768636932743029912wb) != 0) + __builtin_abort (); +#endif +#if __BITINT_MAXWIDTH__ >= 192 + if (bar (336225022742818342628768636932743029911uwb, + -336225022742818342628768636932743029911wb) != -1 + || bar (336225022742818342628768636932743029912uwb, + -336225022742818342628768636932743029911wb) != -1 + || bar (336225022742818342628768636932743029911uwb, + -336225022742818342628768636932743029912wb) != 0) + __builtin_abort (); +#endif +}