From patchwork Wed Jul 11 14:43:41 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Florian Weimer
X-Patchwork-Id: 942539
Date: Wed, 11 Jul 2018 16:43:41 +0200
To: libc-alpha@sourceware.org
Subject: [PATCH] regexec: Fix off-by-one bug in weight comparison [BZ #23036]
User-Agent: Heirloom mailx 12.5 7/5/10
Message-Id: <20180711144341.B61BB43994575@oldenburg.str.redhat.com>
From: fweimer@redhat.com (Florian Weimer)

Each weight is prefixed by its length, and the length does not include
itself in the count.  This can be seen clearly from the find_idx
function in string/strxfrm_l.c, for example.  The old code behaved as if
the length itself counted, thus comparing an additional byte after the
weight, leading to spurious comparison failures and incorrect further
partitioning of character equivalence classes.

(cherry picked from commit 7b2f4cedf044ea83f53f6b43a5bf6871eb9ce969)

2018-07-10  Florian Weimer

	[BZ #23036]
	* posix/regexec.c (check_node_accept_bytes): When comparing weights,
	do not compare an extra byte after the end of the weights.

diff --git a/NEWS b/NEWS
index 2e7e7837ac..c5c78ffd3b 100644
--- a/NEWS
+++ b/NEWS
@@ -69,6 +69,7 @@ The following bugs are resolved with this release:
   [22947] FAIL: misc/tst-preadvwritev2
   [22963] cs_CZ: Add alternative month names
   [23005] Crash in __res_context_send after memory allocation failure
+  [23036] regexec: Fix off-by-one bug in weight comparison
   [23037] initialize msg_flags to zero for sendmmsg() calls
   [23069] sigaction broken on riscv64-linux-gnu
   [23102] Incorrect parsing of consecutive $ variables in runpath entries
diff --git a/posix/regexec.c b/posix/regexec.c
index 4b1ab4ecff..21129432d1 100644
--- a/posix/regexec.c
+++ b/posix/regexec.c
@@ -3848,30 +3848,27 @@ check_node_accept_bytes (const re_dfa_t *dfa, int node_idx,
               indirect = (const int32_t *)
                 _NL_CURRENT (LC_COLLATE, _NL_COLLATE_INDIRECTMB);
               int32_t idx = findidx (table, indirect, extra, &cp, elem_len);
+              int32_t rule = idx >> 24;
+              idx &= 0xffffff;
               if (idx > 0)
-                for (i = 0; i < cset->nequiv_classes; ++i)
-                  {
-                    int32_t equiv_class_idx = cset->equiv_classes[i];
-                    size_t weight_len = weights[idx & 0xffffff];
-                    if (weight_len == weights[equiv_class_idx & 0xffffff]
-                        && (idx >> 24) == (equiv_class_idx >> 24))
-                      {
-                        int cnt = 0;
-
-                        idx &= 0xffffff;
-                        equiv_class_idx &= 0xffffff;
-
-                        while (cnt <= weight_len
-                               && (weights[equiv_class_idx + 1 + cnt]
-                                   == weights[idx + 1 + cnt]))
-                          ++cnt;
-                        if (cnt > weight_len)
-                          {
-                            match_len = elem_len;
-                            goto check_node_accept_bytes_match;
-                          }
-                      }
-                  }
+                {
+                  size_t weight_len = weights[idx];
+                  for (i = 0; i < cset->nequiv_classes; ++i)
+                    {
+                      int32_t equiv_class_idx = cset->equiv_classes[i];
+                      int32_t equiv_class_rule = equiv_class_idx >> 24;
+                      equiv_class_idx &= 0xffffff;
+                      if (weights[equiv_class_idx] == weight_len
+                          && equiv_class_rule == rule
+                          && memcmp (weights + idx + 1,
+                                     weights + equiv_class_idx + 1,
+                                     weight_len) == 0)
+                        {
+                          match_len = elem_len;
+                          goto check_node_accept_bytes_match;
+                        }
+                    }
+                }
             }
         }
       else