From patchwork Sat Dec 14 08:09:44 2019
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 1209697
Date: Sat, 14 Dec 2019 09:09:44 +0100
From: Jakub Jelinek
To: "Joseph S. Myers", Marek Polacek, Jason Merrill, Nathan Sidwell
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix out of bounds array access in the preprocessor (PR preprocessor/92919)
Message-ID: <20191214080944.GR10088@tucnak>
Reply-To: Jakub Jelinek

Hi!

The wide_str_to_charconst function relies on the string passed to it
having at least two wide characters: the one we are looking for and the
terminating NUL.  An empty wide character literal like L'' or u'' or U''
is handled earlier and will not reach this function, but unfortunately for
const char16_t p = u'\U00110003';
while we do emit an error, wide_str_to_charconst is called with a string
that contains just the NUL terminator and nothing else.  That is because
U+110003 is too large and can't be represented even as a surrogate pair
in char16_t, but the handling of it doesn't give up on the whole string,
because other wide characters could be fine.  Say u'a\U00110003' would
be passed to wide_str_to_charconst after diagnosing an error, because the
too large char would be thrown away and we'd end up with u'a'.

The following patch fixes it by just checking for this condition and
punting.  I think it is undesirable to print a further error.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-12-13  Jakub Jelinek

	PR preprocessor/92919
	* charset.c (wide_str_to_charconst): If str contains just the
	NUL terminator, punt quietly.
	Jakub

--- libcpp/charset.c.jj	2019-12-10 00:56:07.552291870 +0100
+++ libcpp/charset.c	2019-12-13 12:23:59.096150225 +0100
@@ -1970,6 +1970,17 @@ wide_str_to_charconst (cpp_reader *pfile
   size_t off, i;
   cppchar_t result = 0, c;
 
+  if (str.len <= nbwc)
+    {
+      /* Error recovery, if no errors have been diagnosed previously,
+         there should be at least two wide characters.  Empty literals
+         are diagnosed earlier and we can get just the zero terminator
+         only if there were errors diagnosed during conversion.  */
+      *pchars_seen = 0;
+      *unsignedp = 0;
+      return 0;
+    }
+
   /* This is finicky because the string is in the target's byte order,
      which may not be our byte order.  Only the last character, ignoring
      the NUL terminator, is relevant.  */
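
For illustration only, here is a rough standalone model of why the guard is needed.  It is a sketch, not the libcpp code: last_wide_char and its parameters are hypothetical names, it assumes 8-bit host chars and that len is a multiple of nbwc, and it leaves out the unsignedp handling and the diagnostics.  The point it shows is that "only the last character, ignoring the NUL terminator" means reading at offset len - 2 * nbwc, which wraps around when the buffer holds just the terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for wide_str_to_charconst.
   TEXT holds LEN bytes of wide characters in the target's byte order,
   each NBWC bytes wide, ending with a NUL wide character.  Returns the
   last wide character before the terminator.  */
uint32_t
last_wide_char (const unsigned char *text, size_t len, size_t nbwc,
                int bigend, unsigned int *pchars_seen)
{
  uint32_t result = 0;
  size_t off, i;

  /* Assumptions of this sketch, not checks made by libcpp.  */
  assert (nbwc >= 1 && nbwc <= 4 && len % nbwc == 0);

  /* Guard modeled on the patch: with only the terminator present,
     len - nbwc * 2 below would wrap around, so punt quietly.  */
  if (len <= nbwc)
    {
      *pchars_seen = 0;
      return 0;
    }

  /* Offset of the last real character, just before the terminator.  */
  off = len - nbwc * 2;
  for (i = 0; i < nbwc; i++)
    {
      /* Assemble the value most significant byte first.  */
      unsigned char c = bigend ? text[off + i] : text[off + nbwc - i - 1];
      result = (result << 8) | c;
    }

  *pchars_seen = 1;
  return result;
}
```

With nbwc == 2 (a 16-bit char16_t), a two-byte buffer holding only the terminator takes the early return, while the four bytes of u"a" yield 0x61 for either byte order.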