From patchwork Wed Apr 4 13:57:40 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Florian Weimer
X-Patchwork-Id: 895011
Date: Wed, 04 Apr 2018 15:57:40 +0200
To: libc-alpha@sourceware.org
Subject: [PATCH] manual: Various fixes to the mbstouwcs example
User-Agent: Heirloom mailx 12.5 7/5/10
Message-Id: <20180404135740.A283E406F5A23@oldenburg.str.redhat.com>
From: fweimer@redhat.com (Florian Weimer)

The example did not work because the NUL byte was not converted, and
mbrtowc was called with a zero-length input string.  This results in a
(size_t) -2 return value, so the function always returns NULL.

The size computation for the heap allocation of the result was
incorrect because it did not deal with integer overflow.

Error checking was missing, and the allocated memory was not freed on
error paths.  All error returns now set errno.  (Note that there is an
assumption that free does not clobber errno.)

The slightly unportable comparison against (size_t) -2 to catch both
(size_t) -1 and (size_t) -2 return values is gone as well.

2018-04-04  Florian Weimer

	* manual/examples/mbstouwcs.c (mbstouwcs): Fix loop termination,
	integer overflow, memory leak on error, and indeterminate errno
	value.
	* manual/charset.texi (Converting a Character): Adjust.

diff --git a/manual/charset.texi b/manual/charset.texi
index b37fac4df1..270995f602 100644
--- a/manual/charset.texi
+++ b/manual/charset.texi
@@ -681,9 +681,7 @@ is declared in @file{wchar.h}.
 
 Use of @code{mbrtowc} is straightforward.  A function that copies a
 multibyte string into a wide character string while at the same time
-converting all lowercase characters into uppercase could look like this
-(this is not the final version, just an example; it has no error
-checking, and sometimes leaks memory):
+converting all lowercase characters into uppercase could look like this:
 
 @smallexample
 @include mbstouwcs.c.texi
diff --git a/manual/examples/mbstouwcs.c b/manual/examples/mbstouwcs.c
index 3a8b9a65f9..4012606bf1 100644
--- a/manual/examples/mbstouwcs.c
+++ b/manual/examples/mbstouwcs.c
@@ -7,8 +7,11 @@
 wchar_t *
 mbstouwcs (const char *s)
 {
-  size_t len = strlen (s);
-  wchar_t *result = malloc ((len + 1) * sizeof (wchar_t));
+  /* Include the NUL terminator in the conversion.  */
+  size_t len = strlen (s) + 1;
+  wchar_t *result = reallocarray (NULL, len + 1, sizeof (wchar_t));
+  if (result == NULL)
+    return NULL;
   wchar_t *wcp = result;
   wchar_t tmp[1];
   mbstate_t state;
@@ -17,9 +20,19 @@ mbstouwcs (const char *s)
   memset (&state, '\0', sizeof (state));
   while ((nbytes = mbrtowc (tmp, s, len, &state)) > 0)
     {
-      if (nbytes >= (size_t) -2)
-        /* Invalid input string.  */
-        return NULL;
+      if (nbytes == (size_t) -2)
+        {
+          /* Truncated input string.  */
+          errno = EILSEQ;
+          free (result);
+          return NULL;
+        }
+      if (nbytes >= (size_t) -1)
+        {
+          /* Some other error (including EILSEQ).  */
+          free (result);
+          return NULL;
+        }
       *wcp++ = towupper (*tmp);
       len -= nbytes;
       s += nbytes;
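
The loop-termination problem described in the commit message can be checked in isolation with a small sketch.  The helper names probe_mbrtowc and check_loop_termination are hypothetical and not part of the patch.  The sketch assumes the default "C" locale and a C library that returns (size_t) -2 for zero-length input, as the commit message reports for glibc; other or older libraries may behave differently.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of the patch: convert at most N bytes
   of S, starting from the initial shift state, and return whatever
   mbrtowc returns.  */
size_t
probe_mbrtowc (const char *s, size_t n)
{
  wchar_t wc;
  mbstate_t state;
  memset (&state, '\0', sizeof (state));
  return mbrtowc (&wc, s, n, &state);
}

/* Hypothetical check of the two situations from the commit message.
   Assumes the C library reports zero-length input as (size_t) -2.  */
void
check_loop_termination (void)
{
  /* Old code: len excluded the terminator, so the last call saw zero
     bytes.  The result is nonzero, so the loop did not stop, and the
     error check then returned NULL.  */
  assert (probe_mbrtowc ("", 0) == (size_t) -2);

  /* Patched code: len includes the terminator, so the last call
     converts the NUL byte and returns 0, which ends the loop.  */
  assert (probe_mbrtowc ("", 1) == 0);

  /* An ordinary single-byte character consumes one byte.  */
  assert (probe_mbrtowc ("a", 1) == 1);
}
```

Calling check_loop_termination () from a test program is enough to exercise all three cases; a failing first assertion would indicate a C library that does not report zero-length input as (size_t) -2.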