From patchwork Sat Feb 27 13:08:37 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Heinrich Schuchardt
X-Patchwork-Id: 1445172
X-Patchwork-Delegate: xypron.glpk@gmx.de
From: Heinrich Schuchardt
To: Alexander Graf, Anatolij Gustschin
Cc: u-boot@lists.denx.de, Heinrich Schuchardt
Subject: [PATCH 3/6] lib/charset: utf8_get() should return error
Date: Sat, 27 Feb 2021 14:08:37 +0100
Message-Id: <20210227130840.166193-4-xypron.glpk@gmx.de>
X-Mailer: git-send-email 2.30.0
In-Reply-To: <20210227130840.166193-1-xypron.glpk@gmx.de>
References: <20210227130840.166193-1-xypron.glpk@gmx.de>
List-Id: U-Boot discussion

utf8_get() should return an error when it hits an illegal UTF-8 sequence
instead of silently converting the input to a question mark.

Correct utf8_get() and its unit test.

console_read_unicode() now ignores illegal UTF-8 sequences.

Signed-off-by: Heinrich Schuchardt
---
 lib/charset.c     | 25 ++++++++++++++++---------
 test/unicode_ut.c |  7 +++++++
 2 files changed, 23 insertions(+), 9 deletions(-)

-- 
2.30.0

diff --git a/lib/charset.c b/lib/charset.c
index 1345c8f9f0..946d5ee23e 100644
--- a/lib/charset.c
+++ b/lib/charset.c
@@ -32,7 +32,7 @@ static struct capitalization_table capitalization_table[] =
  *
  * @read_u8: - stream reader
  * @src: - string buffer passed to stream reader, optional
- * Return: - Unicode code point
+ * Return: - Unicode code point, or -1
  */
 static int get_code(u8 (*read_u8)(void *data), void *data)
 {
@@ -78,7 +78,7 @@ static int get_code(u8 (*read_u8)(void *data), void *data)
 	}
 	return ch;
 error:
-	return '?';
+	return -1;
 }
 
 /**
@@ -120,14 +120,21 @@ static u8 read_console(void *data)
 
 int console_read_unicode(s32 *code)
 {
-	if (!tstc()) {
-		/* No input available */
-		return 1;
-	}
+	for (;;) {
+		s32 c;
 
-	/* Read Unicode code */
-	*code = get_code(read_console, NULL);
-	return 0;
+		if (!tstc()) {
+			/* No input available */
+			return 1;
+		}
+
+		/* Read Unicode code */
+		c = get_code(read_console, NULL);
+		if (c > 0) {
+			*code = c;
+			return 0;
+		}
+	}
 }
 
 s32 utf8_get(const char **src)
diff --git a/test/unicode_ut.c b/test/unicode_ut.c
index 2cc6b5feff..154361aea7 100644
--- a/test/unicode_ut.c
+++ b/test/unicode_ut.c
@@ -52,6 +52,7 @@ static const char d4[] = {0xf0, 0x90, 0x92, 0x8d, 0xf0, 0x90, 0x92, 0x96,
 static const char j1[] = {0x6a, 0x31, 0xa1, 0x6c, 0x00};
 static const char j2[] = {0x6a, 0x32, 0xc3, 0xc3, 0x6c, 0x00};
 static const char j3[] = {0x6a, 0x33, 0xf0, 0x90, 0xf0, 0x00};
+static const char j4[] = {0xa1, 0x00};
 
 static int unicode_test_u16_strlen(struct unit_test_state *uts)
 {
@@ -165,6 +166,12 @@ static int unicode_test_utf8_get(struct unit_test_state *uts)
 	ut_asserteq(0x0001048d, code);
 	ut_asserteq_ptr(s, d4 + 4);
 
+	/* Check illegal character */
+	s = j4;
+	code = utf8_get((const char **)&s);
+	ut_asserteq(-1, code);
+	ut_asserteq_ptr(j4 + 1, s);
+
 	return 0;
 }
 UNICODE_TEST(unicode_test_utf8_get);
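
For illustration only, here is a rough sketch of how a caller might cope with the new -1 return value instead of receiving '?'. It is not part of the patch. sketch_utf8_get() is a hypothetical, simplified stand-in written for this note and is not U-Boot's utf8_get(): it does not reject overlong encodings or surrogates, and how far it advances the pointer on malformed input is an assumption, chosen only to agree with the j4 case in the unit test above. count_code_points() is likewise a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for utf8_get(), NOT the U-Boot implementation.
 * Decodes one code point from *src and advances *src past what was read.
 * Returns the code point, 0 at the end of the string (without advancing),
 * or -1 on an illegal sequence.
 */
static int32_t sketch_utf8_get(const char **src)
{
	const unsigned char *p = (const unsigned char *)*src;
	unsigned char c = *p;
	int32_t ch;
	int follow;

	if (!c)
		return 0;
	++p;

	if (c < 0x80) {
		ch = c;
		follow = 0;
	} else if ((c & 0xe0) == 0xc0) {
		ch = c & 0x1f;
		follow = 1;
	} else if ((c & 0xf0) == 0xe0) {
		ch = c & 0x0f;
		follow = 2;
	} else if ((c & 0xf8) == 0xf0) {
		ch = c & 0x07;
		follow = 3;
	} else {
		/* Stray continuation byte or invalid lead byte: consume it */
		*src = (const char *)p;
		return -1;
	}

	while (follow--) {
		if ((*p & 0xc0) != 0x80) {
			/* Leave the offending byte unread for the next call */
			*src = (const char *)p;
			return -1;
		}
		ch = (ch << 6) | (*p++ & 0x3f);
	}

	*src = (const char *)p;
	return ch;
}

/*
 * Hypothetical caller: count the valid code points in s and report the
 * number of illegal sequences separately instead of treating them as '?'.
 */
static size_t count_code_points(const char *s, size_t *illegal)
{
	size_t valid = 0;

	assert(s && illegal);
	*illegal = 0;

	for (;;) {
		int32_t code = sketch_utf8_get(&s);

		if (!code)
			break;
		if (code < 0) {
			++*illegal;
			continue;
		}
		++valid;
	}
	return valid;
}
```

With a stand-in like this, the caller decides what an illegal sequence means (skip it, count it, or substitute a replacement character), which is the kind of choice the old '?' return value hid.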