[{"id":3675524,"web_url":"http://patchwork.ozlabs.org/comment/3675524/","msgid":"<f5331e7c-aece-4bc7-88c8-a9c221843485@redhat.com>","list_archive_url":null,"date":"2026-04-09T21:47:58","subject":"Re: [PATCH v3] Use pending character state in IBM1390, IBM1399\n character sets (CVE-2026-4046)","submitter":{"id":68478,"url":"http://patchwork.ozlabs.org/api/people/68478/","name":"Carlos O'Donell","email":"codonell@redhat.com"},"content":"On 4/9/26 8:32 AM, Florian Weimer wrote:\n> Follow the example in iso-2022-jp-3.c and use the __count state\n> variable to store the pending character.  This avoids restarting\n> the conversion if the output buffer ends between two 4-byte UCS-4\n> code points, so that the assert reported in the bug can no longer\n> happen.\n\nLooking forward to a v4.\n\n> Even though the fix is applied to ibm1364.c, the change is only\n> effective for the two HAS_COMBINED codecs for IBM1390, IBM1399.\n> \n> The test case was mostly auto-generated using\n> claude-4.6-opus-high-thinking, and composer-2-fast shows up in the\n> log as well.  During review, gpt-5.4-xhigh flagged that the original\n> version of the test case was not exercising the new character\n> flush logic.\n\nPlease add the following tag to your commit message?\n~~~\nAssisted-by: Claude:claude-opus-4-6\n~~~\nThis follows Linux kernel convention and acts as due diligence in our\nrecord keeping that this contribution is a mix of human and machine\ngenerated content which is still copyrightable.\n\nMy position here is that we can accept this contribution because it is\na test case that uses custom glibc logic and exists nowhere else in the\necosystem. This reduces the risk of infringement through copying to near\nzero. Likewise the non-copyrightable parts are just that, and your human\ncontributions are recorded in the commit authorship. The test is very\nsimilar to other tests in this area and is largely boiler plate in that\nsense.\n\n> This fixes bug 33980.\n> \n> ---\n> v3: Fix typo in comment in test.  Avoid cast to (char **) in test case.\n> \n>   iconvdata/Makefile       |   4 +-\n>   iconvdata/ibm1364.c      |  70 ++++++++++++++++++-----\n>   iconvdata/tst-bug33980.c | 145 +++++++++++++++++++++++++++++++++++++++++++++++\n>   3 files changed, 203 insertions(+), 16 deletions(-)\n> \n> diff --git a/iconvdata/Makefile b/iconvdata/Makefile\n> index 26e888b443..fbb0067302 100644\n> --- a/iconvdata/Makefile\n> +++ b/iconvdata/Makefile\n> @@ -76,7 +76,7 @@ tests = bug-iconv1 bug-iconv2 tst-loading tst-e2big tst-iconv4 bug-iconv4 \\\n>   \ttst-iconv6 bug-iconv5 bug-iconv6 tst-iconv7 bug-iconv8 bug-iconv9 \\\n>   \tbug-iconv10 bug-iconv11 bug-iconv12 tst-iconv-big5-hkscs-to-2ucs4 \\\n>   \tbug-iconv13 bug-iconv14 bug-iconv15 \\\n> -\ttst-iconv-iso-2022-cn-ext\n> +\ttst-iconv-iso-2022-cn-ext tst-bug33980\n\nOK.\n\n>   ifeq ($(have-thread-library),yes)\n>   tests += bug-iconv3\n>   endif\n> @@ -333,6 +333,8 @@ $(objpfx)bug-iconv15.out: $(addprefix $(objpfx), $(gconv-modules)) \\\n>   \t\t\t  $(addprefix $(objpfx),$(modules.so))\n>   $(objpfx)tst-iconv-iso-2022-cn-ext.out: $(addprefix $(objpfx), $(gconv-modules)) \\\n>   \t\t\t\t\t$(addprefix $(objpfx),$(modules.so))\n> +$(objpfx)tst-bug33980.out: $(addprefix $(objpfx), $(gconv-modules)) \\\n> +\t\t\t   $(addprefix $(objpfx),$(modules.so))\n\nOK.\n\n>   \n>   $(objpfx)iconv-test.out: run-iconv-test.sh \\\n>   \t\t\t $(addprefix $(objpfx), $(gconv-modules)) \\\n> diff --git a/iconvdata/ibm1364.c b/iconvdata/ibm1364.c\n> index 4f41f22c12..8df66ea048 100644\n> --- a/iconvdata/ibm1364.c\n> +++ b/iconvdata/ibm1364.c\n> @@ -67,12 +67,29 @@\n>   \n>   /* Since this is a stateful encoding we have to provide code which resets\n>      the output state to the initial state.  This has to be done during the\n> -   flushing.  */\n> +   flushing.  For the to-internal direction (FROM_DIRECTION is true),\n> +   there may be a pending character that needs flushing.  */\n\nOK. Correct.\n\n>   #define EMIT_SHIFT_TO_INIT \\\n>     if ((data->__statep->__count & ~7) != sb)\t\t\t\t      \\\n>       {\t\t\t\t\t\t\t\t\t      \\\n>         if (FROM_DIRECTION)\t\t\t\t\t\t      \\\n> -\tdata->__statep->__count &= 7;\t\t\t\t\t      \\\n> +\t{\t\t\t\t\t\t\t\t      \\\n> +\t  uint32_t ch = data->__statep->__count >> 7;\t\t\t      \\\n> +\t  if (__glibc_unlikely (ch != 0))\t\t\t\t      \\\n\nOK. Check the state bit.\n\n> +\t    {\t\t\t\t\t\t\t\t      \\\n> +\t      if (__glibc_unlikely (outend - outbuf < 4))\t\t      \\\n> +\t\tstatus = __GCONV_FULL_OUTPUT;\t\t\t\t      \\\n\nOK. Return full again without doing anything.\n\n> +\t      else\t\t\t\t\t\t\t      \\\n> +\t\t{\t\t\t\t\t\t\t      \\\n> +\t\t  put32 (outbuf, ch);\t\t\t\t\t      \\\n> +\t\t  outbuf += 4;\t\t\t\t\t\t      \\\n> +\t\t  /* Clear character and db bit.  */\t\t\t      \\\n> +\t\t  data->__statep->__count &= 7;\t\t\t\t      \\\n\nOK.\n\n> +\t\t}\t\t\t\t\t\t\t      \\\n> +\t    }\t\t\t\t\t\t\t\t      \\\n> +\t  else\t\t\t\t\t\t\t\t      \\\n> +\t    data->__statep->__count &= 7;\t\t\t\t      \\\n\nOK.\n\n> +\t}\t\t\t\t\t\t\t\t      \\\n>         else\t\t\t\t\t\t\t\t      \\\n>   \t{\t\t\t\t\t\t\t\t      \\\n>   \t  /* We are not in the initial state.  To switch back we have\t      \\\n> @@ -99,11 +116,13 @@\n>       *curcsp = save_curcs\n>   \n>   \n> -/* Current codeset type.  */\n> +/* Current codeset type.  The bit is stored in the __count variable of\n> +   the conversion state.  If the db bit is set, bit 7 and above store\n> +   a pending UCS-4 code point if non-zero.  */\n>   enum\n>   {\n> -  sb = 0,\n> -  db = 64\n> +  sb = 0,\t\t\t/* Single byte mode.  */\n> +  db = 64\t\t\t/* Double byte mode.  */\n>   };\n>   \n>   \n> @@ -119,21 +138,29 @@ enum\n>         }\t\t\t\t\t\t\t\t\t      \\\n>       else\t\t\t\t\t\t\t\t      \\\n>         {\t\t\t\t\t\t\t\t\t      \\\n> -\t/* This is a combined character.  Make sure we have room.  */\t      \\\n> -\tif (__glibc_unlikely (outptr + 8 > outend))\t\t\t      \\\n> -\t  {\t\t\t\t\t\t\t\t      \\\n> -\t    result = __GCONV_FULL_OUTPUT;\t\t\t\t      \\\n> -\t    break;\t\t\t\t\t\t\t      \\\n> -\t  }\t\t\t\t\t\t\t\t      \\\n> -\t\t\t\t\t\t\t\t\t      \\\n\nOK. Moved to below...\n\n>   \tconst struct divide *cmbp\t\t\t\t\t      \\\n>   \t  = &DB_TO_UCS4_COMB[ch - __TO_UCS4_COMBINED_MIN];\t\t      \\\n>   \tassert (cmbp->res1 != 0 && cmbp->res2 != 0);\t\t\t      \\\n>   \t\t\t\t\t\t\t\t\t      \\\n>   \tput32 (outptr, cmbp->res1);\t\t\t\t\t      \\\n>   \toutptr += 4;\t\t\t\t\t\t\t      \\\n> -\tput32 (outptr, cmbp->res2);\t\t\t\t\t      \\\n> -\toutptr += 4;\t\t\t\t\t\t\t      \\\n> +\t\t\t\t\t\t\t\t\t      \\\n> +\t/* See whether we have room for the second character.  */\t      \\\n> +\tif (outend - outptr >= 4)\t\t\t\t\t      \\\n> +\t  {\t\t\t\t\t\t\t\t      \\\n> +\t    put32 (outptr, cmbp->res2);\t\t\t\t\t      \\\n> +\t    outptr += 4;\t\t\t\t\t\t      \\\n\nOK. Check then adjust.\n\n> +\t  }\t\t\t\t\t\t\t\t      \\\n> +\telse\t\t\t\t\t\t\t\t      \\\n> +\t  {\t\t\t\t\t\t\t\t      \\\n> +\t    /* Otherwise store only the first character now, and\t      \\\n> +\t       put the second one into the queue.  */\t\t\t      \\\n> +\t    curcs |= cmbp->res2 << 7;\t\t\t\t\t      \\\n> +\t    inptr += 2;\t\t\t\t\t\t\t      \\\n\nOK.\n\n> +\t    /* Tell the caller why we terminate the loop.  */\t\t      \\\n> +\t    result = __GCONV_FULL_OUTPUT;\t\t\t\t      \\\n> +\t    break;\t\t\t\t\t\t\t      \\\n> +\t  }\t\t\t\t\t\t\t\t      \\\n>         }\t\t\t\t\t\t\t\t\t      \\\n>     }\n>   #else\n> @@ -153,7 +180,20 @@ enum\n>   #define LOOPFCT \t\tFROM_LOOP\n>   #define BODY \\\n>     {\t\t\t\t\t\t\t\t\t      \\\n> -    uint32_t ch = *inptr;\t\t\t\t\t\t      \\\n> +    uint32_t ch;\t\t\t\t\t\t\t      \\\n> +\t\t\t\t\t\t\t\t\t      \\\n> +    ch = curcs >> 7;\t\t\t\t\t\t\t      \\\n> +    if (__glibc_unlikely (ch != 0))\t\t\t\t\t      \\\n> +      {\t\t\t\t\t\t\t\t\t      \\\n> +\tput32 (outptr, ch);\t\t\t\t\t\t      \\\n> +\toutptr += 4;\t\t\t\t\t\t\t      \\\n> +\t/* Remove the pending character, but preserve state bits.  */\t      \\\n> +\tcurcs &= (1 << 7) - 1;\t\t\t\t\t\t      \\\n> +\tcontinue;\t\t\t\t\t\t\t      \\\n\nOK. Start by clearing pending.\n\n> +      }\t\t\t\t\t\t\t\t\t      \\\n> +\t\t\t\t\t\t\t\t\t      \\\n> +    /* Otherwise read the next input byte.  */\t\t\t\t      \\\n> +    ch = *inptr;\t\t\t\t\t\t\t      \\\n>   \t\t\t\t\t\t\t\t\t      \\\n>       if (__builtin_expect (ch, 0) == SO)\t\t\t\t\t      \\\n>         {\t\t\t\t\t\t\t\t\t      \\\n> diff --git a/iconvdata/tst-bug33980.c b/iconvdata/tst-bug33980.c\n> new file mode 100644\n> index 0000000000..da8fd336e6\n> --- /dev/null\n> +++ b/iconvdata/tst-bug33980.c\n> @@ -0,0 +1,145 @@\n> +/* Test for bug 33980: combining characters in IBM1390/IBM1399.\n\nOK.\n\n> +   Copyright (C) 2026 Free Software Foundation, Inc.\n> +   This file is part of the GNU C Library.\n> +\n> +   The GNU C Library is free software; you can redistribute it and/or\n> +   modify it under the terms of the GNU Lesser General Public\n> +   License as published by the Free Software Foundation; either\n> +   version 2.1 of the License, or (at your option) any later version.\n> +\n> +   The GNU C Library is distributed in the hope that it will be useful,\n> +   but WITHOUT ANY WARRANTY; without even the implied warranty of\n> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU\n> +   Lesser General Public License for more details.\n> +\n> +   You should have received a copy of the GNU Lesser General Public\n> +   License along with the GNU C Library; if not, see\n> +   <https://www.gnu.org/licenses/>.  */\n> +\n> +#include <alloc_buffer.h>\n> +#include <errno.h>\n> +#include <iconv.h>\n> +#include <stdbool.h>\n> +#include <string.h>\n> +\n> +#include <support/check.h>\n> +#include <support/next_to_fault.h>\n> +#include <support/support.h>\n> +\n> +/* Run iconv in a loop with a small output buffer of OUTBUFSIZE bytes\n> +   starting at OUTBUF.  OUTBUF should be right before an unmapped page\n> +   so that writing past the end will fault.  Skip SHIFT bytes at the\n> +   start of the input and output, to exercise different buffer\n> +   alignment.  TRUNCATE indicates skipped bytes at the end of\n> +   input (0 and 1 a valid).  */\n> +static void\n> +test_one (const char *encoding, unsigned int shift, unsigned int truncate,\n> +          char *outbuf, size_t outbufsize)\n> +{\n> +  /* In IBM1390 and IBM1399, the DBCS code 0xECB5 expands to two\n> +     Unicode code points when translated.  */\n> +  static char input[] =\n> +    {\n> +      /* 8 letters X.  */\n> +      0xe7, 0xe7, 0xe7, 0xe7, 0xe7, 0xe7, 0xe7, 0xe7,\n> +      /* SO, 0xECB5, SI: shift to DBCS, special character, shift back.  */\n> +      0x0e, 0xec, 0xb5, 0x0f\n> +    };\n> +\n> +  /* Expected output after UTF-8 conversion.  */\n> +  static char expected[] =\n> +    {\n> +      'X', 'X', 'X', 'X', 'X', 'X', 'X', 'X',\n> +      /* U+304B (HIRAGANA LETTER KA).  */\n> +      0xe3, 0x81, 0x8b,\n> +      /* U+309A (COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK).  */\n> +      0xe3, 0x82, 0x9a\n> +    };\n> +\n> +  iconv_t cd = iconv_open (\"UTF-8\", encoding);\n> +  TEST_VERIFY_EXIT (cd != (iconv_t) -1);\n> +\n> +  char result_storage[64];\n> +  struct alloc_buffer result_buf\n> +    = alloc_buffer_create (result_storage, sizeof (result_storage));\n> +\n> +  char *inptr = &input[shift];\n\nOK.\n\n> +  size_t inleft = sizeof (input) - shift - truncate;\n> +\n> +  while (inleft > 0)\n> +    {\n> +      char *outptr = outbuf;\n> +      size_t outleft = outbufsize;\n> +      size_t inleft_before = inleft;\n> +\n> +      size_t ret = iconv (cd, &inptr, &inleft, &outptr, &outleft);\n\nOK.\n\n> +      size_t produced = outptr - outbuf;\n> +      alloc_buffer_copy_bytes (&result_buf, outbuf, produced);\n> +\n> +      if (ret == (size_t) -1 && errno == E2BIG)\n> +        {\n> +          if (produced == 0 && inleft == inleft_before)\n> +            {\n> +              /* Output buffer too small to make progress.  This is\n> +                 expected for very small output buffer sizes.  */\n> +              TEST_VERIFY_EXIT (outbufsize < 3);\n> +              break;\n> +            }\n> +          continue;\n> +        }\n> +      if (ret == (size_t) -1)\n> +        FAIL_EXIT1 (\"%s (outbufsize %zu): iconv: %m\", encoding, outbufsize);\n> +      break;\n> +    }\n> +\n> +  /* Flush any pending state (e.g. a buffered combined character).  */\n> +  while (true)\n> +    {\n> +      char *outptr = outbuf;\n> +      size_t outleft = outbufsize;\n> +\n> +      size_t ret = iconv (cd, NULL, NULL, &outptr, &outleft);\n> +      size_t produced = outptr - outbuf;\n> +      alloc_buffer_copy_bytes (&result_buf, outbuf, produced);\n> +\n> +      if (ret == (size_t) -1 && errno == E2BIG)\n> +        continue;\n\nWhat happens when we don't make forward progress?\n\nDo we expect the test to loop forever and timeout or should we just\ncheck produced == 0?\n\nNoted by claude-opus-4.6.\n\n> +      TEST_VERIFY_EXIT (ret == 0);\n> +      break;\n> +    }\n> +\n> +  TEST_VERIFY_EXIT (!alloc_buffer_has_failed (&result_buf));\n> +  size_t result_used\n> +    = sizeof (result_storage) - alloc_buffer_size (&result_buf);\n> +\n> +  if (outbufsize >= 3)\n> +    {\n> +      TEST_COMPARE (inleft, 0);\n> +      TEST_COMPARE (result_used, sizeof (expected) - shift);\n> +      TEST_COMPARE_BLOB (result_storage, result_used,\n> +                         &expected[shift], sizeof (expected) - shift);\n> +    }\n> +\n> +  TEST_VERIFY_EXIT (iconv_close (cd) == 0);\n> +}\n> +\n> +static int\n> +do_test (void)\n> +{\n> +  struct support_next_to_fault ntf\n> +    = support_next_to_fault_allocate (8);\n> +\n> +  for (int shift = 0; shift <= 8; ++shift)\n> +    for (int truncate = 0; truncate < 2; ++truncate)\n> +      for (size_t outbufsize = 1; outbufsize <= 8; outbufsize++)\n> +        {\n> +          char *outbuf = ntf.buffer + ntf.length - outbufsize;\n> +          test_one (\"IBM1390\", shift, truncate, outbuf, outbufsize);\n> +          test_one (\"IBM1399\", shift, truncate, outbuf, outbufsize);\n> +        }\n> +\n> +  support_next_to_fault_free (&ntf);\n> +  return 0;\n> +}\n> +\n> +#include <support/test-driver.c>\n> \n> base-commit: 1b2f868fb4958fd59875695c1828d9804b116dc2\n>","headers":{"Return-Path":"<libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":["incoming@patchwork.ozlabs.org","libc-alpha@sourceware.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","libc-alpha@sourceware.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=MSu/xsJM;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org\n (client-ip=38.145.34.32; helo=vm01.sourceware.org;\n envelope-from=libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=MSu/xsJM","sourceware.org; dmarc=pass (p=quarantine dis=none)\n header.from=redhat.com","sourceware.org; spf=pass smtp.mailfrom=redhat.com","server2.sourceware.org;\n arc=none smtp.remote-ip=170.10.133.124"],"Received":["from vm01.sourceware.org (vm01.sourceware.org [38.145.34.32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fsD8H13dYz1y05\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 10 Apr 2026 07:48:31 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 0C7004BA2E1B\n\tfor <incoming@patchwork.ozlabs.org>; Thu,  9 Apr 2026 21:48:29 +0000 (GMT)","from us-smtp-delivery-124.mimecast.com\n (us-smtp-delivery-124.mimecast.com [170.10.133.124])\n by sourceware.org (Postfix) with ESMTP id B70464BA2E0C\n for <libc-alpha@sourceware.org>; Thu,  9 Apr 2026 21:48:03 +0000 (GMT)","from mail-qv1-f70.google.com (mail-qv1-f70.google.com\n [209.85.219.70]) by relay.mimecast.com with ESMTP with STARTTLS\n (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id\n us-mta-682-PB3pbnvxMdulMMav-lbSuQ-1; Thu, 09 Apr 2026 17:48:01 -0400","by mail-qv1-f70.google.com with SMTP id\n 6a1803df08f44-89ce5eec0f0so44040346d6.3\n for <libc-alpha@sourceware.org>; Thu, 09 Apr 2026 14:48:01 -0700 (PDT)","from [192.168.0.116] ([198.48.244.52])\n by smtp.gmail.com with ESMTPSA id\n 6a1803df08f44-8ac849dbb32sm6946106d6.6.2026.04.09.14.47.58\n (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);\n Thu, 09 Apr 2026 14:47:59 -0700 (PDT)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 0C7004BA2E1B","OpenDKIM Filter v2.11.0 sourceware.org B70464BA2E0C"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org B70464BA2E0C","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org B70464BA2E0C","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1775771283; cv=none;\n b=eeZ0CFD3ImZaHZluVurqWjQ01aypOs67xXYjhoKi7Hrapqf5yt+tii6cSNFD6J3XbF7sv1BCV5MBJrWUsD80UPvZQDYeaoKrSADjGB6Zz4aE80RxKznJJjZiIKiGarO8GTdrQYbnMLF3GMiqy1vA2KXkEf7DSAhMko+OB5vUWCo=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1775771283; c=relaxed/simple;\n bh=DjKFO1k+Twvu8hALm94rFQWN9AmU1CdrsKrxe1hcpA8=;\n h=DKIM-Signature:Message-ID:Date:MIME-Version:Subject:To:From;\n b=nlBQdpLPW4DYLwKF1lPZJuFWL6NCzbLYp9mTX/R891y5JF8WLfzgodC+MvHZyjRfSEhPKNQm9p00ZPgpV7KhU9I+NYvj5eBdfG3YHy79ft+AkbobvNIRbePfsAT1ySq2ewAhG+IHZsizIUkSfVCIwqe12ho9JUJORyZPjxIz02s=","ARC-Authentication-Results":"i=1; server2.sourceware.org","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n s=mimecast20190719; t=1775771283;\n h=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n to:to:cc:mime-version:mime-version:content-type:content-type:\n content-transfer-encoding:content-transfer-encoding:\n in-reply-to:in-reply-to:references:references;\n bh=KIxX1zcF88ZsyT/WmyE429Vo7mQ/fSgCfQYnSg+8/Kc=;\n b=MSu/xsJMFvBaRODTIp7eL7xdGl3XQa9jVxFsDRRVSLmvCJamRgrE3wOnBBYAqR+3cpIeyP\n swx5fADmWkuMhnjvfGlEDTXPbEt1XqYDWjH8EcOzpo1mLhGVYr09YTcrnEaVpiBUet/8ls\n k/XHYuWOj8qSkIHOqYiNvX8wUJwNC3c=","X-MC-Unique":"PB3pbnvxMdulMMav-lbSuQ-1","X-Mimecast-MFC-AGG-ID":"PB3pbnvxMdulMMav-lbSuQ_1775771281","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20251104; t=1775771281; x=1776376081;\n h=content-transfer-encoding:in-reply-to:from:content-language\n :references:to:subject:user-agent:mime-version:date:message-id\n :x-gm-gg:x-gm-message-state:from:to:cc:subject:date:message-id\n :reply-to;\n bh=KIxX1zcF88ZsyT/WmyE429Vo7mQ/fSgCfQYnSg+8/Kc=;\n b=C/SJT75e+Tc4dHhS35RSrG0IulEV+zmEWnZ6naRiDW2lVcQ9cM2lPxcTz0v0M1D+P7\n d1IuM9qBluAelAQvDFQoUCL2rWRogtJoc5rhGNMj4fxpaPt4Tnur5T3mT2SdgHBNrGB8\n 6LYk3LzrSfj+3pmGb6ZzBWys6aizWARB7UzvyuFCi7AmA+mY+Kiczh5sVtrJyFiFY3uW\n upUDe+XsmqfZlo1YElTfHDYtSGZDl7ErvtFJCPVgsQzkVPIQ57nc5NW8Qm9Byd2nUoXE\n j8IWvxTFazBYzsx0iH4Qxh51gP+wz+kEU+p8DfgIvGTV7uNL7SQwRm3xsKdKRCWeJtCK\n p/aQ==","X-Forwarded-Encrypted":"i=1;\n AJvYcCUiteR2SFG03sYRcGnBIsQNMYFsmgzuQ7eKYOMCEynpqieS7b1H6Phus1D0IwTHsqjUsGmVsGJB5Ayw@sourceware.org","X-Gm-Message-State":"AOJu0Yw4KsEnw4E2znCyjBa58OE2nWjgPKkHeTExJfnpa5xu466qFLtW\n RR4648MU33Dix7LIdrp8/w+RtFuWV5HC1iSIP5Ig+36YQP4ruVL18SUbZ9wkPU0AJuE1mFWo5JP\n wX3FUNyMhelFdf9lE2+iuKSAJK7E9TRI3l4ldRUpaPZ2eRCmPDHHlEeqJn0ntqcq7lWUZ2A==","X-Gm-Gg":"AeBDies6FYBfz8xxiNAFp0D6sUmWKXI7tDqxMZ3KTx/oW0IygwJzlWaOKnfm/etV2gY\n SsUxkefU6DCnd19i4lDNmK2zKNh3b7gT3u1OORnMdp+ucEIC5JfXrh4YZXQCLtpwY1MQCYwzV43\n /d8UxGvWfl3Cr0ErUHzcOv0vMRY+nptu9tcQ/ipw9D9REvyG2bCHkw+H1EMbeOxg4FSTq4IU9IC\n 9CpZhRJrDGKWZ1MGxN8Pb4IToc86QebD0kCGxCNyf8yutVXpZrxGNml7HiBLgMFbPCOkNg+YfdN\n nxka0w1ecoRv+6KPEbMN81cEnqiqrK08uphIT0DE2u63HBQHo/uKA1UP7hn7x7di7YiL+rR1xMr\n mXmArObIOWCMcgJ4LOB4i0L0MNAaRS8p44ezyMZY2iBEXj1H6FUrPLw4rmhLpXot3OYKeTGEB4J\n fsd3y0CwmyAaib7AqzNiXQyNbh","X-Received":["by 2002:a0c:f09b:0:b0:8a3:1a24:8e95 with SMTP id\n 6a1803df08f44-8ac8625b232mr7605056d6.27.1775771280526;\n Thu, 09 Apr 2026 14:48:00 -0700 (PDT)","by 2002:a0c:f09b:0:b0:8a3:1a24:8e95 with SMTP id\n 6a1803df08f44-8ac8625b232mr7604596d6.27.1775771279848;\n Thu, 09 Apr 2026 14:47:59 -0700 (PDT)"],"Message-ID":"<f5331e7c-aece-4bc7-88c8-a9c221843485@redhat.com>","Date":"Thu, 9 Apr 2026 17:47:58 -0400","MIME-Version":"1.0","User-Agent":"Mozilla Thunderbird","Subject":"Re: [PATCH v3] Use pending character state in IBM1390, IBM1399\n character sets (CVE-2026-4046)","To":"Florian Weimer <fweimer@redhat.com>, libc-alpha@sourceware.org","References":"<lhuwlygl3g6.fsf@oldenburg.str.redhat.com>","From":"Carlos O'Donell <codonell@redhat.com>","In-Reply-To":"<lhuwlygl3g6.fsf@oldenburg.str.redhat.com>","X-Mimecast-Spam-Score":"0","X-Mimecast-MFC-PROC-ID":"KDNJYgclZeTVuy7Bs294BTxEpI-dlsKetcydoH2d-pA_1775771281","X-Mimecast-Originator":"redhat.com","Content-Language":"en-US","Content-Type":"text/plain; charset=UTF-8; format=flowed","Content-Transfer-Encoding":"7bit","X-BeenThere":"libc-alpha@sourceware.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Libc-alpha mailing list <libc-alpha.sourceware.org>","List-Unsubscribe":"<https://sourceware.org/mailman/options/libc-alpha>,\n <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe>","List-Archive":"<https://sourceware.org/pipermail/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-request@sourceware.org?subject=help>","List-Subscribe":"<https://sourceware.org/mailman/listinfo/libc-alpha>,\n <mailto:libc-alpha-request@sourceware.org?subject=subscribe>","Errors-To":"libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org"}},{"id":3676032,"web_url":"http://patchwork.ozlabs.org/comment/3676032/","msgid":"<lhujyued1xx.fsf@oldenburg.str.redhat.com>","list_archive_url":null,"date":"2026-04-10T19:56:26","subject":"Re: [PATCH v3] Use pending character state in IBM1390, IBM1399\n character sets (CVE-2026-4046)","submitter":{"id":14312,"url":"http://patchwork.ozlabs.org/api/people/14312/","name":"Florian Weimer","email":"fweimer@redhat.com"},"content":"* Carlos O'Donell:\n\n> On 4/9/26 8:32 AM, Florian Weimer wrote:\n>> Follow the example in iso-2022-jp-3.c and use the __count state\n>> variable to store the pending character.  This avoids restarting\n>> the conversion if the output buffer ends between two 4-byte UCS-4\n>> code points, so that the assert reported in the bug can no longer\n>> happen.\n>\n> Looking forward to a v4.\n>\n>> Even though the fix is applied to ibm1364.c, the change is only\n>> effective for the two HAS_COMBINED codecs for IBM1390, IBM1399.\n>> The test case was mostly auto-generated using\n>> claude-4.6-opus-high-thinking, and composer-2-fast shows up in the\n>> log as well.  During review, gpt-5.4-xhigh flagged that the original\n>> version of the test case was not exercising the new character\n>> flush logic.\n>\n> Please add the following tag to your commit message?\n> ~~~\n> Assisted-by: Claude:claude-opus-4-6\n> ~~~\n> This follows Linux kernel convention and acts as due diligence in our\n> record keeping that this contribution is a mix of human and machine\n> generated content which is still copyrightable.\n\nI feel more comfortable documenting this in unstructured text, telling\nwhat I see in the Anysphere dashboard as the models used.  This avoids\nthe need to define common names for model names.  The names are likely\nto be imprecise anyway.\n\n> What happens when we don't make forward progress?\n>\n> Do we expect the test to loop forever and timeout or should we just\n> check produced == 0?\n>\n> Noted by claude-opus-4.6.\n\nThe test timeout would catch it.\n\nThe combining character tables look like this (IBM1390 followed by IBM 1399):\n\n  [0xecb5 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x304b, .res2 = 0x309a },\n  [0xecb6 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x304d, .res2 = 0x309a },\n  [0xecb7 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x304f, .res2 = 0x309a },\n  [0xecb8 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x3051, .res2 = 0x309a },\n  [0xecb9 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x3053, .res2 = 0x309a },\n  [0xecba - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30ab, .res2 = 0x309a },\n  [0xecbb - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30ad, .res2 = 0x309a },\n  [0xecbc - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30af, .res2 = 0x309a },\n  [0xecbd - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30b1, .res2 = 0x309a },\n  [0xecbe - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30b3, .res2 = 0x309a },\n  [0xecbf - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30bb, .res2 = 0x309a },\n  [0xecc0 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30c4, .res2 = 0x309a },\n  [0xecc1 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30c8, .res2 = 0x309a },\n  [0xecc2 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x31f7, .res2 = 0x309a },\n  [0xecc3 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x00e6, .res2 = 0x0300 },\n  [0xecc4 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0254, .res2 = 0x0300 },\n  [0xecc5 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0254, .res2 = 0x0301 },\n  [0xecc6 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x028c, .res2 = 0x0300 },\n  [0xecc7 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x028c, .res2 = 0x0301 },\n  [0xecc8 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0259, .res2 = 0x0300 },\n  [0xecc9 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0259, .res2 = 0x0301 },\n  [0xecca - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x025a, .res2 = 0x0300 },\n  [0xeccb - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x025a, .res2 = 0x0301 },\n  [0xeccc - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x02e9, .res2 = 0x02e5 },\n  [0xeccd - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x02e5, .res2 = 0x02e9 }\n\n  [0xecb5 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x304b, .res2 = 0x309a },\n  [0xecb6 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x304d, .res2 = 0x309a },\n  [0xecb7 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x304f, .res2 = 0x309a },\n  [0xecb8 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x3051, .res2 = 0x309a },\n  [0xecb9 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x3053, .res2 = 0x309a },\n  [0xecba - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30ab, .res2 = 0x309a },\n  [0xecbb - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30ad, .res2 = 0x309a },\n  [0xecbc - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30af, .res2 = 0x309a },\n  [0xecbd - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30b1, .res2 = 0x309a },\n  [0xecbe - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30b3, .res2 = 0x309a },\n  [0xecbf - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30bb, .res2 = 0x309a },\n  [0xecc0 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30c4, .res2 = 0x309a },\n  [0xecc1 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x30c8, .res2 = 0x309a },\n  [0xecc2 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x31f7, .res2 = 0x309a },\n  [0xecc3 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x00e6, .res2 = 0x0300 },\n  [0xecc4 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0254, .res2 = 0x0300 },\n  [0xecc5 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0254, .res2 = 0x0301 },\n  [0xecc6 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x028c, .res2 = 0x0300 },\n  [0xecc7 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x028c, .res2 = 0x0301 },\n  [0xecc8 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0259, .res2 = 0x0300 },\n  [0xecc9 - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x0259, .res2 = 0x0301 },\n  [0xecca - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x025a, .res2 = 0x0300 },\n  [0xeccb - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x025a, .res2 = 0x0301 },\n  [0xeccc - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x02e9, .res2 = 0x02e5 },\n  [0xeccd - __TO_UCS4_COMBINED_MIN] = { .res1 = 0x02e5, .res2 = 0x02e9 }\n\nOnly U+309A needs three bytes in UTF-8, the other combining characters\nhave a two-byte representation.  However, U+309A only follows a\ncharacter that has a three-byte representation.  A two-byte buffer\ncannot store that first character, so U+309A never becomes pending.\n\nI'm going to remove the loop from the test case and test this behavior\nmore directly.\n\nThanks,\nFlorian","headers":{"Return-Path":"<libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org>","X-Original-To":["incoming@patchwork.ozlabs.org","libc-alpha@sourceware.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","libc-alpha@sourceware.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=H5NVsINW;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256\n header.s=mimecast20190719 header.b=H5NVsINW","sourceware.org; dmarc=pass (p=quarantine dis=none)\n header.from=redhat.com","sourceware.org; spf=pass smtp.mailfrom=redhat.com","server2.sourceware.org;\n arc=none smtp.remote-ip=170.10.133.124"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4fsnd373b5z1yCv\n\tfor <incoming@patchwork.ozlabs.org>; Sat, 11 Apr 2026 05:56:55 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 1F01B4BA2E1E\n\tfor <incoming@patchwork.ozlabs.org>; Fri, 10 Apr 2026 19:56:54 +0000 (GMT)","from us-smtp-delivery-124.mimecast.com\n (us-smtp-delivery-124.mimecast.com [170.10.133.124])\n by sourceware.org (Postfix) with ESMTP id DA8224BA2E10\n for <libc-alpha@sourceware.org>; Fri, 10 Apr 2026 19:56:33 +0000 (GMT)","from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com\n (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by\n relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,\n cipher=TLS_AES_256_GCM_SHA384) id us-mta-28-76mLKgscMx2o-viDlyYjkQ-1; Fri,\n 10 Apr 2026 15:56:32 -0400","from mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com\n (mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.111])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest\n SHA256)\n (No client certificate requested)\n by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS\n id 7823219560B4\n for <libc-alpha@sourceware.org>; Fri, 10 Apr 2026 19:56:30 +0000 (UTC)","from fweimer-oldenburg.csb.redhat.com (unknown [10.2.16.5])\n by mx-prod-int-08.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with\n ESMTPS\n id 8E0301800B7F; Fri, 10 Apr 2026 19:56:28 +0000 (UTC)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 1F01B4BA2E1E","OpenDKIM Filter v2.11.0 sourceware.org DA8224BA2E10"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org DA8224BA2E10","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org DA8224BA2E10","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1775850994; cv=none;\n b=kxHl+C8KOLtgx6HsFLBrOdf3NU0812kNl7rBsOqyFJSreiUrDsMTNRZO0qt4niSpWzbzXiGW/AKLKX+2MoDVHJoKsq51wa5b+KExKZ5Tr6qajMw57tSwQJery1u0UQwI1LR314pjZEio1nD8S9kybSWVpY2b0/boBBEYGs5rBBM=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1775850994; c=relaxed/simple;\n bh=aiNJaIPAnhpL9cHQi5rkYCVldbslsQY3wULvHtW9S3g=;\n h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version;\n b=B4478SVvTXcmfWuwXqYAALLNDlKpQUh9bH02jRQAomT8Bivnaw2v8nDE9yGJwL+QGVABwLX9A7tpC1B3tt+8l43odDJanAwtCDGs2mjJ/+jKHBM+ZhtJMXPz69mEobiYQHQriS62eV9WhBDNadUlt9jYcjb/LiMm1m5tbE2r/+U=","ARC-Authentication-Results":"i=1; server2.sourceware.org","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;\n s=mimecast20190719; t=1775850993;\n h=from:from:reply-to:subject:subject:date:date:message-id:message-id:\n to:to:cc:cc:mime-version:mime-version:content-type:content-type:\n in-reply-to:in-reply-to:references:references;\n bh=d4Dh2g0ZeUhLAjAMD93AXk7XeFUvfaFvyzTyJKCAIoc=;\n b=H5NVsINWr1bmsNAVEuthaWTMD3Ke/z90lutAbvZPwuKpMRVK2Efd8ATq6bwDOaJ4rlWJdt\n /pBZ27FwQWomFpYNTHTCD/GR1csVxspgkFJjccsKOm0/U3yKgRELqQeAOOVJbqIw+kCjsh\n Z9M3cZvg7BN9LY5PGR2AAPskKrvHiVA=","X-MC-Unique":"76mLKgscMx2o-viDlyYjkQ-1","X-Mimecast-MFC-AGG-ID":"76mLKgscMx2o-viDlyYjkQ_1775850990","From":"Florian Weimer <fweimer@redhat.com>","To":"Carlos O'Donell <codonell@redhat.com>","Cc":"libc-alpha@sourceware.org","Subject":"Re: [PATCH v3] Use pending character state in IBM1390, IBM1399\n character sets (CVE-2026-4046)","In-Reply-To":"<f5331e7c-aece-4bc7-88c8-a9c221843485@redhat.com> (Carlos\n O'Donell's message of \"Thu, 9 Apr 2026 17:47:58 -0400\")","References":"<lhuwlygl3g6.fsf@oldenburg.str.redhat.com>\n <f5331e7c-aece-4bc7-88c8-a9c221843485@redhat.com>","Date":"Fri, 10 Apr 2026 21:56:26 +0200","Message-ID":"<lhujyued1xx.fsf@oldenburg.str.redhat.com>","User-Agent":"Gnus/5.13 (Gnus v5.13)","MIME-Version":"1.0","X-Scanned-By":"MIMEDefang 3.4.1 on 10.30.177.111","X-Mimecast-Spam-Score":"0","X-Mimecast-MFC-PROC-ID":"WsZbmkTwrereWR2Jve88NTOe-QTyE2_WKBuFyLknoS8_1775850990","X-Mimecast-Originator":"redhat.com","Content-Type":"text/plain","X-BeenThere":"libc-alpha@sourceware.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Libc-alpha mailing list <libc-alpha.sourceware.org>","List-Unsubscribe":"<https://sourceware.org/mailman/options/libc-alpha>,\n <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe>","List-Archive":"<https://sourceware.org/pipermail/libc-alpha/>","List-Post":"<mailto:libc-alpha@sourceware.org>","List-Help":"<mailto:libc-alpha-request@sourceware.org?subject=help>","List-Subscribe":"<https://sourceware.org/mailman/listinfo/libc-alpha>,\n <mailto:libc-alpha-request@sourceware.org?subject=subscribe>","Errors-To":"libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org"}}]