From patchwork Tue May 17 06:57:13 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Liebler X-Patchwork-Id: 622879 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3r87SL4kFcz9t4P for ; Tue, 17 May 2016 16:57:38 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=sourceware.org header.i=@sourceware.org header.b=B04TP/bC; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:to:from:subject:date:message-id:mime-version :content-type; q=dns; s=default; b=kW90+Geotc0CiKJcw8a74S5E/LdjC OpQRBS96kpnTSIqojJnvq7Rn5c4iDGURBrDfKUK+4OCAKWidxmQC+5Z/Ro/84t6c l7nQyy3n5KNeA2dV8n0xPjiVMWBrAiZQYVSFpaOh515ZlThtqYjSDR5E5PoEGZtg cJ/zT+5r2slwyY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id :list-unsubscribe:list-subscribe:list-archive:list-post :list-help:sender:to:from:subject:date:message-id:mime-version :content-type; s=default; bh=4Dv83NW4SAW6IFLoIHDK/H01TME=; b=B04 TP/bCRuyfNDqFyudy7MWCylmzBH+CHc1B0un42FIyyq81xqRHA/hgb0D59K15y27 St6K1vqKTyC2xKGFPe5prtN2bNRFDO9gB1VLaayS0Rc5RVli4+N/XYk9ufiHTX0n 4wDZseRROIeZVtvrm9/QTy8qWK2RgcdkRPpxh+90= Received: (qmail 102255 invoked by alias); 17 May 2016 06:57:32 -0000 Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Subscribe: List-Archive: List-Post: List-Help: , Sender: libc-alpha-owner@sourceware.org Delivered-To: mailing list libc-alpha@sourceware.org Received: (qmail 102205 invoked by uid 89); 17 May 2016 06:57:31 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-3.6 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy=D*linux.vnet.ibm.com, D*vnet.ibm.com, cancellation, 33322 X-HELO: plane.gmane.org To: libc-alpha@sourceware.org From: Stefan Liebler Subject: [PATCH] Fix tst-cancel17/tst-cancelx17, which sometimes segfaults while exiting. Date: Tue, 17 May 2016 08:57:13 +0200 Lines: 115 Message-ID: Mime-Version: 1.0 X-Mozilla-News-Host: news://news.gmane.org:119 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.0 Hi, The testcase tst-cancel[x]17 ends sometimes with a segmentation fault. This happens in one of 10000 cases. Then the real testcase has already exited with success and returned from do_test(). The segmentation fault occurs after returning from main in _dl_fini(). In those cases, the aio_read(&a) was not canceled, because the read request was already in progress. In the meanwhile aio_write(ap) wrote something to the pipe and the read request is able to read the requested byte. The read request hasn't finished before returning from do_test(). After it finishes, it writes the return value and error code from the read syscall to the struct aiocb a, which lays on the stack of do_test. The stack of the subsequent function call of _dl_fini or _dl_sort_fini, which is inlined in _dl_fini, is corrupted. In case of S390, it reads a zero and decrements it by 1: unsigned int k = nmaps - 1; struct link_map **runp = maps[k]->l_initfini; The subsequent load from unmapped memory leads to the segmentation fault. The stack corruption also happens on other architectures. I saw them e.g. on x86 and ppc, too. This patch adds an aio_suspend call to ensure, that the read request is finished before returning from do_test(). Okay to commit? Bye Stefan ChangeLog: * nptl/tst-cancel17.c (do_test): Wait for finishing aio_read(&a). commit fd27163dc65df76132946c3eca29dfb82fea2fcc Author: Stefan Liebler Date: Fri May 13 15:10:30 2016 -0400 Fix tst-cancel17/tst-cancelx17, which sometimes segfaults while exiting. The testcase tst-cancel[x]17 ends sometimes with a segmentation fault. This happens in one of 10000 cases. Then the real testcase has already exited with success and returned from do_test(). The segmentation fault occurs after returning from main in _dl_fini(). In those cases, the aio_read(&a) was not canceled, because the read request was already in progress. In the meanwhile aio_write(ap) wrote something to the pipe and the read request is able to read the requested byte. The read request hasn't finished before returning from do_test(). After it finishes, it writes the return value and error code from the read syscall to the struct aiocb a, which lays on the stack of do_test. The stack of the subsequent function call of _dl_fini or _dl_sort_fini, which is inlined in _dl_fini is corrupted. In case of S390, it reads a zero and decrements it by 1: unsigned int k = nmaps - 1; struct link_map **runp = maps[k]->l_initfini; The load from unmapped memory leads to the segmentation fault. The stack corruption also happens on other architectures. I saw them e.g. on x86 and ppc, too. This patch adds an aio_suspend call to ensure, that the read request is finished before returning from do_test(). ChangeLog: * nptl/tst-cancel17.c (do_test): Wait for finishing aio_read(&a). diff --git a/nptl/tst-cancel17.c b/nptl/tst-cancel17.c index fb89292..9ff4e27 100644 --- a/nptl/tst-cancel17.c +++ b/nptl/tst-cancel17.c @@ -333,6 +333,22 @@ do_test (void) puts ("early cancellation succeeded"); + if (ap == &a2) + { + /* The aio_read(&a) was not canceled, because the read request was + already in progress. In the meanwhile aio_write(ap) wrote something + to the pipe and the read request either has already been finished or + is able to read the requested byte. + Wait for the read request before returning from this function, because + the return value and error code from the read syscall will be written + to the struct aiocb a, which lays on the stack of this function. + Otherwise the stack from subsequent function calls - e.g. _dl_fini - + will be corrupted, which can lead to undefined behaviour like a + segmentation fault. */ + const struct aiocb *l[1] = { &a }; + TEMP_FAILURE_RETRY (aio_suspend(l, 1, NULL)); + } + return 0; }