From patchwork Tue Nov 26 20:20:52 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 294396
Date: Tue, 26 Nov 2013 21:20:52 +0100
From: Jakub Jelinek
To: Jan Hubicka
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Fix i386 memcpy/memset expansion (PR target/59229)
Message-ID: <20131126202052.GM892@tucnak.redhat.com>
Reply-To: Jakub Jelinek

Hi!

As the testcase in the patch shows, if the exact memcpy or memset count
is unknown, but max_size is smaller than epilogue_size_needed,
ix86_expand_set_or_movmem can ICE.

The following patch fixes that.  Bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?

The resulting code doesn't look very good, though, as everything is
expanded as just the epilogue of the copying/memset.  I think the
probabilities on the branches expect that all bits of the remaining size
are 0 after the main loop (which isn't done in this case).

2013-11-26  Jakub Jelinek

	PR target/59229
	* config/i386/i386.c (decide_alg): Fix up formatting.
	(ix86_expand_set_or_movmem): Handle max_size < epilogue_size_needed
	similarly to count && count < epilogue_size_needed.  Fix up
	comment typo.
	* builtins.c (determine_block_size): Fix comment typo.

	* gcc.c-torture/execute/pr59229.c: New test.

	Jakub

--- gcc/config/i386/i386.c.jj	2013-11-25 18:30:18.000000000 +0100
+++ gcc/config/i386/i386.c	2013-11-26 11:27:38.116198901 +0100
@@ -23453,7 +23453,8 @@ decide_alg (HOST_WIDE_INT count, HOST_WI
   /* If expected size is not known but max size is small enough
      so inline version is a win, set expected size into
      the range.  */
-  if (max > 1 && (unsigned HOST_WIDE_INT)max >= max_size && expected_size == -1)
+  if (max > 1 && (unsigned HOST_WIDE_INT) max >= max_size
+      && expected_size == -1)
     expected_size = min_size / 2 + max_size / 2;
 
   /* If user specified the algorithm, honnor it if possible.  */
@@ -23752,7 +23753,7 @@ ix86_expand_set_or_movmem (rtx dst, rtx
   bool noalign;
   enum machine_mode move_mode = VOIDmode;
   int unroll_factor = 1;
-  /* TODO: Once vlaue ranges are available, fill in proper data.  */
+  /* TODO: Once value ranges are available, fill in proper data.  */
   unsigned HOST_WIDE_INT min_size = 0;
   unsigned HOST_WIDE_INT max_size = -1;
   unsigned HOST_WIDE_INT probable_max_size = -1;
@@ -23967,21 +23968,19 @@ ix86_expand_set_or_movmem (rtx dst, rtx
      loop variant.  */
   if (issetmem && epilogue_size_needed > 2 && !promoted_val)
     force_loopy_epilogue = true;
-  if (count)
+  if ((count && count < (unsigned HOST_WIDE_INT) epilogue_size_needed)
+      || max_size < (unsigned HOST_WIDE_INT) epilogue_size_needed)
     {
-      if (count < (unsigned HOST_WIDE_INT)epilogue_size_needed)
-        {
-          /* If main algorithm works on QImode, no epilogue is needed.
-             For small sizes just don't align anything.  */
-          if (size_needed == 1)
-            desired_align = align;
-          else
-            goto epilogue;
-        }
+      /* If main algorithm works on QImode, no epilogue is needed.
+         For small sizes just don't align anything.  */
+      if (size_needed == 1)
+        desired_align = align;
+      else
+        goto epilogue;
     }
-  else if (min_size < (unsigned HOST_WIDE_INT)epilogue_size_needed)
+  else if (!count
+           && min_size < (unsigned HOST_WIDE_INT) epilogue_size_needed)
     {
-      gcc_assert (max_size >= (unsigned HOST_WIDE_INT)epilogue_size_needed);
       label = gen_label_rtx ();
       emit_cmp_and_jump_insns (count_exp,
                                GEN_INT (epilogue_size_needed),
--- gcc/builtins.c.jj	2013-11-22 21:03:07.000000000 +0100
+++ gcc/builtins.c	2013-11-26 11:15:11.992044093 +0100
@@ -3146,7 +3146,7 @@ determine_block_size (tree len, rtx len_
         }
       else if (range_type == VR_ANTI_RANGE)
         {
-          /* Anti range 0...N lets us to determine minmal size to N+1.  */
+          /* Anti range 0...N lets us to determine minimal size to N+1.  */
           if (min.is_zero ())
             {
               if ((max + double_int_one).fits_uhwi ())
@@ -3156,7 +3156,7 @@ determine_block_size (tree len, rtx len_
 
              int n;
              if (n < 100)
-               memcpy (a,b, n)
+               memcpy (a, b, n)
 
              Produce anti range allowing negative values of N.  We still
              can use the information and make a guess that N is not negative.
--- gcc/testsuite/gcc.c-torture/execute/pr59229.c.jj	2013-11-26 11:32:07.590806813 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr59229.c	2013-11-26 11:31:56.000000000 +0100
@@ -0,0 +1,29 @@
+int i;
+
+__attribute__((noinline, noclone)) void
+bar (char *p)
+{
+  if (i < 1 || i > 6)
+    __builtin_abort ();
+  if (__builtin_memcmp (p, "abcdefg", i + 1) != 0)
+    __builtin_abort ();
+  __builtin_memset (p, ' ', 7);
+}
+
+__attribute__((noinline, noclone)) void
+foo (char *p, unsigned long l)
+{
+  if (l < 1 || l > 6)
+    return;
+  char buf[7];
+  __builtin_memcpy (buf, p, l + 1);
+  bar (buf);
+}
+
+int
+main ()
+{
+  for (i = 0; i < 16; i++)
+    foo ("abcdefghijklmnop", i);
+  return 0;
+}
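
For readers who want to see the control-flow change in isolation, here is a
toy model of just the branch selection touched by the third i386.c hunk.  It
is a sketch, not GCC code: the enum, the function names old_choice and
new_choice, and check_model are all hypothetical, and the sample values
(a 2..7 byte size range, epilogue_size_needed of 8) are assumptions picked
to resemble the new testcase rather than numbers taken from a real
compilation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical outcome labels for this toy model; none of these names
   exist in GCC.  */
enum path
{
  PATH_FULL,           /* neither branch taken, normal expansion */
  PATH_EPILOGUE_ONLY,  /* models "goto epilogue" */
  PATH_NO_ALIGN,       /* models "desired_align = align" */
  PATH_RUNTIME_CHECK,  /* models the emitted compare-and-jump */
  PATH_ICE             /* models the failing gcc_assert */
};

/* Pre-patch selection.  count == 0 means the size is not a
   compile-time constant.  */
static enum path
old_choice (uint64_t count, uint64_t min_size, uint64_t max_size,
            uint64_t epilogue_size_needed, int size_needed)
{
  if (count)
    {
      if (count < epilogue_size_needed)
        return size_needed == 1 ? PATH_NO_ALIGN : PATH_EPILOGUE_ONLY;
    }
  else if (min_size < epilogue_size_needed)
    {
      if (!(max_size >= epilogue_size_needed))
        return PATH_ICE;
      return PATH_RUNTIME_CHECK;
    }
  return PATH_FULL;
}

/* Post-patch selection: a small max_size is handled like a small
   known count, and the assert is gone.  */
static enum path
new_choice (uint64_t count, uint64_t min_size, uint64_t max_size,
            uint64_t epilogue_size_needed, int size_needed)
{
  if ((count && count < epilogue_size_needed)
      || max_size < epilogue_size_needed)
    return size_needed == 1 ? PATH_NO_ALIGN : PATH_EPILOGUE_ONLY;
  else if (!count && min_size < epilogue_size_needed)
    return PATH_RUNTIME_CHECK;
  return PATH_FULL;
}

/* Usage: unknown count, assumed size range 2..7, assumed
   epilogue_size_needed of 8.  */
static void
check_model (void)
{
  assert (old_choice (0, 2, 7, 8, 8) == PATH_ICE);
  assert (new_choice (0, 2, 7, 8, 8) == PATH_EPILOGUE_ONLY);
}
```

In this model the two versions agree whenever count is known or max_size is
at least epilogue_size_needed; they differ only in the unknown-count,
small-max_size case that the PR is about.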