From patchwork Sat Nov 3 18:24:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sandra Loosemore
X-Patchwork-Id: 992677
To: "gcc-patches@gcc.gnu.org"
From: Sandra Loosemore
Subject: [nios2, committed] correct sidi3 costs
Message-ID: <038b9b5f-533c-6b4a-3622-95d2df04965b@codesourcery.com>
Date: Sat, 3 Nov 2018 12:24:05 -0600

PR target/87079 reported that with -Os, the nios2 back end was emitting
an inferior code sequence for widening multiply instead of using mulx.
I tracked this down to the rtx costs hook not recognizing the RTL
pattern for sidi3, so it overestimated the cost.

I've been aware for a while that the RTX costs computation in the nios2
back end is far from optimal or even correct :-P but giving it a
complete workover is a pretty big project, requiring benchmarking etc.
as well as unit tests.
I don't want the perfect to be the enemy of the good, so I've checked
in the attached patch to fix this issue and add the test case (both -Os
and -O2 variants).

-Sandra

Index: gcc/config/nios2/nios2.c
===================================================================
--- gcc/config/nios2/nios2.c (revision 265561)
+++ gcc/config/nios2/nios2.c (working copy)
@@ -1539,6 +1539,19 @@ nios2_rtx_costs (rtx x, machine_mode mod
               *total = COSTS_N_INSNS (2);  /* Latency adjustment.  */
             else
               *total = COSTS_N_INSNS (1);
+            if (TARGET_HAS_MULX && GET_MODE (x) == DImode)
+              {
+                enum rtx_code c0 = GET_CODE (XEXP (x, 0));
+                enum rtx_code c1 = GET_CODE (XEXP (x, 1));
+                if ((c0 == SIGN_EXTEND && c1 == SIGN_EXTEND)
+                    || (c0 == ZERO_EXTEND && c1 == ZERO_EXTEND))
+                  /* This is the sidi3 pattern, which expands into 4 insns,
+                     2 multiplies and 2 moves.  */
+                  {
+                    *total = *total * 2 + COSTS_N_INSNS (2);
+                    return true;
+                  }
+              }
             return false;
           }
 
Index: gcc/testsuite/gcc.target/nios2/pr87079-1.c
===================================================================
--- gcc/testsuite/gcc.target/nios2/pr87079-1.c (nonexistent)
+++ gcc/testsuite/gcc.target/nios2/pr87079-1.c (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -mhw-div -mhw-mul -mhw-mulx" } */
+
+#include
+#include
+
+void foo(const uint8_t* str, uint32_t* res)
+{
+  uint32_t rdVal0, rdVal1, rdVal2;
+  rdVal0 = rdVal1 = rdVal2 = 0;
+  unsigned c;
+  for (;;) {
+    c = *str++;
+    unsigned dig = c - '0';
+    if (dig > 9)
+      break; // non-digit
+    uint64_t x10;
+
+    x10 = (uint64_t)rdVal0*10 + dig;
+    rdVal0 = (uint32_t)x10;
+    dig = (uint32_t)(x10 >> 32);
+
+    x10 = (uint64_t)rdVal1*10 + dig;
+    rdVal1 = (uint32_t)x10;
+    dig = (uint32_t)(x10 >> 32);
+
+    rdVal2 = rdVal2*10 + dig;
+  }
+  res[0] = rdVal0;
+  res[1] = rdVal1;
+  res[2] = rdVal2;
+}
+
+/* { dg-final { scan-assembler-times "mulxuu\t" 2 } } */
Index: gcc/testsuite/gcc.target/nios2/pr87079-2.c
===================================================================
--- gcc/testsuite/gcc.target/nios2/pr87079-2.c (nonexistent)
+++ gcc/testsuite/gcc.target/nios2/pr87079-2.c (working copy)
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mhw-div -mhw-mul -mhw-mulx" } */
+
+#include
+#include
+
+void foo(const uint8_t* str, uint32_t* res)
+{
+  uint32_t rdVal0, rdVal1, rdVal2;
+  rdVal0 = rdVal1 = rdVal2 = 0;
+  unsigned c;
+  for (;;) {
+    c = *str++;
+    unsigned dig = c - '0';
+    if (dig > 9)
+      break; // non-digit
+    uint64_t x10;
+
+    x10 = (uint64_t)rdVal0*10 + dig;
+    rdVal0 = (uint32_t)x10;
+    dig = (uint32_t)(x10 >> 32);
+
+    x10 = (uint64_t)rdVal1*10 + dig;
+    rdVal1 = (uint32_t)x10;
+    dig = (uint32_t)(x10 >> 32);
+
+    rdVal2 = rdVal2*10 + dig;
+  }
+  res[0] = rdVal0;
+  res[1] = rdVal1;
+  res[2] = rdVal2;
+}
+
+/* { dg-final { scan-assembler-times "mulxuu\t" 2 } } */