From patchwork Wed Jun 13 22:21:23 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Oleg Endo
X-Patchwork-Id: 164771
Message-ID: <1339626083.2198.22.camel@yam-132-YW-E178-FTW>
Subject: [SH] PR 53568 - Add support for bswap built-ins
From: Oleg Endo
To: gcc-patches
Date: Thu, 14 Jun 2012 00:21:23 +0200

Hello,

The attached patch improves code generated for byte swap expressions such as
((x & 0xFF) << 8) | ((x >> 8) & 0xFF).
It seems that currently the tree optimizers only detect bswap32 and
bswap64 but not bswap16 patterns.  The patch adds detection for bswap16
patterns by playing along with the combine pass.

Tested with
make -k -j8 check RUNTESTFLAGS="--target_board=sh-sim
\{-m2/-ml,-m2/-mb,-m2a/-mb,-m2a-single/-mb,-m4/-ml,
-m4/-mb,-m4-single/-ml,-m4-single/-mb,-m4a-single/-ml,
-m4a-single/-mb}"

and no new failures.

Test cases for this patch and the previous bswap32 patch will follow
shortly.

Cheers,
Oleg

ChangeLog:

    PR target/53568
    * config/sh/sh.md: Add peephole for swapbsi2.
    (*swapbisi2_and_shl8, *swapbhisi2): New insns and splits.

Index: gcc/config/sh/sh.md
===================================================================
--- gcc/config/sh/sh.md (revision 188525)
+++ gcc/config/sh/sh.md (working copy)
@@ -4561,6 +4561,81 @@
   "swap.b %1,%0"
   [(set_attr "type" "arith")])
 
+;; The *swapbisi2_and_shl8 pattern helps the combine pass simplifying
+;; partial byte swap expressions such as...
+;;   ((x & 0xFF) << 8) | ((x >> 8) & 0xFF).
+;; ...which are currently not handled by the tree optimizers.
+;; The combine pass will not initially try to combine the full expression,
+;; but only some sub-expressions.  In such a case the *swapbisi2_and_shl8
+;; pattern acts as an intermediate pattern that will eventually lead combine
+;; to the swapbsi2 pattern above.
+;; As a side effect this also improves code that does (x & 0xFF) << 8
+;; or (x << 8) & 0xFF00.
+(define_insn_and_split "*swapbisi2_and_shl8"
+  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
+        (ior:SI (and:SI (ashift:SI (match_operand:SI 1 "arith_reg_operand" "r")
+                                   (const_int 8))
+                        (const_int 65280))
+                (match_operand:SI 2 "arith_reg_operand" "r")))]
+  "TARGET_SH1 && ! reload_in_progress && ! reload_completed"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+{
+  rtx tmp0 = gen_reg_rtx (SImode);
+  rtx tmp1 = gen_reg_rtx (SImode);
+
+  emit_insn (gen_zero_extendqisi2 (tmp0, gen_lowpart (QImode, operands[1])));
+  emit_insn (gen_swapbsi2 (tmp1, tmp0));
+  emit_insn (gen_iorsi3 (operands[0], tmp1, operands[2]));
+  DONE;
+})
+
+;; The *swapbhisi2 pattern is, like the *swapbisi2_and_shl8 pattern, another
+;; intermediate pattern that will help the combine pass arriving at swapbsi2.
+(define_insn_and_split "*swapbhisi2"
+  [(set (match_operand:SI 0 "arith_reg_dest" "=r")
+        (ior:SI (and:SI (ashift:SI (match_operand:SI 1 "arith_reg_operand" "r")
+                                   (const_int 8))
+                        (const_int 65280))
+                (zero_extract:SI (match_dup 1) (const_int 8) (const_int 8))))]
+  "TARGET_SH1 && ! reload_in_progress && ! reload_completed"
+  "#"
+  "&& can_create_pseudo_p ()"
+  [(const_int 0)]
+{
+  rtx tmp = gen_reg_rtx (SImode);
+
+  emit_insn (gen_zero_extendhisi2 (tmp, gen_lowpart (HImode, operands[1])));
+  emit_insn (gen_swapbsi2 (operands[0], tmp));
+  DONE;
+})
+
+;; In some cases the swapbsi2 pattern might leave a sequence such as...
+;;   swap.b r4,r4
+;;   mov r4,r0
+;;
+;; which can be simplified to...
+;;   swap.b r4,r0
+(define_peephole2
+  [(set (match_operand:SI 0 "arith_reg_dest" "")
+        (ior:SI (and:SI (match_operand:SI 1 "arith_reg_operand" "")
+                        (const_int 4294901760))
+                (ior:SI (and:SI (ashift:SI (match_dup 1) (const_int 8))
+                                (const_int 65280))
+                        (and:SI (ashiftrt:SI (match_dup 1) (const_int 8))
+                                (const_int 255)))))
+   (set (match_operand:SI 2 "arith_reg_dest" "")
+        (match_dup 0))]
+  "TARGET_SH1 && peep2_reg_dead_p (2, operands[0])"
+  [(set (match_dup 2)
+        (ior:SI (and:SI (match_operand:SI 1 "arith_reg_operand" "")
+                        (const_int 4294901760))
+                (ior:SI (and:SI (ashift:SI (match_dup 1) (const_int 8))
+                                (const_int 65280))
+                        (and:SI (ashiftrt:SI (match_dup 1) (const_int 8))
+                                (const_int 255)))))])
+
 ;; -------------------------------------------------------------------------
 ;; Zero extension instructions
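
Editorial note, not part of the submitted patch: the two splits above rely on the fact that a zero extension followed by swap.b yields the same value as the source-level shift/mask expression. The following is a minimal C sketch of that equivalence. It assumes swap.b behaves as the RTL in the peephole describes (upper 16 bits kept, low two bytes exchanged). All function names here are hypothetical and do not appear in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed model of swap.b, transcribed from the RTL in the peephole:
   keep the upper 16 bits and exchange the two low bytes.  */
static inline uint32_t swap_b (uint32_t x)
{
  return (x & 0xFFFF0000u) | ((x << 8) & 0xFF00u) | ((x >> 8) & 0xFFu);
}

/* The 16-bit byte swap expression quoted in the patch description.  */
static inline uint32_t bswap16_expr (uint32_t x)
{
  return ((x & 0xFF) << 8) | ((x >> 8) & 0xFF);
}

/* Sequence emitted by the *swapbhisi2 split:
   zero-extend the low 16 bits, then swap.b.  */
static inline uint32_t split_swapbhisi2 (uint32_t x)
{
  return swap_b (x & 0xFFFFu);
}

/* Sequence emitted by the *swapbisi2_and_shl8 split for
   ((x << 8) & 0xFF00) | y: zero-extend the low byte, swap.b, then OR.  */
static inline uint32_t split_and_shl8 (uint32_t x, uint32_t y)
{
  return swap_b (x & 0xFFu) | y;
}

/* Hypothetical self-check: compare each split sequence against the
   expression it replaces for every value of the low 16 bits.  */
static inline void check_splits (void)
{
  for (uint32_t i = 0; i <= 0xFFFFu; i++)
    {
      uint32_t x = i | 0xABCD0000u;  /* arbitrary junk in the upper bits */
      assert (split_swapbhisi2 (x) == bswap16_expr (x));
      assert (split_and_shl8 (x, 0x12u) == (((x << 8) & 0xFF00u) | 0x12u));
    }
}
```

Calling check_splits () from any test driver exercises both equivalences. The sketch models values only; it says nothing about which sub-expressions the combine pass actually tries or in what order.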