From: David Miller
Date: Wed, 21 Sep 2011 20:28:57 -0400 (EDT)
Message-Id: <20110921.202857.949679174400106081.davem@davemloft.net>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] Add pixel compare VIS intrinsics.
X-Patchwork-Id: 115873

Unfortunately, we can't use these instructions to implement vector
comparisons in the form the compiler expects. Instead of producing a
vector of comparison results, they produce a small bitmask of those
results in an integer destination register. Therefore we have to
provide them using unspecs and builtins.
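To make the "bitmask instead of vector" point concrete, here is a minimal
portable C sketch of what a 4 x 16-bit greater-than pixel compare returns.
It is a model only, not the builtin itself: the name model_fcmpgt16 is
hypothetical, and the lane-to-bit mapping (lane i sets bit i) is an
assumption made for illustration that may not match the hardware ordering.

```c
#include <stdint.h>

/* Hypothetical model of a 4 x 16-bit signed "greater than" compare.
   The result is a 4-bit mask in an int, not a vector of per-lane
   results.  Assumption: lane i maps to bit i of the mask.  */
static int
model_fcmpgt16 (const int16_t a[4], const int16_t b[4])
{
  int mask = 0;

  for (int i = 0; i < 4; i++)
    if (a[i] > b[i])
      mask |= 1 << i;	/* One result bit per lane.  */

  return mask;
}
```

With a = { 10, -5, 300, 7 } and b = { 3, 0, 300, -7 }, only lanes 0 and 3
compare greater, so the model returns 0x9.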

The bitmasks produced by these instructions are in the same form as
those produced by the 'edge' instructions. They are meant to be used
alongside the floating-point partial-store instructions. In such
partial stores, the address is given as rs1 and the bitmask is given
as rs2. The bitmask controls the byte enables of the store. This
allows one to do image processing, storing pixels into the destination
based upon some set of conditions.

Committed to trunk.

gcc/

	* config/sparc/sparc.md (UNSPEC_FCMPLE, UNSPEC_FCMPNE, UNSPEC_FCMPGT,
	UNSPEC_FCMPEQ): New unspec codes.
	(fcmple16_vis, fcmple32_vis, fcmpne16_vis, fcmpne32_vis, fcmpgt16_vis,
	fcmpgt32_vis, fcmpeq16_vis, fcmpeq32_vis): New patterns.
	* config/sparc/sparc.c (sparc_vis_init_builtins): Create builtins for
	new pixel compare VIS patterns.
	* config/sparc/visintrin.h (__vis_fcmple16, __vis_fcmple32,
	__vis_fcmpne16, __vis_fcmpne32, __vis_fcmpgt16, __vis_fcmpgt32,
	__vis_fcmpeq16, __vis_fcmpeq32): New.
	* doc/extend.texi: Document new pixel compare VIS intrinsics.
---
 gcc/ChangeLog                |   11 +++++
 gcc/config/sparc/sparc.c     |   21 ++++++++++
 gcc/config/sparc/sparc.md    |   85 ++++++++++++++++++++++++++++++++++++++++++
 gcc/config/sparc/visintrin.h |   56 +++++++++++++++++++++++++++
 gcc/doc/extend.texi          |    9 ++++
 5 files changed, 182 insertions(+), 0 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3f95622..44e9513 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -76,6 +76,17 @@
 	(sparc_vis_init_builtins): Use def_builtin_const for all VIS builtins
 	other than alignaddr and falignaddr.
 
+	* config/sparc/sparc.md (UNSPEC_FCMPLE, UNSPEC_FCMPNE, UNSPEC_FCMPGT,
+	UNSPEC_FCMPEQ): New unspec codes.
+	(fcmple16_vis, fcmple32_vis, fcmpne16_vis, fcmpne32_vis, fcmpgt16_vis,
+	fcmpgt32_vis, fcmpeq16_vis, fcmpeq32_vis): New patterns.
+	* config/sparc/sparc.c (sparc_vis_init_builtins): Create builtins for
+	new pixel compare VIS patterns.
+	* config/sparc/visintrin.h (__vis_fcmple16, __vis_fcmple32,
+	__vis_fcmpne16, __vis_fcmpne32, __vis_fcmpgt16, __vis_fcmpgt32,
+	__vis_fcmpeq16, __vis_fcmpeq32): New.
+	* doc/extend.texi: Document new pixel compare VIS intrinsics.
+
 2011-09-21  Tom de Vries
 
 	* final.c (final): Handle if JUMP_LABEL is not LABEL_P.
diff --git a/gcc/config/sparc/sparc.c b/gcc/config/sparc/sparc.c
index 7533307..a4917da 100644
--- a/gcc/config/sparc/sparc.c
+++ b/gcc/config/sparc/sparc.c
@@ -9164,6 +9164,10 @@ sparc_vis_init_builtins (void)
   tree si_ftype_ptr_ptr = build_function_type_list (intSI_type_node,
                                                     ptr_type_node,
                                                     ptr_type_node, 0);
+  tree si_ftype_v4hi_v4hi = build_function_type_list (intSI_type_node,
+                                                      v4hi, v4hi, 0);
+  tree si_ftype_v2si_v2si = build_function_type_list (intSI_type_node,
+                                                      v2si, v2si, 0);
 
   /* Packing and expanding vectors.  */
   def_builtin_const ("__builtin_vis_fpack16", CODE_FOR_fpack16_vis,
@@ -9252,6 +9256,23 @@ sparc_vis_init_builtins (void)
       def_builtin_const ("__builtin_vis_edge32l", CODE_FOR_edge32lsi_vis,
                          si_ftype_ptr_ptr);
     }
+
+  def_builtin_const ("__builtin_vis_fcmple16", CODE_FOR_fcmple16_vis,
+                     si_ftype_v4hi_v4hi);
+  def_builtin_const ("__builtin_vis_fcmple32", CODE_FOR_fcmple32_vis,
+                     si_ftype_v2si_v2si);
+  def_builtin_const ("__builtin_vis_fcmpne16", CODE_FOR_fcmpne16_vis,
+                     si_ftype_v4hi_v4hi);
+  def_builtin_const ("__builtin_vis_fcmpne32", CODE_FOR_fcmpne32_vis,
+                     si_ftype_v2si_v2si);
+  def_builtin_const ("__builtin_vis_fcmpgt16", CODE_FOR_fcmpgt16_vis,
+                     si_ftype_v4hi_v4hi);
+  def_builtin_const ("__builtin_vis_fcmpgt32", CODE_FOR_fcmpgt32_vis,
+                     si_ftype_v2si_v2si);
+  def_builtin_const ("__builtin_vis_fcmpeq16", CODE_FOR_fcmpeq16_vis,
+                     si_ftype_v4hi_v4hi);
+  def_builtin_const ("__builtin_vis_fcmpeq32", CODE_FOR_fcmpeq32_vis,
+                     si_ftype_v2si_v2si);
 }
 
 /* Handle TARGET_EXPAND_BUILTIN target hook.
diff --git a/gcc/config/sparc/sparc.md b/gcc/config/sparc/sparc.md
index 1fb59cc..812ae7b 100644
--- a/gcc/config/sparc/sparc.md
+++ b/gcc/config/sparc/sparc.md
@@ -70,6 +70,11 @@
    (UNSPEC_SP_SET 60)
    (UNSPEC_SP_TEST 61)
+
+   (UNSPEC_FCMPLE 70)
+   (UNSPEC_FCMPNE 71)
+   (UNSPEC_FCMPGT 72)
+   (UNSPEC_FCMPEQ 73)
   ])
 
 (define_constants
@@ -7886,4 +7891,84 @@
   "edge32l\t%r1, %r2, %0"
   [(set_attr "type" "edge")])
 
+(define_insn "fcmple16_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V4HI 1 "register_operand" "e")
+                    (match_operand:V4HI 2 "register_operand" "e")]
+         UNSPEC_FCMPLE))]
+  "TARGET_VIS"
+  "fcmple16\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmple32_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V2SI 1 "register_operand" "e")
+                    (match_operand:V2SI 2 "register_operand" "e")]
+         UNSPEC_FCMPLE))]
+  "TARGET_VIS"
+  "fcmple32\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmpne16_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V4HI 1 "register_operand" "e")
+                    (match_operand:V4HI 2 "register_operand" "e")]
+         UNSPEC_FCMPNE))]
+  "TARGET_VIS"
+  "fcmpne16\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmpne32_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V2SI 1 "register_operand" "e")
+                    (match_operand:V2SI 2 "register_operand" "e")]
+         UNSPEC_FCMPNE))]
+  "TARGET_VIS"
+  "fcmpne32\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmpgt16_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V4HI 1 "register_operand" "e")
+                    (match_operand:V4HI 2 "register_operand" "e")]
+         UNSPEC_FCMPGT))]
+  "TARGET_VIS"
+  "fcmpgt16\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmpgt32_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V2SI 1 "register_operand" "e")
+                    (match_operand:V2SI 2 "register_operand" "e")]
+         UNSPEC_FCMPGT))]
+  "TARGET_VIS"
+  "fcmpgt32\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmpeq16_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V4HI 1 "register_operand" "e")
+                    (match_operand:V4HI 2 "register_operand" "e")]
+         UNSPEC_FCMPEQ))]
+  "TARGET_VIS"
+  "fcmpeq16\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
+(define_insn "fcmpeq32_vis"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+        (unspec:SI [(match_operand:V2SI 1 "register_operand" "e")
+                    (match_operand:V2SI 2 "register_operand" "e")]
+         UNSPEC_FCMPEQ))]
+  "TARGET_VIS"
+  "fcmpeq32\t%1, %2, %0"
+  [(set_attr "type" "fpmul")
+   (set_attr "fptype" "double")])
+
 (include "sync.md")
diff --git a/gcc/config/sparc/visintrin.h b/gcc/config/sparc/visintrin.h
index 1b31451..4c2fa18 100644
--- a/gcc/config/sparc/visintrin.h
+++ b/gcc/config/sparc/visintrin.h
@@ -206,4 +206,60 @@ __vis_edge32l (void *__A, void *__B)
   return __builtin_vis_edge32l (__A, __B);
 }
 
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmple16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fcmple16 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmple32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fcmple32 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmpne16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fcmpne16 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmpne32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fcmpne32 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmpgt16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fcmpgt16 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmpgt32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fcmpgt32 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmpeq16 (__v4hi __A, __v4hi __B)
+{
+  return __builtin_vis_fcmpeq16 (__A, __B);
+}
+
+extern __inline int
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__vis_fcmpeq32 (__v2si __A, __v2si __B)
+{
+  return __builtin_vis_fcmpeq32 (__A, __B);
+}
+
 #endif /* _VISINTRIN_H_INCLUDED */
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 3e6e05e..1f54ef1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -12965,6 +12965,15 @@ int __builtin_vis_edge16 (void *, void *);
 int __builtin_vis_edge16l (void *, void *);
 int __builtin_vis_edge32 (void *, void *);
 int __builtin_vis_edge32l (void *, void *);
+
+int __builtin_vis_fcmple16 (v4hi, v4hi);
+int __builtin_vis_fcmple32 (v2si, v2si);
+int __builtin_vis_fcmpne16 (v4hi, v4hi);
+int __builtin_vis_fcmpne32 (v2si, v2si);
+int __builtin_vis_fcmpgt16 (v4hi, v4hi);
+int __builtin_vis_fcmpgt32 (v2si, v2si);
+int __builtin_vis_fcmpeq16 (v4hi, v4hi);
+int __builtin_vis_fcmpeq32 (v2si, v2si);
 @end smallexample
 
 @node SPU Built-in Functions
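
As a note on the intended use described in the commit message (a compare
bitmask driving the byte enables of a partial store), the flow can be
sketched in portable C roughly as follows. This is only a model under
stated assumptions: model_store_if_gt16 is a hypothetical name, it works
on plain arrays rather than VIS registers, and the lane-to-bit mapping
(lane i maps to mask bit i) is chosen for illustration and may differ
from the hardware.

```c
#include <stdint.h>

/* Hypothetical model: write src[i] into dst[i] only where
   src[i] > limit[i].  The mask is built first and then used as a set
   of per-lane write enables, mirroring a compare followed by a
   partial store.  Returns the mask.
   Assumption: lane i maps to mask bit i.  */
static int
model_store_if_gt16 (int16_t dst[4], const int16_t src[4],
                     const int16_t limit[4])
{
  int mask = 0;

  /* Compare step: one result bit per 16-bit lane.  */
  for (int i = 0; i < 4; i++)
    if (src[i] > limit[i])
      mask |= 1 << i;

  /* Store step: lanes whose bit is clear keep their old contents.  */
  for (int i = 0; i < 4; i++)
    if (mask & (1 << i))
      dst[i] = src[i];

  return mask;
}
```

With dst zeroed, src = { 5, 1, 9, 2 } and limit = { 3, 3, 3, 3 }, the
model returns 0x5 and leaves dst as { 5, 0, 9, 0 }: only the lanes that
passed the compare are overwritten.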