From patchwork Tue Jun 7 17:47:52 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bill Seurer
X-Patchwork-Id: 631741
To: GCC Patches
Cc: David Edelsohn, Segher Boessenkool
From: Bill Seurer
Subject: [PATCH, rs6000] Add support for char, short, and int versions of vec_mul
Message-ID: <575708C8.4060705@linux.vnet.ibm.com>
Date: Tue, 7 Jun 2016 12:47:52 -0500
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.8.0

This patch adds support for the missing versions of the vec_mul altivec
builtins from the Power Architecture 64-Bit ELF V2 ABI OpenPOWER ABI for
Linux Supplement (16 July 2015 Version 1.1).  Many of the builtins in that
document are still missing, and this is part of a series of patches to add
them.

There aren't instructions for the {un}signed char, {un}signed short, and
{un}signed int versions of vec_mul, so the output code is built from other
built-ins and operations that do have instructions.

The new test case is an executable test which verifies that the generated
code produces the expected values.  C macros are used so that the same test
case can be used for all the supported types.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu and
powerpc64-unknown-linux-gnu with no regressions.  Is this ok for trunk?

[gcc]

2016-06-07  Bill Seurer

	* config/rs6000/altivec.h: Add __builtin_vec_mul.
	* config/rs6000/rs6000-builtin.def (vec_mul): Change vec_mul to a
	special case Altivec builtin.
	* config/rs6000/rs6000-c.c (altivec_overloaded_builtins): Remove
	VSX_BUILTIN_VEC_MUL (replaced with special case code).
	* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin): Add
	code for ALTIVEC_BUILTIN_VEC_MUL.
	* config/rs6000/rs6000.c (altivec_init_builtins): Add definition
	for __builtin_vec_mul.

[gcc/testsuite]

2016-06-07  Bill Seurer

	* gcc.target/powerpc/vec-mul.c: New test.
Index: gcc/config/rs6000/altivec.h
===================================================================
--- gcc/config/rs6000/altivec.h	(revision 237175)
+++ gcc/config/rs6000/altivec.h	(working copy)
@@ -229,6 +229,7 @@
 #define vec_mladd __builtin_vec_mladd
 #define vec_msum __builtin_vec_msum
 #define vec_msums __builtin_vec_msums
+#define vec_mul __builtin_vec_mul
 #define vec_mule __builtin_vec_mule
 #define vec_mulo __builtin_vec_mulo
 #define vec_nor __builtin_vec_nor
Index: gcc/config/rs6000/rs6000-builtin.def
===================================================================
--- gcc/config/rs6000/rs6000-builtin.def	(revision 237175)
+++ gcc/config/rs6000/rs6000-builtin.def	(working copy)
@@ -1300,6 +1300,7 @@ BU_ALTIVEC_OVERLOAD_X (LVRX,	   "lvrx")
 BU_ALTIVEC_OVERLOAD_X (LVRXL,	   "lvrxl")
 BU_ALTIVEC_OVERLOAD_X (LVSL,	   "lvsl")
 BU_ALTIVEC_OVERLOAD_X (LVSR,	   "lvsr")
+BU_ALTIVEC_OVERLOAD_X (MUL,	   "mul")
 BU_ALTIVEC_OVERLOAD_X (PROMOTE,	   "promote")
 BU_ALTIVEC_OVERLOAD_X (SLD,	   "sld")
 BU_ALTIVEC_OVERLOAD_X (SPLAT,	   "splat")
@@ -1600,7 +1601,6 @@ BU_VSX_OVERLOAD_3V (XXPERMDI,  "xxpermdi")
 BU_VSX_OVERLOAD_3V (XXSLDWI,   "xxsldwi")
 
 /* 2 argument VSX overloaded builtin functions.  */
-BU_VSX_OVERLOAD_2 (MUL,	     "mul")
 BU_VSX_OVERLOAD_2 (DIV,	     "div")
 BU_VSX_OVERLOAD_2 (XXMRGHW,  "xxmrghw")
 BU_VSX_OVERLOAD_2 (XXMRGLW,  "xxmrglw")
Index: gcc/config/rs6000/rs6000-c.c
===================================================================
--- gcc/config/rs6000/rs6000-c.c	(revision 237175)
+++ gcc/config/rs6000/rs6000-c.c	(working copy)
@@ -1941,14 +1941,6 @@ const struct altivec_builtin_types altivec_overloa
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_VMINUB, ALTIVEC_BUILTIN_VMINUB,
     RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_bool_V16QI, 0 },
-  { VSX_BUILTIN_VEC_MUL, VSX_BUILTIN_XVMULSP,
-    RS6000_BTI_V4SF, RS6000_BTI_V4SF, RS6000_BTI_V4SF, 0 },
-  { VSX_BUILTIN_VEC_MUL, VSX_BUILTIN_XVMULDP,
-    RS6000_BTI_V2DF, RS6000_BTI_V2DF, RS6000_BTI_V2DF, 0 },
-  { VSX_BUILTIN_VEC_MUL, VSX_BUILTIN_MUL_V2DI,
-    RS6000_BTI_V2DI, RS6000_BTI_V2DI, RS6000_BTI_V2DI, 0 },
-  { VSX_BUILTIN_VEC_MUL, VSX_BUILTIN_MUL_V2DI,
-    RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, RS6000_BTI_unsigned_V2DI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULEUB,
     RS6000_BTI_unsigned_V8HI, RS6000_BTI_unsigned_V16QI, RS6000_BTI_unsigned_V16QI, 0 },
   { ALTIVEC_BUILTIN_VEC_MULE, ALTIVEC_BUILTIN_VMULESB,
@@ -4683,7 +4675,58 @@ assignment for unaligned loads and stores");
     warning (OPT_Wdeprecated, "vec_lvsr is deprecated for little endian; use \
 assignment for unaligned loads and stores");
 
+  if (fcode == ALTIVEC_BUILTIN_VEC_MUL)
+    {
+      /* vec_mul needs to be special cased because there are no instructions
+	 for it for the {un}signed char, {un}signed short, and {un}signed int
+	 types.  */
+      if (nargs != 2)
+	{
+	  error ("vec_mul only accepts 2 arguments");
+	  return error_mark_node;
+	}
+      tree arg0 = (*arglist)[0];
+      tree arg0_type = TREE_TYPE (arg0);
+      tree arg1 = (*arglist)[1];
+      tree arg1_type = TREE_TYPE (arg1);
+
+      /* Both arguments must be vectors and the types must match.  */
+      if (arg0_type != arg1_type)
+	goto bad;
+      if (TREE_CODE (arg0_type) != VECTOR_TYPE)
+	goto bad;
+
+      switch (TYPE_MODE (TREE_TYPE (arg0_type)))
+	{
+	  case QImode:
+	  case HImode:
+	  case SImode:
+	  case DImode:
+	  case TImode:
+	    {
+	      /* For scalar types just use a multiply expression.  */
+	      return fold_build2_loc (loc, MULT_EXPR, TREE_TYPE (arg0),
+				      arg0, arg1);
+	    }
+	  case SFmode:
+	    {
+	      /* For floats use the xvmulsp instruction directly.  */
+	      tree call = rs6000_builtin_decls[VSX_BUILTIN_XVMULSP];
+	      return build_call_expr (call, 2, arg0, arg1);
+	    }
+	  case DFmode:
+	    {
+	      /* For doubles use the xvmuldp instruction directly.  */
+	      tree call = rs6000_builtin_decls[VSX_BUILTIN_XVMULDP];
+	      return build_call_expr (call, 2, arg0, arg1);
+	    }
+	  /* Other types are errors.  */
+	  default:
+	    goto bad;
+	}
+    }
+
   if (fcode == ALTIVEC_BUILTIN_VEC_CMPNE)
     {
       /* vec_cmpne needs to be special cased because there are no instructions
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 237175)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -16573,6 +16573,8 @@ altivec_init_builtins (void)
 	       ALTIVEC_BUILTIN_VEC_ADDEC);
   def_builtin ("__builtin_vec_cmpne", opaque_ftype_opaque_opaque,
 	       ALTIVEC_BUILTIN_VEC_CMPNE);
+  def_builtin ("__builtin_vec_mul", opaque_ftype_opaque_opaque,
+	       ALTIVEC_BUILTIN_VEC_MUL);
   /* Cell builtins.  */
   def_builtin ("__builtin_altivec_lvlx",  v16qi_ftype_long_pcvoid,
 	       ALTIVEC_BUILTIN_LVLX);
Index: gcc/testsuite/gcc.target/powerpc/vec-mul.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/vec-mul.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/vec-mul.c	(working copy)
@@ -0,0 +1,86 @@
+/* { dg-do run { target { powerpc64*-*-* } } } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-mvsx -O3" } */
+
+/* Test that the vec_mul builtin works as expected.  */
+
+#include "altivec.h"
+
+#define N 4096
+
+void abort ();
+
+#define define_test_functions(STYPE, NAMESUFFIX) \
+\
+STYPE result_##NAMESUFFIX[N]; \
+STYPE operand1_##NAMESUFFIX[N]; \
+STYPE operand2_##NAMESUFFIX[N]; \
+STYPE expected_##NAMESUFFIX[N]; \
+\
+__attribute__((noinline)) void vector_tests_##NAMESUFFIX () \
+{ \
+  int i; \
+  vector STYPE v1, v2, tmp; \
+  for (i = 0; i < N; i+=16/sizeof (STYPE)) \
+    { \
+      /* result=operand1*operand2.  */ \
+      v1 = vec_vsx_ld (0, &operand1_##NAMESUFFIX[i]); \
+      v2 = vec_vsx_ld (0, &operand2_##NAMESUFFIX[i]); \
+\
+      tmp = vec_mul (v1, v2); \
+      vec_vsx_st (tmp, 0, &result_##NAMESUFFIX[i]); \
+    } \
+} \
+\
+__attribute__((noinline)) void init_##NAMESUFFIX () \
+{ \
+  int i; \
+  for (i = 0; i < N; ++i) \
+    { \
+      result_##NAMESUFFIX[i] = 0; \
+      operand1_##NAMESUFFIX[i] = (i+1) % 31; \
+      operand2_##NAMESUFFIX[i] = (i*2) % 15; \
+      expected_##NAMESUFFIX[i] = operand1_##NAMESUFFIX[i] * \
+				 operand2_##NAMESUFFIX[i]; \
+    } \
+} \
+\
+__attribute__((noinline)) void verify_results_##NAMESUFFIX () \
+{ \
+  int i; \
+  for (i = 0; i < N; ++i) \
+    { \
+      if (result_##NAMESUFFIX[i] != expected_##NAMESUFFIX[i]) \
+	abort (); \
+    } \
+}
+
+
+#define execute_test_functions(STYPE, NAMESUFFIX) \
+{ \
+  init_##NAMESUFFIX (); \
+  vector_tests_##NAMESUFFIX (); \
+  verify_results_##NAMESUFFIX (); \
+}
+
+
+define_test_functions (signed int, si);
+define_test_functions (unsigned int, ui);
+define_test_functions (signed short, ss);
+define_test_functions (unsigned short, us);
+define_test_functions (signed char, sc);
+define_test_functions (unsigned char, uc);
+define_test_functions (float, f);
+
+int main ()
+{
+  execute_test_functions (signed int, si);
+  execute_test_functions (unsigned int, ui);
+  execute_test_functions (signed short, ss);
+  execute_test_functions (unsigned short, us);
+  execute_test_functions (signed char, sc);
+  execute_test_functions (unsigned char, uc);
+  execute_test_functions (float, f);
+
+  return 0;
+}