From patchwork Wed Jul 1 13:35:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Kewen.Lin" X-Patchwork-Id: 1320471 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=gcc.gnu.org Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=btzTunuG; dkim-atps=neutral Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 49xj0S6sxBz9sTZ for ; Wed, 1 Jul 2020 23:35:35 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AFFAF3857031; Wed, 1 Jul 2020 13:35:31 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AFFAF3857031 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1593610531; bh=N+Mc4M68zYrZKh1HOOPfqZBqS0jChNVqGQaGGdZzaOg=; h=Subject:To:References:Date:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=btzTunuGjclLD5QLkB6QdDhEHse7r/Qn1aJKheqYJCQuurSlRsdcFse7NFpeOT4QQ N7v3+NfgvIpq7IiCQIjS1dkrUSb2vRFHHLfx5GH54RbbmFN+K1n3DxhEWmPprocIzc iNDr0WRum+/GZ1jrCjCCRfTmrfz+XY9gfwWQA1iI= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx0a-001b2d01.pphosted.com (mx0b-001b2d01.pphosted.com [148.163.158.5]) by sourceware.org (Postfix) with ESMTPS id 2F5B93857007 for ; Wed, 1 Jul 2020 13:35:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 2F5B93857007 Received: from pps.filterd (m0098414.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.42/8.16.0.42) with SMTP id 061DXRSs166246; Wed, 1 Jul 2020 09:35:25 -0400 Received: from pps.reinject (localhost [127.0.0.1]) by mx0b-001b2d01.pphosted.com with ESMTP id 320pw7hmmq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 01 Jul 2020 09:35:25 -0400 Received: from m0098414.ppops.net (m0098414.ppops.net [127.0.0.1]) by pps.reinject (8.16.0.36/8.16.0.36) with SMTP id 061DZOee172347; Wed, 1 Jul 2020 09:35:24 -0400 Received: from ppma04ams.nl.ibm.com (63.31.33a9.ip4.static.sl-reverse.com [169.51.49.99]) by mx0b-001b2d01.pphosted.com with ESMTP id 320pw7hmkw-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 01 Jul 2020 09:35:24 -0400 Received: from pps.filterd (ppma04ams.nl.ibm.com [127.0.0.1]) by ppma04ams.nl.ibm.com (8.16.0.42/8.16.0.42) with SMTP id 061DTjtS021008; Wed, 1 Jul 2020 13:35:23 GMT Received: from b06avi18878370.portsmouth.uk.ibm.com (b06avi18878370.portsmouth.uk.ibm.com [9.149.26.194]) by ppma04ams.nl.ibm.com with ESMTP id 31wwr8cybp-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 01 Jul 2020 13:35:22 +0000 Received: from d06av26.portsmouth.uk.ibm.com (d06av26.portsmouth.uk.ibm.com [9.149.105.62]) by b06avi18878370.portsmouth.uk.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 061DZK7L54198660 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 1 Jul 2020 13:35:20 GMT Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 91E8BAE057; Wed, 1 Jul 2020 13:35:20 +0000 (GMT) Received: from d06av26.portsmouth.uk.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 3D117AE053; Wed, 1 Jul 2020 13:35:17 +0000 (GMT) Received: from KewenLins-MacBook-Pro.local (unknown [9.200.59.231]) by d06av26.portsmouth.uk.ibm.com (Postfix) with ESMTP; Wed, 1 Jul 2020 13:35:16 +0000 (GMT) Subject: [PATCH 1/7 v8] ifn/optabs: Support vector load/store with length To: GCC Patches , richard.sandiford@arm.com References: <30906c0d-3b9f-e1e6-156f-c01fcf229cb9@linux.ibm.com> <93f9e4cf-351d-5f89-65b5-7dbc97ce13b9@linux.ibm.com> <8f64b58b-050c-2e1c-36c6-049a07eceee7@linux.ibm.com> <99427c42-113d-7fbc-5617-f64c2317250d@linux.ibm.com> <0fccef6b-2885-d91d-ff05-b23c2315fc7d@linux.ibm.com> <380e1418-f262-7a09-e24a-2eb14817662f@linux.ibm.com> <058eb28c-bc40-7df9-f540-b75b1bb84b3b@linux.ibm.com> Message-ID: <7ea2bf61-5710-d3d4-26f0-665504e061c4@linux.ibm.com> Date: Wed, 1 Jul 2020 21:35:15 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:68.0) Gecko/20100101 Thunderbird/68.9.0 MIME-Version: 1.0 In-Reply-To: Content-Language: en-US X-TM-AS-GCONF: 00 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10434:6.0.235, 18.0.687 definitions=2020-07-01_08:2020-07-01, 2020-07-01 signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 adultscore=0 mlxlogscore=999 cotscore=-2147483648 phishscore=0 mlxscore=0 suspectscore=0 bulkscore=0 spamscore=0 priorityscore=1501 malwarescore=0 clxscore=1015 lowpriorityscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2004280000 definitions=main-2007010100 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Kewen.Lin via Gcc-patches" From: "Kewen.Lin" Reply-To: "Kewen.Lin" Cc: Segher Boessenkool , Bill Schmidt , Jim Wilson , David Edelsohn Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi Richard, on 2020/6/30 下午11:32, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi Richard, >> >> Thanks for the comments! >> >> on 2020/6/29 下午6:07, Richard Sandiford wrote: >>> Thanks for the update. I agree with the summary of the IRC discussion >>> except for… >>> >>> "Kewen.Lin" writes: >>>> Hi Richard S./Richi/Jim/Segher, >>>> >>>> Thanks a lot for your comments to make this patch more solid. >>>> >>>> Based on our discussion, for the vector load/store with length >>>> optab, the length unit would be measured in lanes by default. >>>> For the targets which support length measured in bytes like Power, >>>> they should only define VnQI modes to wrap the other same size >>>> vector modes. If the length is larger than total lane/byte count >>>> of the given mode, it's taken to load all lanes/bytes implicitly. >>> >>> …this last bit. IMO the behaviour of the optab should be undefined >>> when the supplied length is greater than the number of lanes. >>> >>> I think that also makes things better for the lxvl implementation, >>> which ignores the upper 56 bits of the length. It sounds like the >>> above semantics would instead require Power to saturate the value >>> at 255 before shifting it. >>> >> >> Good catch, I just realized that this part is inconsistent to what I >> implemented in patch 5/7, where the function vect_gen_len still does >> the min operation between the given length and length_limit. >> >> This patch is updated accordingly to state the behavior to be undefined. >> The others aren't required to change. >> >> Could you have a further look? Thanks in advance! >> >> v6/v7: Updated optab descriptions. >> >> v5: >> - Updated lenload/lenstore optab to len_load/len_store and the docs. >> - Rename expand_mask_{load,store}_optab_fn to expand_partial_{load,store}_optab_fn >> - Added/updated macros for expand_mask_{load,store}_optab_fn >> and expand_len_{load,store}_optab_fn >> >> v4: Update len_load_direct/len_store_direct to align with direct optab. >> >> v3: Get rid of length mode hook. > > Thanks, mostly looks good, just some comments about the documentation… > Thanks here again!!! V8 attached with updates according to your comments! Could you have a check again? Thanks! ----- v6/v7/v8: Updated optab descriptions. v5: - Updated lenload/lenstore optab to len_load/len_store and the docs. - Rename expand_mask_{load,store}_optab_fn to expand_partial_{load,store}_optab_fn - Added/updated macros for expand_mask_{load,store}_optab_fn and expand_len_{load,store}_optab_fn v4: Update len_load_direct/len_store_direct to align with direct optab. v3: Get rid of length mode hook. BR, Kewen ----- gcc/ChangeLog: 2020-MM-DD Kewen Lin * doc/md.texi (len_load_@var{m}): Document. (len_store_@var{m}): Likewise. * internal-fn.c (len_load_direct): New macro. (len_store_direct): Likewise. (expand_len_load_optab_fn): Likewise. (expand_len_store_optab_fn): Likewise. (direct_len_load_optab_supported_p): Likewise. (direct_len_store_optab_supported_p): Likewise. (expand_mask_load_optab_fn): New macro. Original renamed to ... (expand_partial_load_optab_fn): ... here. Add handlings for len_load_optab. (expand_mask_store_optab_fn): New macro. Original renamed to ... (expand_partial_store_optab_fn): ... here. Add handlings for len_store_optab. (internal_load_fn_p): Handle IFN_LEN_LOAD. (internal_store_fn_p): Handle IFN_LEN_STORE. (internal_fn_stored_value_index): Handle IFN_LEN_STORE. * internal-fn.def (LEN_LOAD): New internal function. (LEN_STORE): Likewise. * optabs.def (len_load_optab, len_store_optab): New optab. diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 2c67c818da5..2b462869437 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -5167,6 +5167,32 @@ mode @var{n}. This pattern is not allowed to @code{FAIL}. +@cindex @code{len_load_@var{m}} instruction pattern +@item @samp{len_load_@var{m}} +Load the number of vector elements specified by operand 2 from memory +operand 1 into vector register operand 0, setting the other elements of +operand 0 to undefined values. Operands 0 and 1 have mode @var{m}, +which must be a vector mode. Operand 2 has whichever integer mode the +target prefers. If operand 2 exceeds the number of elements in mode +@var{m}, the behavior is undefined. If the target prefers the length +to be measured in bytes rather than elements, it should only implement +this pattern for vectors of @code{QI} elements. + +This pattern is not allowed to @code{FAIL}. + +@cindex @code{len_store_@var{m}} instruction pattern +@item @samp{len_store_@var{m}} +Store the number of vector elements specified by operand 2 from vector +register operand 1 into memory operand 0, leaving the other elements of +operand 0 unchanged. Operands 0 and 1 have mode @var{m}, which must be +a vector mode. Operand 2 has whichever integer mode the target prefers. +If operand 2 exceeds the number of elements in mode @var{m}, the behavior +is undefined. If the target prefers the length to be measured in bytes +rather than elements, it should only implement this pattern for vectors +of @code{QI} elements. + +This pattern is not allowed to @code{FAIL}. + @cindex @code{vec_perm@var{m}} instruction pattern @item @samp{vec_perm@var{m}} Output a (variable) vector permutation. Operand 0 is the destination diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c index 4f088de48d5..1e53ced60eb 100644 --- a/gcc/internal-fn.c +++ b/gcc/internal-fn.c @@ -104,10 +104,12 @@ init_internal_fns () #define load_lanes_direct { -1, -1, false } #define mask_load_lanes_direct { -1, -1, false } #define gather_load_direct { 3, 1, false } +#define len_load_direct { -1, -1, false } #define mask_store_direct { 3, 2, false } #define store_lanes_direct { 0, 0, false } #define mask_store_lanes_direct { 0, 0, false } #define scatter_store_direct { 3, 1, false } +#define len_store_direct { 3, 3, false } #define unary_direct { 0, 0, true } #define binary_direct { 0, 0, true } #define ternary_direct { 0, 0, true } @@ -2478,10 +2480,10 @@ expand_call_mem_ref (tree type, gcall *stmt, int index) return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0)); } -/* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB. */ +/* Expand MASK_LOAD{,_LANES} or LEN_LOAD call STMT using optab OPTAB. */ static void -expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab) +expand_partial_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab) { class expand_operand ops[3]; tree type, lhs, rhs, maskt; @@ -2497,6 +2499,8 @@ expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab) if (optab == vec_mask_load_lanes_optab) icode = get_multi_vector_move (type, optab); + else if (optab == len_load_optab) + icode = direct_optab_handler (optab, TYPE_MODE (type)); else icode = convert_optab_handler (optab, TYPE_MODE (type), TYPE_MODE (TREE_TYPE (maskt))); @@ -2507,18 +2511,24 @@ expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab) target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); create_output_operand (&ops[0], target, TYPE_MODE (type)); create_fixed_operand (&ops[1], mem); - create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt))); + if (optab == len_load_optab) + create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)), + TYPE_UNSIGNED (TREE_TYPE (maskt))); + else + create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt))); expand_insn (icode, 3, ops); if (!rtx_equal_p (target, ops[0].value)) emit_move_insn (target, ops[0].value); } +#define expand_mask_load_optab_fn expand_partial_load_optab_fn #define expand_mask_load_lanes_optab_fn expand_mask_load_optab_fn +#define expand_len_load_optab_fn expand_partial_load_optab_fn -/* Expand MASK_STORE{,_LANES} call STMT using optab OPTAB. */ +/* Expand MASK_STORE{,_LANES} or LEN_STORE call STMT using optab OPTAB. */ static void -expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab) +expand_partial_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab) { class expand_operand ops[3]; tree type, lhs, rhs, maskt; @@ -2532,6 +2542,8 @@ expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab) if (optab == vec_mask_store_lanes_optab) icode = get_multi_vector_move (type, optab); + else if (optab == len_store_optab) + icode = direct_optab_handler (optab, TYPE_MODE (type)); else icode = convert_optab_handler (optab, TYPE_MODE (type), TYPE_MODE (TREE_TYPE (maskt))); @@ -2542,11 +2554,17 @@ expand_mask_store_optab_fn (internal_fn, gcall *stmt, convert_optab optab) reg = expand_normal (rhs); create_fixed_operand (&ops[0], mem); create_input_operand (&ops[1], reg, TYPE_MODE (type)); - create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt))); + if (optab == len_store_optab) + create_convert_operand_from (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)), + TYPE_UNSIGNED (TREE_TYPE (maskt))); + else + create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt))); expand_insn (icode, 3, ops); } +#define expand_mask_store_optab_fn expand_partial_store_optab_fn #define expand_mask_store_lanes_optab_fn expand_mask_store_optab_fn +#define expand_len_store_optab_fn expand_partial_store_optab_fn static void expand_ABNORMAL_DISPATCHER (internal_fn, gcall *) @@ -3128,10 +3146,12 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types, #define direct_load_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_mask_load_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_gather_load_optab_supported_p convert_optab_supported_p +#define direct_len_load_optab_supported_p direct_optab_supported_p #define direct_mask_store_optab_supported_p convert_optab_supported_p #define direct_store_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_mask_store_lanes_optab_supported_p multi_vector_optab_supported_p #define direct_scatter_store_optab_supported_p convert_optab_supported_p +#define direct_len_store_optab_supported_p direct_optab_supported_p #define direct_while_optab_supported_p convert_optab_supported_p #define direct_fold_extract_optab_supported_p direct_optab_supported_p #define direct_fold_left_optab_supported_p direct_optab_supported_p @@ -3498,6 +3518,7 @@ internal_load_fn_p (internal_fn fn) case IFN_MASK_LOAD_LANES: case IFN_GATHER_LOAD: case IFN_MASK_GATHER_LOAD: + case IFN_LEN_LOAD: return true; default: @@ -3517,6 +3538,7 @@ internal_store_fn_p (internal_fn fn) case IFN_MASK_STORE_LANES: case IFN_SCATTER_STORE: case IFN_MASK_SCATTER_STORE: + case IFN_LEN_STORE: return true; default: @@ -3577,6 +3599,7 @@ internal_fn_stored_value_index (internal_fn fn) case IFN_MASK_STORE: case IFN_SCATTER_STORE: case IFN_MASK_SCATTER_STORE: + case IFN_LEN_STORE: return 3; default: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index 1d190d492ff..17dac128e83 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -49,11 +49,13 @@ along with GCC; see the file COPYING3. If not see - load_lanes: currently just vec_load_lanes - mask_load_lanes: currently just vec_mask_load_lanes - gather_load: used for {mask_,}gather_load + - len_load: currently just len_load - mask_store: currently just maskstore - store_lanes: currently just vec_store_lanes - mask_store_lanes: currently just vec_mask_store_lanes - scatter_store: used for {mask_,}scatter_store + - len_store: currently just len_store - unary: a normal unary optab, such as vec_reverse_ - binary: a normal binary optab, such as vec_interleave_lo_ @@ -127,6 +129,8 @@ DEF_INTERNAL_OPTAB_FN (GATHER_LOAD, ECF_PURE, gather_load, gather_load) DEF_INTERNAL_OPTAB_FN (MASK_GATHER_LOAD, ECF_PURE, mask_gather_load, gather_load) +DEF_INTERNAL_OPTAB_FN (LEN_LOAD, ECF_PURE, len_load, len_load) + DEF_INTERNAL_OPTAB_FN (SCATTER_STORE, 0, scatter_store, scatter_store) DEF_INTERNAL_OPTAB_FN (MASK_SCATTER_STORE, 0, mask_scatter_store, scatter_store) @@ -136,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_CONST, vec_store_lanes, store_lanes) DEF_INTERNAL_OPTAB_FN (MASK_STORE_LANES, 0, vec_mask_store_lanes, mask_store_lanes) +DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, len_store, len_store) + DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while) DEF_INTERNAL_OPTAB_FN (CHECK_RAW_PTRS, ECF_CONST | ECF_NOTHROW, check_raw_ptrs, check_ptrs) diff --git a/gcc/optabs.def b/gcc/optabs.def index 0c64eb52a8d..78409aa1453 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -435,3 +435,5 @@ OPTAB_D (check_war_ptrs_optab, "check_war_ptrs$a") OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE) OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES) OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a") +OPTAB_D (len_load_optab, "len_load_$a") +OPTAB_D (len_store_optab, "len_store_$a")