From patchwork Fri Jul 14 15:56:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1807910 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=M78K6nOW; dkim-atps=neutral Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4R2bh26v4gz20c1 for ; Sat, 15 Jul 2023 01:56:49 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 77B96385DC29 for ; Fri, 14 Jul 2023 15:56:47 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 77B96385DC29 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689350207; bh=uI6WoKmAtOq3bsXuZGEOSS+JgEiX/ejTFUqVyxdB4SM=; h=To:Cc:Subject:Date:List-Id:List-Unsubscribe:List-Archive: List-Post:List-Help:List-Subscribe:From:Reply-To:From; b=M78K6nOWC/i6HLbDIbqGkKdxAhL8fRi/CGT3Bz+yd39yn37nKoCpnRrtwIS5fy+Hf 1LP3QKfAlTvOshlGfNww7mgHYf0Ms4UdGvNxUf2JefRr6/cqVANEAEjVkuKH8y3klf m52C/zBZRjIAviI83kH3+3b+8XJVa8z99eMNsXW8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id B5C9E3857722 for ; Fri, 14 Jul 2023 15:56:21 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B5C9E3857722 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D7AB31570; Fri, 14 Jul 2023 08:57:02 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id B7E7C3F740; Fri, 14 Jul 2023 08:56:19 -0700 (PDT) To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, joseph@codesourcery.com, polacek@redhat.com, jason@redhat.com, nathan@acm.org, richard.sandiford@arm.com Cc: joseph@codesourcery.com, polacek@redhat.com, jason@redhat.com, nathan@acm.org Subject: [WIP RFC] Add support for keyword-based attributes Date: Fri, 14 Jul 2023 16:56:18 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.3 (gnu/linux) MIME-Version: 1.0 X-Spam-Status: No, score=-26.8 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Richard Sandiford via Gcc-patches From: Richard Sandiford Reply-To: Richard Sandiford Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Summary: We'd like to be able to specify some attributes using keywords, rather than the traditional __attribute__ or [[...]] syntax. Would that be OK? In more detail: We'd like to add some new target-specific attributes for Arm SME. These attributes affect semantics and code generation and so they can't simply be ignored. Traditionally we've done this kind of thing by adding GNU attributes, via TARGET_ATTRIBUTE_TABLE in GCC's case. The problem is that both GCC and Clang have traditionally only warned about unrecognised GNU attributes, rather than raising an error. Older compilers might therefore be able to look past some uses of the new attributes and still produce object code, even though that object code is almost certainly going to be wrong. (The compilers will also emit a default-on warning, but that might go unnoticed when building a big project.) There are some existing attributes that similarly affect semantics in ways that cannot be ignored. vector_size is one obvious example. But that doesn't make it a good thing. :) Also, C++ says this for standard [[...]] attributes: For an attribute-token (including an attribute-scoped-token) not specified in this document, the behavior is implementation-defined; any such attribute-token that is not recognized by the implementation is ignored. which doubles down on the idea that attributes should not be used for necessary semantic information. One of the attributes we'd like to add provides a new way of compiling existing code. The attribute doesn't require SME to be available; it just says that the code must be compiled so that it can run in either of two modes. This is probably the most dangerous attribute of the set, since compilers that ignore it would just produce normal code. That code might work in some test scenarios, but it would fail in others. The feeling from the Clang community was therefore that these SME attributes should use keywords instead, so that the keywords trigger an error with older compilers. However, it seemed wrong to define new SME-specific grammar rules, since the underlying problem is pretty generic. We therefore proposed having a type of keyword that can appear exactly where a standard [[...]] attribute can appear and that appertains to exactly what a standard [[...]] attribute would appertain to. No divergence or cherry-picking is allowed. For example: [[arm::foo]] would become: __arm_foo and: [[arm::bar(args)]] would become: __arm_bar(args) It wouldn't be possible to retrofit arguments to a keyword that previously didn't take arguments, since that could lead to parsing ambiguities. So when a keyword is first added, a binding decision would need to be made whether the keyword always takes arguments or is always standalone. For that reason, empty argument lists are allowed for keywords, even though they're not allowed for [[...]] attributes. The argument-less version was accepted into Clang, and I have a follow-on patch for handling arguments. Would the same thing be OK for GCC, in both the C and C++ frontends? The patch below is a proof of concept for the C frontend. It doesn't bootstrap due to warnings about uninitialised fields. And it doesn't have tests. But I did test it locally with various combinations of attribute_spec and it seemed to work as expected. The impact on the C frontend seems to be pretty small. It looks like the impact on the C++ frontend would be a bit bigger, but not much. The patch contains a logically unrelated change: c-common.h set aside 16 keywords for address spaces, but of the in-tree ports, the maximum number of keywords used is 6 (for amdgcn). The patch therefore changes the limit to 8 and uses 8 keywords for the new attributes. This keeps the number of reserved ids <= 256. A real, non-proof-of-concept patch series would: - Change the address-space keywords separately, and deal with any fallout. - Clean up the way that attributes are specified, so that it isn't necessary to update all definitions when adding a new field. - Allow more precise attribute requirements, such as "function decl only". - Add tests :) WDYT? Does this approach look OK in principle, or is it a non-starter? If it is a non-starter, the fallback would be to predefine macros that expand to [[...]] or __attribute__. Having the keywords gives more precise semantics and better error messages though. Thanks, Richard --- gcc/attribs.cc | 30 +++++++++++- gcc/c-family/c-common.h | 13 ++---- gcc/c/c-parser.cc | 88 +++++++++++++++++++++++++++++++++-- gcc/config/aarch64/aarch64.cc | 1 + gcc/tree-core.h | 19 ++++++++ 5 files changed, 135 insertions(+), 16 deletions(-) diff --git a/gcc/attribs.cc b/gcc/attribs.cc index b8cb55b97df..706cd81329c 100644 --- a/gcc/attribs.cc +++ b/gcc/attribs.cc @@ -752,6 +752,11 @@ decl_attributes (tree *node, tree attributes, int flags, if (spec->decl_required && !DECL_P (*anode)) { + if (spec->is_keyword) + { + error ("%qE does not apply to types", name); + continue; + } if (flags & ((int) ATTR_FLAG_DECL_NEXT | (int) ATTR_FLAG_FUNCTION_NEXT | (int) ATTR_FLAG_ARRAY_NEXT)) @@ -775,6 +780,11 @@ decl_attributes (tree *node, tree attributes, int flags, the decl's type in place here. */ if (spec->type_required && DECL_P (*anode)) { + if (spec->is_keyword) + { + error ("%qE only applies to types", name); + continue; + } anode = &TREE_TYPE (*anode); flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE; } @@ -782,6 +792,11 @@ decl_attributes (tree *node, tree attributes, int flags, if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE && TREE_CODE (*anode) != METHOD_TYPE) { + if (spec->is_keyword) + { + error ("%qE only applies to function types", name); + continue; + } if (TREE_CODE (*anode) == POINTER_TYPE && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode))) { @@ -821,7 +836,12 @@ decl_attributes (tree *node, tree attributes, int flags, && (flags & (int) ATTR_FLAG_TYPE_IN_PLACE) && COMPLETE_TYPE_P (*anode)) { - warning (OPT_Wattributes, "type attributes ignored after type is already defined"); + if (spec->is_keyword) + error ("cannot apply %qE to a type that has already been" + " defined", name); + else + warning (OPT_Wattributes, "type attributes ignored after type" + " is already defined"); continue; } @@ -895,7 +915,13 @@ decl_attributes (tree *node, tree attributes, int flags, *anode = cur_and_last_decl[0]; if (ret == error_mark_node) { - warning (OPT_Wattributes, "%qE attribute ignored", name); + if (spec->is_keyword) + /* This error is only a last resort, Hopefully the + handler will report a better error in most cases, + return NULL_TREE, and set no_add_attrs. */ + error ("invalid %qE attribute", name); + else + warning (OPT_Wattributes, "%qE attribute ignored", name); no_add_attrs = true; } else diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h index b5ef5ff6b2c..4443ccb874d 100644 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -222,17 +222,9 @@ enum rid RID_ADDR_SPACE_5, RID_ADDR_SPACE_6, RID_ADDR_SPACE_7, - RID_ADDR_SPACE_8, - RID_ADDR_SPACE_9, - RID_ADDR_SPACE_10, - RID_ADDR_SPACE_11, - RID_ADDR_SPACE_12, - RID_ADDR_SPACE_13, - RID_ADDR_SPACE_14, - RID_ADDR_SPACE_15, RID_FIRST_ADDR_SPACE = RID_ADDR_SPACE_0, - RID_LAST_ADDR_SPACE = RID_ADDR_SPACE_15, + RID_LAST_ADDR_SPACE = RID_ADDR_SPACE_7, /* __intN keywords. The _N_M here doesn't correspond to the intN in the keyword; use the bitsize in int_n_t_data_t[M] for that. @@ -251,6 +243,9 @@ enum rid RID_FIRST_INT_N = RID_INT_N_0, RID_LAST_INT_N = RID_INT_N_3, + RID_FIRST_KEYWORD_ATTR, + RID_LAST_KEYWORD_ATTR = RID_FIRST_KEYWORD_ATTR + 7, + RID_MAX, RID_FIRST_MODIFIER = RID_STATIC, diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc index 24a6eb6e459..b02b082b140 100644 --- a/gcc/c/c-parser.cc +++ b/gcc/c/c-parser.cc @@ -104,6 +104,17 @@ set_c_expr_source_range (c_expr *expr, } +/* Register keyword attribute AS with reserved identifier code RID. */ + +void +c_register_keyword_attribute (const attribute_spec *as, int rid) +{ + tree id = get_identifier (as->name); + C_SET_RID_CODE (id, rid); + C_IS_RESERVED_WORD (id) = 1; + ridpointers [rid] = id; +} + /* Initialization routine for this file. */ void @@ -180,6 +191,16 @@ c_parse_init (void) C_IS_RESERVED_WORD (id) = 1; ridpointers [RID_OMP_ALL_MEMORY] = id; } + + int rid = RID_FIRST_KEYWORD_ATTR; + if (const attribute_spec *attrs = targetm.attribute_table) + for (const attribute_spec *attr = attrs; attr->name; ++attr) + if (attr->is_keyword) + { + gcc_assert (rid <= RID_LAST_KEYWORD_ATTR); + c_register_keyword_attribute (attr, rid); + rid += 1; + } } /* A parser structure recording information about the state and @@ -4951,7 +4972,8 @@ c_parser_gnu_attribute_any_word (c_parser *parser) { /* ??? See comment above about what keywords are accepted here. */ bool ok; - switch (c_parser_peek_token (parser)->keyword) + int rid = c_parser_peek_token (parser)->keyword; + switch (rid) { case RID_STATIC: case RID_UNSIGNED: @@ -4994,7 +5016,9 @@ c_parser_gnu_attribute_any_word (c_parser *parser) ok = true; break; default: - ok = false; + /* Accept these now so that we can reject them with a specific + error later. */ + ok = (rid >= RID_FIRST_KEYWORD_ATTR && rid <= RID_LAST_KEYWORD_ATTR); break; } if (!ok) @@ -5156,6 +5180,10 @@ c_parser_gnu_attribute (c_parser *parser, tree attrs, return NULL_TREE; attr_name = canonicalize_attr_name (attr_name); + const attribute_spec *as = lookup_attribute_spec (attr_name); + if (as && as->is_keyword) + error ("%qE cannot be used in %<__attribute__%> constructs", attr_name); + c_parser_consume_token (parser); tree attr; @@ -5330,6 +5358,42 @@ c_parser_balanced_token_sequence (c_parser *parser) } } +/* Parse a keyword attribute. This is simply: + + keyword + + if the attribute never takes arguments, otherwise it is: + + keyword ( balanced-token-sequence[opt] ) +*/ + +static tree +c_parser_keyword_attribute (c_parser *parser) +{ + c_token *token = c_parser_peek_token (parser); + gcc_assert (token->type == CPP_KEYWORD); + tree name = canonicalize_attr_name (token->value); + c_parser_consume_token (parser); + + tree attribute = build_tree_list (name, NULL_TREE); + const attribute_spec *as = lookup_attribute_spec (name); + gcc_assert (as && as->is_keyword); + if (as->max_length > 0) + { + matching_parens parens; + if (!parens.require_open (parser)) + return error_mark_node; + /* Allow zero-length arguments regardless of as->min_length, since + the usual error "parentheses must be omitted if attribute argument + list is empty" suggests an invalid change. We'll reject incorrect + argument lists later. */ + TREE_VALUE (attribute) + = c_parser_attribute_arguments (parser, false, false, false, true); + parens.require_close (parser); + } + return attribute; +} + /* Parse standard (C2X) attributes (including GNU attributes in the gnu:: namespace). @@ -5396,8 +5460,11 @@ c_parser_std_attribute (c_parser *parser, bool for_tm) ns = NULL_TREE; attribute = build_tree_list (build_tree_list (ns, name), NULL_TREE); - /* Parse the arguments, if any. */ const attribute_spec *as = lookup_attribute_spec (TREE_PURPOSE (attribute)); + if (as && as->is_keyword) + error ("%qE cannot be used in %<[[...]]%> constructs", name); + + /* Parse the arguments, if any. */ if (c_parser_next_token_is_not (parser, CPP_OPEN_PAREN)) goto out; { @@ -5456,7 +5523,13 @@ c_parser_std_attribute (c_parser *parser, bool for_tm) static tree c_parser_std_attribute_specifier (c_parser *parser, bool for_tm) { - location_t loc = c_parser_peek_token (parser)->location; + auto first_token = c_parser_peek_token (parser); + location_t loc = first_token->location; + if (first_token->type == CPP_KEYWORD + && first_token->keyword >= RID_FIRST_KEYWORD_ATTR + && first_token->keyword <= RID_LAST_KEYWORD_ATTR) + return c_parser_keyword_attribute (parser); + if (!c_parser_require (parser, CPP_OPEN_SQUARE, "expected %<[%>")) return NULL_TREE; if (!c_parser_require (parser, CPP_OPEN_SQUARE, "expected %<[%>")) @@ -5571,7 +5644,12 @@ c_parser_check_balanced_raw_token_sequence (c_parser *parser, unsigned int *n) static bool c_parser_nth_token_starts_std_attributes (c_parser *parser, unsigned int n) { - if (!(c_parser_peek_nth_token (parser, n)->type == CPP_OPEN_SQUARE + auto token_n = c_parser_peek_nth_token (parser, n); + if (token_n->type == CPP_KEYWORD + && token_n->keyword >= RID_FIRST_KEYWORD_ATTR + && token_n->keyword <= RID_LAST_KEYWORD_ATTR) + return true; + if (!(token_n->type == CPP_OPEN_SQUARE && c_parser_peek_nth_token (parser, n + 1)->type == CPP_OPEN_SQUARE)) return false; /* In C, '[[' must start attributes. In Objective-C, we need to diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 560e5431636..50698439104 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -2802,6 +2802,7 @@ static const struct attribute_spec aarch64_attribute_table[] = { "arm_sve_vector_bits", 1, 1, false, true, false, true, aarch64_sve::handle_arm_sve_vector_bits_attribute, NULL }, + { "__arm_streaming", 0, 0, false, true, true, true, NULL, NULL, true }, { "Advanced SIMD type", 1, 1, false, true, false, true, NULL, NULL }, { "SVE type", 3, 3, false, true, false, true, NULL, NULL }, { "SVE sizeless type", 0, 0, false, true, false, true, NULL, NULL }, diff --git a/gcc/tree-core.h b/gcc/tree-core.h index 668808a29d0..e6700509b9e 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -2167,6 +2167,25 @@ struct attribute_spec { /* An array of attribute exclusions describing names of other attributes that this attribute is mutually exclusive with. */ const exclusions *exclude; + + /* Whether the attribute is a C/C++ "regular keyword attribute". + When true for an attribute "foo", this means that: + + - The keyword foo can appear exactly where the standard attribute syntax + [[...]] can appear, but without the same minimum language requirements. + The link is intended to be automatic: there should be no exceptions. + + - The attribute appertains to whatever a standard attribute in the + same location would appertain to. There is no "sliding" from decls + to types, as sometimes happens for GNU-style attributes. + + - When MAX_LENGTH > 0, the keyword is followed by an argument list. + This argument list is parsed in the same way as arguments to [[...]] + attributes, except that the list can be empty if MIN_LENGTH == 0. + + - In C and C++, the attribute cannot appear in __attribute__ or [[...]]; + it can only appear as a simple keyword. */ + bool is_keyword; }; /* These functions allow a front-end to perform a manual layout of a