diff mbox series

RISC-V: Allow constraint "S" even if the symbol does not bind locally

Message ID 20240131050211.936559-1-maskray@google.com
State New
Headers show
Series RISC-V: Allow constraint "S" even if the symbol does not bind locally | expand

Commit Message

Fangrui Song Jan. 31, 2024, 5:02 a.m. UTC
The constraint "S" can only be used with a symbol that binds locally, so
the following does not work for -fpie/-fpic (GOT access is used).
```
namespace ns { extern int var, a[4]; }
void foo() {
  asm(".pushsection .xxx,\"aw\"; .dc.a %0; .popsection" :: "S"(&ns::var));
  asm(".reloc ., BFD_RELOC_NONE, %0" :: "S"(&ns::a[3]));
}
```

This is overly restrictive, as many references like an absolute
relocation in a writable section or a non-SHF_ALLOC section should be
totally fine.  Allow symbols that do not bind locally, similar to
aarch64 "S" and x86-64 "Ws" (commit d7250100381b817114447d91fff4748526d4fb21).

gcc/ChangeLog:

        * config/riscv/constraints.md: Relax the condition for "S".
        * doc/md.texi: Update.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/asm-raw-symbol.c: New test.
---
 gcc/config/riscv/constraints.md                 |  4 ++--
 gcc/doc/md.texi                                 |  2 +-
 gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c | 17 +++++++++++++++++
 3 files changed, 20 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c

Comments

Kito Cheng Jan. 31, 2024, 7:26 a.m. UTC | #1
I realized there is 's' constraint which is defined in GCC generic
infra[1], and that's kinda what same as the new semantic of 'S' here,

(define_constraint "s"
 "Matches a symbolic integer constant."
 (and (match_test "CONSTANT_P (op)")
      (match_test "!CONST_SCALAR_INT_P (op)")
      (match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))

Where const, symbol_ref and label_ref is match CONSTANT_P &&
!CONST_SCALAR_INT_P,
and LEGITIMATE_PIC_OPERAND_P is always 1 for RISC-V

The only difference is it also allows high, which is something like
%hi(sym), but I think it's harmless in the use case.

However I found LLVM also not work on " asm(".reloc ., BFD_RELOC_NONE,
%0" :: "S"(&ns::a[3]));",
so maybe we could consider implement 's' in LLVM? and also add some
document in riscv-c-api.md

And just clarify, I don't have strong prefer on using 's', I am ok
with relaxing 'S' too,
propose using 's' is because that is work fine on RISC-V gcc for long
time and no backward compatible issue,
But I guess you have this proposal may came from ClangBuiltLinux, so
's' may not work for clang well due to backward compatible.

[1] https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#index-s-in-constraint
[2] https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#constraints-on-operands-of-inline-assembly-statements

On Wed, Jan 31, 2024 at 1:02 PM Fangrui Song <maskray@google.com> wrote:
>
> The constraint "S" can only be used with a symbol that binds locally, so
> the following does not work for -fpie/-fpic (GOT access is used).
> ```
> namespace ns { extern int var, a[4]; }
> void foo() {
>   asm(".pushsection .xxx,\"aw\"; .dc.a %0; .popsection" :: "S"(&ns::var));
>   asm(".reloc ., BFD_RELOC_NONE, %0" :: "S"(&ns::a[3]));
> }
> ```
>
> This is overly restrictive, as many references like an absolute
> relocation in a writable section or a non-SHF_ALLOC section should be
> totally fine.  Allow symbols that do not bind locally, similar to
> aarch64 "S" and x86-64 "Ws" (commit d7250100381b817114447d91fff4748526d4fb21).
>
> gcc/ChangeLog:
>
>         * config/riscv/constraints.md: Relax the condition for "S".
>         * doc/md.texi: Update.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/asm-raw-symbol.c: New test.
> ---
>  gcc/config/riscv/constraints.md                 |  4 ++--
>  gcc/doc/md.texi                                 |  2 +-
>  gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c | 17 +++++++++++++++++
>  3 files changed, 20 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
>
> diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> index 41acaea04eb..bb012668fcb 100644
> --- a/gcc/config/riscv/constraints.md
> +++ b/gcc/config/riscv/constraints.md
> @@ -121,8 +121,8 @@ (define_memory_constraint "A"
>         (match_test "GET_CODE(XEXP(op,0)) == REG")))
>
>  (define_constraint "S"
> -  "A constraint that matches an absolute symbolic address."
> -  (match_operand 0 "absolute_symbolic_operand"))
> +  "A symbolic reference or label reference."
> +  (match_code "const,symbol_ref,label_ref"))
>
>  (define_constraint "U"
>    "@internal
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index b0c61925120..c75e5bf259d 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -1947,7 +1947,7 @@ Integer constant that is valid as an immediate operand in a 64-bit @code{MOV}
>  pseudo instruction
>
>  @item S
> -An absolute symbolic address or a label reference
> +A symbolic reference or label reference.
>
>  @item Y
>  Floating point constant zero
> diff --git a/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c b/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
> new file mode 100644
> index 00000000000..eadf6d23fe1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
> @@ -0,0 +1,17 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fpic" } */
> +
> +extern int var;
> +
> +void
> +func (void)
> +{
> +label:
> +  __asm__ ("@ %0" : : "S" (func));
> +  __asm__ ("@ %0" : : "S" (&var + 1));
> +  __asm__ ("@ %0" : : "S" (&&label));
> +}
> +
> +/* { dg-final { scan-assembler "@ func" } } */
> +/* { dg-final { scan-assembler "@ var\\+4" } } */
> +/* { dg-final { scan-assembler "@ .L" } } */
> --
> 2.43.0.429.g432eaa2c6b-goog
>
Fangrui Song Feb. 1, 2024, 1:27 a.m. UTC | #2
On Tue, Jan 30, 2024 at 11:26 PM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> I realized there is 's' constraint which is defined in GCC generic
> infra[1], and that's kinda what same as the new semantic of 'S' here,
>
> (define_constraint "s"
>  "Matches a symbolic integer constant."
>  (and (match_test "CONSTANT_P (op)")
>       (match_test "!CONST_SCALAR_INT_P (op)")
>       (match_test "!flag_pic || LEGITIMATE_PIC_OPERAND_P (op)")))
>
> Where const, symbol_ref and label_ref is match CONSTANT_P &&
> !CONST_SCALAR_INT_P,
> and LEGITIMATE_PIC_OPERAND_P is always 1 for RISC-V

Thanks for catching this! I read "symbolic integer constant" and
skipped, but did not realized that
this actually means a symbol or label reference with a constant offset.
I agree that "s" should be preferred.
I have jotted down some notes at
https://maskray.me/blog/2024-01-30-raw-symbol-names-in-inline-assembly

The condition !flag_pic || LEGITIMATE_PIC_OPERAND_P (op) highlights a
key distinction in GCC's handling of symbol references:

Non-PIC code (-fno-pic): The "i" and "s" constraints are freely permitted.
PIC code (-fpie and -fpic): The architecture-specific
LEGITIMATE_PIC_OPERAND_P(X) macro dictates whether these constraints
are allowed.

While the default implementation (gcc/defaults.h) is permissive (used
by MIPS, PowerPC, and RISC-V), many ports impose stricter
restrictions, often disallowing preemptible symbols under PIC.

This differentiation probably stems from historical and architectural
considerations:

Non-PIC code: Absolute addresses could be directly embedded in
instructions like an immediate integer operand.
PIC code with dynamic linking: The need for GOT indirection often
requires an addressing mode different from absolute addressing and
more than one instructions.

Nevertheless, I think this symbol preemptibility limitation for "s" is
unfortunate. Ideally, we could retain the current "i" for immediate
integer operand (after linking), and design "s" for a raw symbol name
with a constant offset, ignoring symbol preemptibility. This
architecture-agnostic "s" would simplify metadata section utilization
and boost code portability.


> The only difference is it also allows high, which is something like
> %hi(sym), but I think it's harmless in the use case.

I do not follow this. Do you have an example?

> However I found LLVM also not work on " asm(".reloc ., BFD_RELOC_NONE,
> %0" :: "S"(&ns::a[3]));",
> so maybe we could consider implement 's' in LLVM? and also add some
> document in riscv-c-api.md

Clang does not implement the offset yet.
I created https://github.com/llvm/llvm-project/pull/80201 to support "s"

> And just clarify, I don't have strong prefer on using 's', I am ok
> with relaxing 'S' too,
> propose using 's' is because that is work fine on RISC-V gcc for long
> time and no backward compatible issue,
> But I guess you have this proposal may came from ClangBuiltLinux, so
> 's' may not work for clang well due to backward compatible.

It seems that ClangBuiltLinux can live with "i" for now:)

I raised the topic due to a micro-optimization opportunity in
https://github.com/protocolbuffers/protobuf/blob/1fe463ce71b6acc60b3aef65d51185e3704cac8b/src/google/protobuf/stubs/common.h#L86
and I believe metadata sections will get more used and compilers
should be prepared for future uses.

I'll abandon this "S" change. I can create a test-only change if you
think the test coverage is useful, as we hardly have any non-rvv
inline asm tests at present...

> [1] https://gcc.gnu.org/onlinedocs/gcc/Simple-Constraints.html#index-s-in-constraint
> [2] https://github.com/riscv-non-isa/riscv-c-api-doc/blob/master/riscv-c-api.md#constraints-on-operands-of-inline-assembly-statements
>
> On Wed, Jan 31, 2024 at 1:02 PM Fangrui Song <maskray@google.com> wrote:
> >
> > The constraint "S" can only be used with a symbol that binds locally, so
> > the following does not work for -fpie/-fpic (GOT access is used).
> > ```
> > namespace ns { extern int var, a[4]; }
> > void foo() {
> >   asm(".pushsection .xxx,\"aw\"; .dc.a %0; .popsection" :: "S"(&ns::var));
> >   asm(".reloc ., BFD_RELOC_NONE, %0" :: "S"(&ns::a[3]));
> > }
> > ```
> >
> > This is overly restrictive, as many references like an absolute
> > relocation in a writable section or a non-SHF_ALLOC section should be
> > totally fine.  Allow symbols that do not bind locally, similar to
> > aarch64 "S" and x86-64 "Ws" (commit d7250100381b817114447d91fff4748526d4fb21).
> >
> > gcc/ChangeLog:
> >
> >         * config/riscv/constraints.md: Relax the condition for "S".
> >         * doc/md.texi: Update.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         * gcc.target/riscv/asm-raw-symbol.c: New test.
> > ---
> >  gcc/config/riscv/constraints.md                 |  4 ++--
> >  gcc/doc/md.texi                                 |  2 +-
> >  gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c | 17 +++++++++++++++++
> >  3 files changed, 20 insertions(+), 3 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
> >
> > diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
> > index 41acaea04eb..bb012668fcb 100644
> > --- a/gcc/config/riscv/constraints.md
> > +++ b/gcc/config/riscv/constraints.md
> > @@ -121,8 +121,8 @@ (define_memory_constraint "A"
> >         (match_test "GET_CODE(XEXP(op,0)) == REG")))
> >
> >  (define_constraint "S"
> > -  "A constraint that matches an absolute symbolic address."
> > -  (match_operand 0 "absolute_symbolic_operand"))
> > +  "A symbolic reference or label reference."
> > +  (match_code "const,symbol_ref,label_ref"))
> >
> >  (define_constraint "U"
> >    "@internal
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> > index b0c61925120..c75e5bf259d 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -1947,7 +1947,7 @@ Integer constant that is valid as an immediate operand in a 64-bit @code{MOV}
> >  pseudo instruction
> >
> >  @item S
> > -An absolute symbolic address or a label reference
> > +A symbolic reference or label reference.
> >
> >  @item Y
> >  Floating point constant zero
> > diff --git a/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c b/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
> > new file mode 100644
> > index 00000000000..eadf6d23fe1
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-fpic" } */
> > +
> > +extern int var;
> > +
> > +void
> > +func (void)
> > +{
> > +label:
> > +  __asm__ ("@ %0" : : "S" (func));
> > +  __asm__ ("@ %0" : : "S" (&var + 1));
> > +  __asm__ ("@ %0" : : "S" (&&label));
> > +}
> > +
> > +/* { dg-final { scan-assembler "@ func" } } */
> > +/* { dg-final { scan-assembler "@ var\\+4" } } */
> > +/* { dg-final { scan-assembler "@ .L" } } */
> > --
> > 2.43.0.429.g432eaa2c6b-goog
> >
diff mbox series

Patch

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 41acaea04eb..bb012668fcb 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -121,8 +121,8 @@  (define_memory_constraint "A"
        (match_test "GET_CODE(XEXP(op,0)) == REG")))
 
 (define_constraint "S"
-  "A constraint that matches an absolute symbolic address."
-  (match_operand 0 "absolute_symbolic_operand"))
+  "A symbolic reference or label reference."
+  (match_code "const,symbol_ref,label_ref"))
 
 (define_constraint "U"
   "@internal
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index b0c61925120..c75e5bf259d 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -1947,7 +1947,7 @@  Integer constant that is valid as an immediate operand in a 64-bit @code{MOV}
 pseudo instruction
 
 @item S
-An absolute symbolic address or a label reference
+A symbolic reference or label reference.
 
 @item Y
 Floating point constant zero
diff --git a/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c b/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
new file mode 100644
index 00000000000..eadf6d23fe1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/asm-raw-symbol.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fpic" } */
+
+extern int var;
+
+void
+func (void)
+{
+label:
+  __asm__ ("@ %0" : : "S" (func));
+  __asm__ ("@ %0" : : "S" (&var + 1));
+  __asm__ ("@ %0" : : "S" (&&label));
+}
+
+/* { dg-final { scan-assembler "@ func" } } */
+/* { dg-final { scan-assembler "@ var\\+4" } } */
+/* { dg-final { scan-assembler "@ .L" } } */