diff mbox

add initial support for J2 core to sh target

Message ID 20150831174909.GA6317@brightrain.aerifal.cx
State New
Headers show

Commit Message

Rich Felker Aug. 31, 2015, 5:49 p.m. UTC
The J2 Core is an open hardware cpu implementing the SH-2 instruction
set, with the addition of barrel shift instructions and an atomic
compare-and-swap instruction. This patch adds a cpu model option -mj2
to the sh target. Presently all it does is enable use of the barrel
shift instructions (and turns off assembler checking of the ISA via
--isa=any) but I will eventually add support for the new CAS
instruction as a new -matomic-model for use by the __sync and __atomic
builtins.

I've used the the name "J2" and "-mj2" rather than treating it as a
submodel variant of "sh2" ("SH2J" and "-m2j") because the official
name of this cpu model is "J2", with the intent of not misrepresenting
it as a Renesas product. However I'd like feedback from GCC's side on
how GCC wants to identify J2 in cpu model options, tuples, and
internally; that part is not set in stone.

The --isa=any passed to the assembler probably needs to be fixed
before this patch is ready for upstream (although I'd really just
prefer _always_ passing --isa=any to the assembler, since the current
behavior breaks runtime-switching implementations, which I need). One
complication is that passing --isa=j2 will fail with old binutils; I'm
not sure if this should be detected and handled or if we should just
require up-to-date binutils to use -mj2. In any case, I don't yet have
an assembler-side patch to add the --isa level.

I know this isn't ready for upstream yet but I'd appreciate feedback
on the patch so far and anything that I need to change in order for it
to be acceptable.

Rich

Comments

Oleg Endo Sept. 1, 2015, 1:45 p.m. UTC | #1
Hi Rich,

On 01 Sep 2015, at 02:49, Rich Felker <dalias@libc.org> wrote:

> The J2 Core is an open hardware cpu implementing the SH-2 instruction
> set, with the addition of barrel shift instructions and an atomic
> compare-and-swap instruction. This patch adds a cpu model option -mj2
> to the sh target. Presently all it does is enable use of the barrel
> shift instructions (and turns off assembler checking of the ISA via
> --isa=any) but I will eventually add support for the new CAS
> instruction as a new -matomic-model for use by the __sync and __atomic
> builtins.

It seems that this J2 atomic instruction(s ?) is not available to the public.  I've skimmed through the currently available J2 hardware sources, but couldn't find anything about it.  So it's just speculation, but probably you'll require a copyright assignment for the atomics parts.

> I've used the the name "J2" and "-mj2" rather than treating it as a
> submodel variant of "sh2" ("SH2J" and "-m2j") because the official
> name of this cpu model is "J2", with the intent of not misrepresenting
> it as a Renesas product.

But then, why do you add "shj2" in config.gcc?
I guess having an SH variant which doesn't match sh*-*-* would be odd.  I'd rather name it "SHJ2" everywhere (except the -m options which don't have the "sh" prefix).  Technically it really is just another SH variant so I'd suggest to treat it like that.  Might be simpler and less confusing for other things than GCC, too.

Also you might want to add the new option to that block in config.gcc

	echo "Unknown CPU used in --with-cpu=$with_cpu, known values:"  1>&2
	echo "m1 m2 m2e m3 m3e m4 m4-single m4-single-only m4-nofpu" 1>&2
	echo "m4a m4a-single m4a-single-only m4a-nofpu m4al" 1>&2
	echo "m2a m2a-single m2a-single-only m2a-nofpu" 1>&2
	exit 1
	;;

> The --isa=any passed to the assembler probably needs to be fixed
> before this patch is ready for upstream (although I'd really just
> prefer _always_ passing --isa=any to the assembler, since the current
> behavior breaks runtime-switching implementations, which I need).

I've added the --isa thing because LD also checks (or at least tries) compatibility of modules being linked together.  Also, it makes it easier to detect wrong/unexpected instructions in the output, which can happen sometimes by a bug in GCC or with inline-asm.  If you have other suggestions, please share.

> One
> complication is that passing --isa=j2 will fail with old binutils; I'm
> not sure if this should be detected and handled or if we should just
> require up-to-date binutils to use -mj2. In any case, I don't yet have
> an assembler-side patch to add the --isa level.

I'd say requiring new binutils is acceptable.  But that'd require being able to get a new binutils to begin with :)

Binutils will most likely require patches for the J2 atomic insn.  Thus maybe it's better to start with binutils first.

> I know this isn't ready for upstream yet but I'd appreciate feedback
> on the patch so far and anything that I need to change in order for it
> to be acceptable.

A changelog entry and documentation (new -m options etc) would be good.

Cheers,
Oleg
Rich Felker Sept. 1, 2015, 2:18 p.m. UTC | #2
On Tue, Sep 01, 2015 at 10:45:10PM +0900, Oleg Endo wrote:
> Hi Rich,
> 
> On 01 Sep 2015, at 02:49, Rich Felker <dalias@libc.org> wrote:
> 
> > The J2 Core is an open hardware cpu implementing the SH-2 instruction
> > set, with the addition of barrel shift instructions and an atomic
> > compare-and-swap instruction. This patch adds a cpu model option -mj2
> > to the sh target. Presently all it does is enable use of the barrel
> > shift instructions (and turns off assembler checking of the ISA via
> > --isa=any) but I will eventually add support for the new CAS
> > instruction as a new -matomic-model for use by the __sync and __atomic
> > builtins.
> 
> It seems that this J2 atomic instruction(s ?) is not available to
> the public. I've skimmed through the currently available J2 hardware
> sources, but couldn't find anything about it. So it's just
> speculation, but probably you'll require a copyright assignment for
> the atomics parts.

It's still under development and I'm not closely following the
hardware side of things. I'll hold off on submitting the atomic
support until it's in the public release and tested. My hope was to
get the basic support upstream first (having -mj2 is very helpful for
building the kernel since it makes the libgcc issue we were dealing
with go away) and add the atomic part later.

From the copyright side, though, IMO it's not a matter of whether the
feature is public but the non-triviality of the patch that would make
it require an assignment. My work on this is under contract with SE
Instruments and our arrangement is such that they're responsible for
copyright/assignment on the work I do on FSF projects. AFAIK they
don't yet have any assignment paperwork on file for this so I'll need
some guidance from whoever handles that for GCC.

> > I've used the the name "J2" and "-mj2" rather than treating it as a
> > submodel variant of "sh2" ("SH2J" and "-m2j") because the official
> > name of this cpu model is "J2", with the intent of not misrepresenting
> > it as a Renesas product.
> 
> But then, why do you add "shj2" in config.gcc?

Because the code it's added to just did s/m/sh/ as part of its text
processing. ;-) It's slightly ugly shell script but I'm just working
with what's there.

> I guess having an SH variant which doesn't match sh*-*-* would be
> odd. I'd rather name it "SHJ2" everywhere (except the -m options
> which don't have the "sh" prefix). Technically it really is just
> another SH variant so I'd suggest to treat it like that. Might be
> simpler and less confusing for other things than GCC, too.
> 
> Also you might want to add the new option to that block in config.gcc
> 
> 	echo "Unknown CPU used in --with-cpu=$with_cpu, known values:"  1>&2
> 	echo "m1 m2 m2e m3 m3e m4 m4-single m4-single-only m4-nofpu" 1>&2
> 	echo "m4a m4a-single m4a-single-only m4a-nofpu m4al" 1>&2
> 	echo "m2a m2a-single m2a-single-only m2a-nofpu" 1>&2
> 	exit 1
> 	;;

Yes, I agree it could be odd from the GCC side. That's actually why I
omitted tuples for now and just wanted to use -mj2 or --with-cpu: I
was concerned configure scripts would blow up on seeing a cpu family
name they don't know.

If you want to just follow the existing naming pattern for tuples, I
think "sh2j" might make more sense than "shj2". It would both be
easier to match (there's probably a lot of software that accepts
sh[:digit:] but not sh[:alnum:]) and it's sufficiently different from
the actual name of the cpu so as not to give the impression that "J2"
is short for "SH J2" or something.

> > The --isa=any passed to the assembler probably needs to be fixed
> > before this patch is ready for upstream (although I'd really just
> > prefer _always_ passing --isa=any to the assembler, since the current
> > behavior breaks runtime-switching implementations, which I need).
> 
> I've added the --isa thing because LD also checks (or at least
> tries) compatibility of modules being linked together. Also, it
> makes it easier to detect wrong/unexpected instructions in the
> output, which can happen sometimes by a bug in GCC or with
> inline-asm. If you have other suggestions, please share.

I'm not sure what the best way to achieve multiple goals is, but the
current behavior makes it so you need --isa=any (and a final binary
with weird ABI tag) to have a binary that supports atomic operations
on any SH model. musl libc already has such support (except the new J2
CAS instruction) and I would like to eventually provide a libatomic
approach for GCC too so that it's possible to use __sync/C11 atomics
and have the binary be safe to run on any model that supports the
baseline ISA & ABI you built for (e.g. all >=SH2 if you used -m2).

> > One
> > complication is that passing --isa=j2 will fail with old binutils; I'm
> > not sure if this should be detected and handled or if we should just
> > require up-to-date binutils to use -mj2. In any case, I don't yet have
> > an assembler-side patch to add the --isa level.
> 
> I'd say requiring new binutils is acceptable. But that'd require
> being able to get a new binutils to begin with :)
> 
> Binutils will most likely require patches for the J2 atomic insn.
> Thus maybe it's better to start with binutils first.

I have a patch for that part, just not expanding the
already-very-complex SH "family-tree" of instruction support. However
it's likely that encoding details will change (the draft encoding
overlaps with something used by SH2A IIRC, and the intent was not to
have such overlap) so I'm holding off on submitting it until the
hardware side works out this issue. This draft patch and all the
others I currently need for a working toolchain are available here:

https://github.com/richfelker/musl-cross-make/tree/master/patches

> > I know this isn't ready for upstream yet but I'd appreciate feedback
> > on the patch so far and anything that I need to change in order for it
> > to be acceptable.
> 
> A changelog entry and documentation (new -m options etc) would be good.

OK.

Rich
Oleg Endo Sept. 1, 2015, 4:24 p.m. UTC | #3
On 01 Sep 2015, at 23:18, Rich Felker <dalias@libc.org> wrote:

> On Tue, Sep 01, 2015 at 10:45:10PM +0900, Oleg Endo wrote:
>> It seems that this J2 atomic instruction(s ?) is not available to
>> the public. I've skimmed through the currently available J2 hardware
>> sources, but couldn't find anything about it. So it's just
>> speculation, but probably you'll require a copyright assignment for
>> the atomics parts.
> 
> It's still under development and I'm not closely following the
> hardware side of things. I'll hold off on submitting the atomic
> support until it's in the public release and tested. My hope was to
> get the basic support upstream first (having -mj2 is very helpful for
> building the kernel since it makes the libgcc issue we were dealing
> with go away) and add the atomic part later.

Sounds reasonable.

> From the copyright side, though, IMO it's not a matter of whether the
> feature is public but the non-triviality of the patch that would make
> it require an assignment.

Yes, that's what I meant.  Looking at your cas.l patch for binutils, I guess the patch for GCC will be non-trivial.

> My work on this is under contract with SE
> Instruments and our arrangement is such that they're responsible for
> copyright/assignment on the work I do on FSF projects. AFAIK they
> don't yet have any assignment paperwork on file for this so I'll need
> some guidance from whoever handles that for GCC.

Sorry, I don't what's the best option in this particular situation.  Maybe somebody else can jump in.

>> But then, why do you add "shj2" in config.gcc?
> 
> Because the code it's added to just did s/m/sh/ as part of its text
> processing. ;-) It's slightly ugly shell script but I'm just working
> with what's there.

(*)

> Yes, I agree it could be odd from the GCC side. That's actually why I
> omitted tuples for now and just wanted to use -mj2 or --with-cpu: I
> was concerned configure scripts would blow up on seeing a cpu family
> name they don't know.
> 
> If you want to just follow the existing naming pattern for tuples, I
> think "sh2j" might make more sense than "shj2". It would both be
> easier to match (there's probably a lot of software that accepts
> sh[:digit:] but not sh[:alnum:]) and it's sufficiently different from
> the actual name of the cpu so as not to give the impression that "J2"
> is short for "SH J2" or something.

SH2J is also OK.  But then that thing above (*) should be fixed and sh2j should be used consistently.

> I'm not sure what the best way to achieve multiple goals is, but the
> current behavior makes it so you need --isa=any (and a final binary
> with weird ABI tag) to have a binary that supports atomic operations
> on any SH model. musl libc already has such support (except the new J2
> CAS instruction) and I would like to eventually provide a libatomic
> approach for GCC too so that it's possible to use __sync/C11 atomics
> and have the binary be safe to run on any model that supports the
> baseline ISA & ABI you built for (e.g. all >=SH2 if you used -m2).

I don't know the details of your implementation.  The compiler generated atomic sequences are not really compatible.  The safest thing is not to enable any atomic model in GCC and let it emit function calls to __atomic*. 

> I have a patch for that part, just not expanding the
> already-very-complex SH "family-tree" of instruction support. However
> it's likely that encoding details will change (the draft encoding
> overlaps with something used by SH2A IIRC, and the intent was not to
> have such overlap)

Yeah, it overlaps with the first 16 bit word of the 32 bit SH2A load/store insns.

> so I'm holding off on submitting it until the
> hardware side works out this issue.

Sounds reasonable.

> This draft patch and all the
> others I currently need for a working toolchain are available here:
> 
> https://github.com/richfelker/musl-cross-make/tree/master/patches

Thanks for sharing.

Cheers,
Oleg
Rich Felker Sept. 1, 2015, 5:08 p.m. UTC | #4
On Wed, Sep 02, 2015 at 01:24:55AM +0900, Oleg Endo wrote:
> > I'm not sure what the best way to achieve multiple goals is, but the
> > current behavior makes it so you need --isa=any (and a final binary
> > with weird ABI tag) to have a binary that supports atomic operations
> > on any SH model. musl libc already has such support (except the new J2
> > CAS instruction) and I would like to eventually provide a libatomic
> > approach for GCC too so that it's possible to use __sync/C11 atomics
> > and have the binary be safe to run on any model that supports the
> > baseline ISA & ABI you built for (e.g. all >=SH2 if you used -m2).
> 
> I don't know the details of your implementation. The compiler
> generated atomic sequences are not really compatible. The safest
> thing is not to enable any atomic model in GCC and let it emit
> function calls to __atomic*.

Exactly -- but then, libatomic.a needs to contain J2-specific cas.l
opcodes and SH4A-specific movli.l/movco.l opcodes and code that
selects at runtime which to use (or whether to use imask or gusa)
based on hwcap, etc. The point is that a mix of opcodes for different
ISA levels end up being in the final binary, which might otherwise be
targeted for SH-2 baseline so it can run on any of them.

> > I have a patch for that part, just not expanding the
> > already-very-complex SH "family-tree" of instruction support. However
> > it's likely that encoding details will change (the draft encoding
> > overlaps with something used by SH2A IIRC, and the intent was not to
> > have such overlap)
> 
> Yeah, it overlaps with the first 16 bit word of the 32 bit SH2A
> load/store insns.
> 
> > so I'm holding off on submitting it until the
> > hardware side works out this issue.
> 
> Sounds reasonable.

In the mean time, do you have any suggestsions on how the ISA level
stuff should be done to add J2 on the binutils side?

Thanks for reviewing.

Rich
Oleg Endo Sept. 2, 2015, 3:16 p.m. UTC | #5
On 02 Sep 2015, at 02:08, Rich Felker <dalias@libc.org> wrote:

> On Wed, Sep 02, 2015 at 01:24:55AM +0900, Oleg Endo wrote:
>>> I'm not sure what the best way to achieve multiple goals is, but the
>>> current behavior makes it so you need --isa=any (and a final binary
>>> with weird ABI tag) to have a binary that supports atomic operations
>>> on any SH model. musl libc already has such support (except the new J2
>>> CAS instruction) and I would like to eventually provide a libatomic
>>> approach for GCC too so that it's possible to use __sync/C11 atomics
>>> and have the binary be safe to run on any model that supports the
>>> baseline ISA & ABI you built for (e.g. all >=SH2 if you used -m2).
>> 
>> I don't know the details of your implementation. The compiler
>> generated atomic sequences are not really compatible. The safest
>> thing is not to enable any atomic model in GCC and let it emit
>> function calls to __atomic*.
> 
> Exactly -- but then, libatomic.a needs to contain J2-specific cas.l
> opcodes and SH4A-specific movli.l/movco.l opcodes and code that
> selects at runtime which to use (or whether to use imask or gusa)
> based on hwcap, etc. The point is that a mix of opcodes for different
> ISA levels end up being in the final binary, which might otherwise be
> targeted for SH-2 baseline so it can run on any of them.
> 
>>> I have a patch for that part, just not expanding the
>>> already-very-complex SH "family-tree" of instruction support. However
>>> it's likely that encoding details will change (the draft encoding
>>> overlaps with something used by SH2A IIRC, and the intent was not to
>>> have such overlap)
>> 
>> Yeah, it overlaps with the first 16 bit word of the 32 bit SH2A
>> load/store insns.
>> 
>>> so I'm holding off on submitting it until the
>>> hardware side works out this issue.
>> 
>> Sounds reasonable.
> 
> In the mean time, do you have any suggestsions on how the ISA level
> stuff should be done to add J2 on the binutils side?

Let's continue this topic on the binutils list
( https://sourceware.org/ml/binutils/2015-09/msg00031.html )
diff mbox

Patch

--- gcc-5.2.0.orig/gcc/config.gcc
+++ gcc-5.2.0/gcc/config.gcc
@@ -2668,7 +2671,7 @@ 
 	  sh2a-single-only | sh2a-single | sh2a-nofpu | sh2a | \
 	  sh4a-single-only | sh4a-single | sh4a-nofpu | sh4a | sh4al | \
 	  sh4-single-only | sh4-single | sh4-nofpu | sh4 | sh4-300 | \
-	  sh3e | sh3 | sh2e | sh2 | sh1) ;;
+	  sh3e | sh3 | sh2e | sh2 | sh1 | shj2 ) ;;
 	"")	sh_cpu_default=${sh_cpu_target} ;;
 	*)	echo "with_cpu=$with_cpu not supported"; exit 1 ;;
 	esac
@@ -2687,9 +2690,9 @@ 
 			sh_multilibs="`echo $sh_multilibs|sed -e s/m4/sh4-nofpu/ -e s/,m4-[^,]*//g -e s/,m[23]e// -e s/m2a,m2a-single/m2a-nofpu/ -e s/m5-..m....,//g`"
 		fi
 	fi
-	target_cpu_default=SELECT_`echo ${sh_cpu_default}|tr abcdefghijklmnopqrstuvwxyz- ABCDEFGHIJKLMNOPQRSTUVWXYZ_`
+	target_cpu_default=SELECT_`echo ${sh_cpu_default}|sed 's/^shj/j/'|tr abcdefghijklmnopqrstuvwxyz- ABCDEFGHIJKLMNOPQRSTUVWXYZ_`
 	tm_defines=${tm_defines}' SH_MULTILIB_CPU_DEFAULT=\"'`echo $sh_cpu_default|sed s/sh/m/`'\"'
-	tm_defines="$tm_defines SUPPORT_`echo $sh_cpu_default | sed 's/^m/sh/' | tr abcdefghijklmnopqrstuvwxyz- ABCDEFGHIJKLMNOPQRSTUVWXYZ_`=1"
+	tm_defines="$tm_defines SUPPORT_`echo $sh_cpu_default | sed -e 's/^m/sh/' -e 's/^shj/j/' | tr abcdefghijklmnopqrstuvwxyz- ABCDEFGHIJKLMNOPQRSTUVWXYZ_`=1"
 	sh_multilibs=`echo $sh_multilibs | sed -e 's/,/ /g' -e 's/^[Ss][Hh]/m/' -e 's/ [Ss][Hh]/ m/g' | tr ABCDEFGHIJKLMNOPQRSTUVWXYZ_ abcdefghijklmnopqrstuvwxyz-`
 	for sh_multilib in ${sh_multilibs}; do
 		case ${sh_multilib} in
@@ -4106,6 +4109,8 @@ 
 			;;
 		m4a | m4a-single | m4a-single-only | m4a-nofpu | m4al)
 		        ;;
+		mj2)
+			;;
 		*)
 			echo "Unknown CPU used in --with-cpu=$with_cpu, known values:"  1>&2
 			echo "m1 m2 m2e m3 m3e m4 m4-single m4-single-only m4-nofpu" 1>&2
--- gcc-5.2.0.orig/gcc/config/sh/sh.h
+++ gcc-5.2.0/gcc/config/sh/sh.h
@@ -139,6 +139,7 @@ 
 #define SELECT_SH2A_SINGLE	 (MASK_SH_E | MASK_HARD_SH2A \
 				  | MASK_FPU_SINGLE | MASK_HARD_SH2A_DOUBLE \
 				  | MASK_SH2 | MASK_SH1)
+#define SELECT_J2		 (MASK_SH2 | MASK_J2 | SELECT_SH1)
 #define SELECT_SH3		 (MASK_SH3 | SELECT_SH2)
 #define SELECT_SH3E		 (MASK_SH_E | MASK_FPU_SINGLE | SELECT_SH3)
 #define SELECT_SH4_NOFPU	 (MASK_HARD_SH4 | SELECT_SH3)
@@ -162,6 +163,9 @@ 
 #define SUPPORT_SH2 1
 #endif
 #if SUPPORT_SH2
+#define SUPPORT_J2 1
+#endif
+#ifdef SUPPORT_J2
 #define SUPPORT_SH3 1
 #define SUPPORT_SH2A_NOFPU 1
 #endif
@@ -211,7 +215,7 @@ 
 #define MASK_ARCH (MASK_SH1 | MASK_SH2 | MASK_SH3 | MASK_SH_E | MASK_SH4 \
 		   | MASK_HARD_SH2A | MASK_HARD_SH2A_DOUBLE | MASK_SH4A \
 		   | MASK_HARD_SH4 | MASK_FPU_SINGLE | MASK_SH5 \
-		   | MASK_FPU_SINGLE_ONLY)
+		   | MASK_FPU_SINGLE_ONLY | MASK_J2)
 
 /* This defaults us to big-endian.  */
 #ifndef TARGET_ENDIAN_DEFAULT
@@ -271,6 +275,7 @@ 
 %(subtarget_asm_isa_spec) %(subtarget_asm_spec) \
 %{m1:--isa=sh} \
 %{m2:--isa=sh2} \
+%{mj2:--isa=any} \
 %{m2e:--isa=sh2e} \
 %{m3:--isa=sh3} \
 %{m3e:--isa=sh3e} \
@@ -1834,7 +1839,7 @@ 
 
 /* Nonzero if the target supports dynamic shift instructions
    like shad and shld.  */
-#define TARGET_DYNSHIFT (TARGET_SH3 || TARGET_SH2A)
+#define TARGET_DYNSHIFT (TARGET_SH3 || TARGET_SH2A || TARGET_J2)
 
 /* The cost of using the dynamic shift insns (shad, shld) are the same
    if they are available.  If they are not available a library function will
--- gcc-5.2.0.orig/gcc/config/sh/sh.opt
+++ gcc-5.2.0/gcc/config/sh/sh.opt
@@ -71,6 +71,10 @@ 
 Target RejectNegative Condition(SUPPORT_SH2E)
 Generate SH2e code
 
+mj2
+Target RejectNegative Mask(J2) Condition(SUPPORT_J2)
+Generate J2 code
+
 m3
 Target RejectNegative Mask(SH3) Condition(SUPPORT_SH3)
 Generate SH3 code