Message ID | 1440433079-14458-31-git-send-email-rth@twiddle.net |
---|---|
State | New |
Headers | show |
On 08/24/2015 09:17 AM, Richard Henderson wrote: > Signed-off-by: Richard Henderson <rth@twiddle.net> > --- > target-tilegx/translate.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++- > 1 file changed, 49 insertions(+), 1 deletion(-) > > diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c > index 210e912..2a0798a 100644 > --- a/target-tilegx/translate.c > +++ b/target-tilegx/translate.c > @@ -180,6 +180,19 @@ static void gen_saturate_op(TCGv tdest, TCGv tsrca, TCGv tsrcb, > tcg_temp_free(t0); > } > > +static void gen_atomic_excp(DisasContext *dc, unsigned dest, unsigned srca, > + unsigned srcb, TileExcp excp) > +{ > +#ifdef CONFIG_USER_ONLY > + TCGv_i32 t = tcg_const_i32((dest << 16) | (srca << 8) | srcb); > + tcg_gen_st_i32(t, cpu_env, offsetof(CPUTLGState, excparam)); > + tcg_temp_free_i32(t); > + gen_exception(dc, excp); > +#else > + gen_exception(dc, TILEGX_EXCP_OPCODE_UNIMPLEMENTED); > +#endif > +} This is broken. While it does work well enough for Hello World, implementing a non-trap instruction with an exception is extremely dicey for TileGX. The issue is that TileGX bundles operate atomically, with no RAW issues between the instructions of the bundle. Consider a bundle like { add r0, r0, r1 ; exch r2, r0, r3 } In Chen's implementation, the writeback to r0 would occur before the exception, and so the exch would happen to the wrong address. In my implementation here, the exception would occur before the writeback, and so the result of the add would be discarded. While retaining the current start_exclusive scheme, it would appear that we need to store more data here -- to wit, the contents of the registers not their numbers -- and delay the exception until after writeback. Even better, of course, would be to not exit the TB at all, and use host atomic instructions... I suppose that multi-threading patch set is still in limbo? r~
On 8/25/15 12:15, Richard Henderson wrote: > On 08/24/2015 09:17 AM, Richard Henderson wrote: >> Signed-off-by: Richard Henderson <rth@twiddle.net> >> --- >> target-tilegx/translate.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++- >> 1 file changed, 49 insertions(+), 1 deletion(-) >> >> diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c >> index 210e912..2a0798a 100644 >> --- a/target-tilegx/translate.c >> +++ b/target-tilegx/translate.c >> @@ -180,6 +180,19 @@ static void gen_saturate_op(TCGv tdest, TCGv tsrca, TCGv tsrcb, >> tcg_temp_free(t0); >> } >> >> +static void gen_atomic_excp(DisasContext *dc, unsigned dest, unsigned srca, >> + unsigned srcb, TileExcp excp) >> +{ >> +#ifdef CONFIG_USER_ONLY >> + TCGv_i32 t = tcg_const_i32((dest << 16) | (srca << 8) | srcb); >> + tcg_gen_st_i32(t, cpu_env, offsetof(CPUTLGState, excparam)); >> + tcg_temp_free_i32(t); >> + gen_exception(dc, excp); >> +#else >> + gen_exception(dc, TILEGX_EXCP_OPCODE_UNIMPLEMENTED); >> +#endif >> +} Originally, I used set_exception(), not gen_exception(). > > > This is broken. While it does work well enough for Hello World, implementing a non-trap instruction with an exception is extremely dicey for TileGX. The issue is that TileGX bundles operate atomically, with no RAW issues between the instructions of the bundle. > > Consider a bundle like > > { add r0, r0, r1 ; exch r2, r0, r3 } > > In Chen's implementation, the writeback to r0 would occur before the exception, and so the exch would happen to the wrong address. In my implementation here, the exception would occur before the writeback, and so the result of the add would be discarded. We use tmp regs for buffering the r0. - calculate x1 pipe, and save result to r0 tmp reg. - exch the original r0 and r3 to r2 tmp reg. - set exception flag (which will cause exception, later). - save the result tmp regs to r0 or r2. - gen exception. > > While retaining the current start_exclusive scheme, it would appear that we need to store more data here -- to wit, the contents of the registers not their numbers -- and delay the exception until after writeback. > > Even better, of course, would be to not exit the TB at all, and use host atomic instructions... I suppose that multi-threading patch set is still in limbo? > I guess, we need not. Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
---------------------------------------- > From: xili_gchen_5257@hotmail.com > To: rth@twiddle.net; qemu-devel@nongnu.org > CC: walt@tilera.com; cmetcalf@ezchip.com; peter.maydell@linaro.org > Subject: Re: [Qemu-devel] [PATCH v14 30/33] target-tilegx: Handle atomic instructions > Date: Tue, 25 Aug 2015 21:11:11 +0800 > > On 8/25/15 12:15, Richard Henderson wrote: >> On 08/24/2015 09:17 AM, Richard Henderson wrote: >>> Signed-off-by: Richard Henderson <rth@twiddle.net> >>> --- >>> target-tilegx/translate.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++- >>> 1 file changed, 49 insertions(+), 1 deletion(-) >>> >>> diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c >>> index 210e912..2a0798a 100644 >>> --- a/target-tilegx/translate.c >>> +++ b/target-tilegx/translate.c >>> @@ -180,6 +180,19 @@ static void gen_saturate_op(TCGv tdest, TCGv tsrca, TCGv tsrcb, >>> tcg_temp_free(t0); >>> } >>> >>> +static void gen_atomic_excp(DisasContext *dc, unsigned dest, unsigned srca, >>> + unsigned srcb, TileExcp excp) >>> +{ >>> +#ifdef CONFIG_USER_ONLY >>> + TCGv_i32 t = tcg_const_i32((dest << 16) | (srca << 8) | srcb); >>> + tcg_gen_st_i32(t, cpu_env, offsetof(CPUTLGState, excparam)); >>> + tcg_temp_free_i32(t); >>> + gen_exception(dc, excp); >>> +#else >>> + gen_exception(dc, TILEGX_EXCP_OPCODE_UNIMPLEMENTED); >>> +#endif >>> +} > > Originally, I used set_exception(), not gen_exception(). > >> >> >> This is broken. While it does work well enough for Hello World, implementing a non-trap instruction with an exception is extremely dicey for TileGX. The issue is that TileGX bundles operate atomically, with no RAW issues between the instructions of the bundle. >> >> Consider a bundle like >> >> { add r0, r0, r1 ; exch r2, r0, r3 } >> >> In Chen's implementation, the writeback to r0 would occur before the exception, and so the exch would happen to the wrong address. In my implementation here, the exception would occur before the writeback, and so the result of the add would be discarded. > > We use tmp regs for buffering the r0. > > - calculate x1 pipe, and save result to r0 tmp reg. > Oh, typo, calculate x0 pipe, and save result to r0 tmp reg. > - exch the original r0 and r3 to r2 tmp reg. > > - set exception flag (which will cause exception, later). > > - save the result tmp regs to r0 or r2. > > - gen exception. > >> >> While retaining the current start_exclusive scheme, it would appear that we need to store more data here -- to wit, the contents of the registers not their numbers -- and delay the exception until after writeback. >> >> Even better, of course, would be to not exit the TB at all, and use host atomic instructions... I suppose that multi-threading patch set is still in limbo? >> > > I guess, we need not. > > > Thanks. > -- > Chen Gang > > Open, share, and attitude like air, water, and life which God blessed >
On 08/25/2015 06:12 AM, Chen Gang wrote: > > > ---------------------------------------- >> From: xili_gchen_5257@hotmail.com >> To: rth@twiddle.net; qemu-devel@nongnu.org >> CC: walt@tilera.com; cmetcalf@ezchip.com; peter.maydell@linaro.org >> Subject: Re: [Qemu-devel] [PATCH v14 30/33] target-tilegx: Handle atomic instructions >> Date: Tue, 25 Aug 2015 21:11:11 +0800 >> >> On 8/25/15 12:15, Richard Henderson wrote: >>> On 08/24/2015 09:17 AM, Richard Henderson wrote: >>>> Signed-off-by: Richard Henderson <rth@twiddle.net> >>>> --- >>>> target-tilegx/translate.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++- >>>> 1 file changed, 49 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c >>>> index 210e912..2a0798a 100644 >>>> --- a/target-tilegx/translate.c >>>> +++ b/target-tilegx/translate.c >>>> @@ -180,6 +180,19 @@ static void gen_saturate_op(TCGv tdest, TCGv tsrca, TCGv tsrcb, >>>> tcg_temp_free(t0); >>>> } >>>> >>>> +static void gen_atomic_excp(DisasContext *dc, unsigned dest, unsigned srca, >>>> + unsigned srcb, TileExcp excp) >>>> +{ >>>> +#ifdef CONFIG_USER_ONLY >>>> + TCGv_i32 t = tcg_const_i32((dest << 16) | (srca << 8) | srcb); >>>> + tcg_gen_st_i32(t, cpu_env, offsetof(CPUTLGState, excparam)); >>>> + tcg_temp_free_i32(t); >>>> + gen_exception(dc, excp); >>>> +#else >>>> + gen_exception(dc, TILEGX_EXCP_OPCODE_UNIMPLEMENTED); >>>> +#endif >>>> +} >> >> Originally, I used set_exception(), not gen_exception(). >> >>> >>> >>> This is broken. While it does work well enough for Hello World, implementing a non-trap instruction with an exception is extremely dicey for TileGX. The issue is that TileGX bundles operate atomically, with no RAW issues between the instructions of the bundle. >>> >>> Consider a bundle like >>> >>> { add r0, r0, r1 ; exch r2, r0, r3 } >>> >>> In Chen's implementation, the writeback to r0 would occur before the exception, and so the exch would happen to the wrong address. In my implementation here, the exception would occur before the writeback, and so the result of the add would be discarded. >> >> We use tmp regs for buffering the r0. >> >> - calculate x1 pipe, and save result to r0 tmp reg. >> > > Oh, typo, calculate x0 pipe, and save result to r0 tmp reg. > >> - exch the original r0 and r3 to r2 tmp reg. >> >> - set exception flag (which will cause exception, later). >> >> - save the result tmp regs to r0 or r2. >> >> - gen exception. Exactly. Now re-read what I wrote and see if you can spot the problem with this. r~
On 8/25/15 22:28, Richard Henderson wrote: > On 08/25/2015 06:12 AM, Chen Gang wrote: >>>> >>>> Consider a bundle like >>>> >>>> { add r0, r0, r1 ; exch r2, r0, r3 } >>>> >>>> In Chen's implementation, the writeback to r0 would occur before the exception, and so the exch would happen to the wrong address. In my implementation here, the exception would occur before the writeback, and so the result of the add would be discarded. >>> >>> We use tmp regs for buffering the r0. >>> >>> - calculate x1 pipe, and save result to r0 tmp reg. >>> >> >> Oh, typo, calculate x0 pipe, and save result to r0 tmp reg. >> >>> - exch the original r0 and r3 to r2 tmp reg. >>> >>> - set exception flag (which will cause exception, later). >>> >>> - save the result tmp regs to r0 or r2. >>> >>> - gen exception. > > Exactly. Now re-read what I wrote and see if you can spot the problem with this. > OK, thanks. In my memory, originally, we discussed about related things, you provided several good ideas (temp regs, and gen exceptions ...). Based on them, I implemented it in this way, hope it is correct. At present, for gcc testsuite, I met several (10+) pending issues, which need to be fixed, next. So I guess, there must be still any issues which we did not find in the current whole tilegx code. Thanks. -- Chen Gang Open, share, and attitude like air, water, and life which God blessed
diff --git a/target-tilegx/translate.c b/target-tilegx/translate.c index 210e912..2a0798a 100644 --- a/target-tilegx/translate.c +++ b/target-tilegx/translate.c @@ -180,6 +180,19 @@ static void gen_saturate_op(TCGv tdest, TCGv tsrca, TCGv tsrcb, tcg_temp_free(t0); } +static void gen_atomic_excp(DisasContext *dc, unsigned dest, unsigned srca, + unsigned srcb, TileExcp excp) +{ +#ifdef CONFIG_USER_ONLY + TCGv_i32 t = tcg_const_i32((dest << 16) | (srca << 8) | srcb); + tcg_gen_st_i32(t, cpu_env, offsetof(CPUTLGState, excparam)); + tcg_temp_free_i32(t); + gen_exception(dc, excp); +#else + gen_exception(dc, TILEGX_EXCP_OPCODE_UNIMPLEMENTED); +#endif +} + static void gen_dblaligni(TCGv tdest, TCGv tsrca, TCGv tsrcb, int shr) { TCGv t0 = tcg_temp_new(); @@ -595,8 +608,13 @@ static TileExcp gen_rrr_opcode(DisasContext *dc, unsigned opext, mnemonic = "cmpeq"; break; case OE_RRR(CMPEXCH4, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_CMPEXCH4); + mnemonic = "cmpexch4"; + break; case OE_RRR(CMPEXCH, 0, X1): - return TILEGX_EXCP_OPCODE_UNIMPLEMENTED; + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_CMPEXCH); + mnemonic = "cmpexch"; + break; case OE_RRR(CMPLES, 0, X0): case OE_RRR(CMPLES, 0, X1): case OE_RRR(CMPLES, 2, Y0): @@ -662,7 +680,13 @@ static TileExcp gen_rrr_opcode(DisasContext *dc, unsigned opext, mnemonic = "dblalign"; break; case OE_RRR(EXCH4, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_EXCH4); + mnemonic = "exch4"; + break; case OE_RRR(EXCH, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_EXCH); + mnemonic = "exch"; + break; case OE_RRR(FDOUBLE_ADDSUB, 0, X0): case OE_RRR(FDOUBLE_ADD_FLAGS, 0, X0): case OE_RRR(FDOUBLE_MUL_FLAGS, 0, X0): @@ -672,13 +696,37 @@ static TileExcp gen_rrr_opcode(DisasContext *dc, unsigned opext, case OE_RRR(FDOUBLE_UNPACK_MAX, 0, X0): case OE_RRR(FDOUBLE_UNPACK_MIN, 0, X0): case OE_RRR(FETCHADD4, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHADD4); + mnemonic = "fetchadd4"; + break; case OE_RRR(FETCHADDGEZ4, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHADDGEZ4); + mnemonic = "fetchaddgez4"; + break; case OE_RRR(FETCHADDGEZ, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHADDGEZ); + mnemonic = "fetchaddgez"; + break; case OE_RRR(FETCHADD, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHADD); + mnemonic = "fetchadd"; + break; case OE_RRR(FETCHAND4, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHAND4); + mnemonic = "fetchand4"; + break; case OE_RRR(FETCHAND, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHAND); + mnemonic = "fetchand"; + break; case OE_RRR(FETCHOR4, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHOR4); + mnemonic = "fetchor4"; + break; case OE_RRR(FETCHOR, 0, X1): + gen_atomic_excp(dc, dest, srca, srcb, TILEGX_EXCP_OPCODE_FETCHOR); + mnemonic = "fetchor"; + break; case OE_RRR(FSINGLE_ADD1, 0, X0): case OE_RRR(FSINGLE_ADDSUB2, 0, X0): case OE_RRR(FSINGLE_MUL1, 0, X0):
Signed-off-by: Richard Henderson <rth@twiddle.net> --- target-tilegx/translate.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 49 insertions(+), 1 deletion(-)