diff mbox

[RFC] alpha qemu arithmetic exceptions

Message ID 20140624043423.GX18016@ZenIV.linux.org.uk
State New
Headers show

Commit Message

Al Viro June 24, 2014, 4:34 a.m. UTC
First of all, kudos - with current qemu tree qemu-alpha-system is
working pretty well - debian install and a *lot* of builds work just fine.
As in, getting from lenny to pretty complete squeeze toolchain, including gcj,
openjdk6 and a lot of crap needed to satisfy build-deps of those, plus all
priority:required and most of priority:important ones.  It's a _lot_ of
beating and the damn thing survives - the problems had been with debian
packages themselves (fstatat() bug in lenny libc, epically buggered build-deps
in gcc-defaults, etc.).  As it is, one core of 6-way 3.3GHz phenom II is quite
capable of running a home-grown autobuilder.  Feels like ~250-300MHz alpha
with a very fast disk...

	Remaining problems, AFAICS, are around floating point traps.
I've found one in glibc testsuite (math/tests-misc.c; overflow in
ADDS/SU ends up with wrong results from fetestexcept() - only FE_OVERFLOW is
set, while the sucker expects FE_INEXACT as well and actual hardware sets both)
and another in gcc one (with -funsafe-math-optimizations CVTST/S on denorms
triggers SIGFPE/FPE_FLTINV).

	The libc one is a bug in gen_fp_exc_raise_ignore() - the difference
between ADDS/SU and ADDS/SUI is only in trapping, not storing results in
FPCR.INE and friends.  Both will have the same effect on those and
    if (ignore) {
        tcg_gen_andi_i32(exc, exc, ~ignore);
    }
in gen_fp_exc_raise_ignore() leads to exc & ignore not reaching the
update of env->fpcr_exc_status in helper_fp_exc_raise_s().  See 4.7.8:
[quote]
	In addition, the FPCR gives a summary of each exception type for the 
	exception conditions detected by all IEEE floating-point operates thus
	far, as well as an overall summary bit that indicates whether any of
	these exception conditions has been detected. The indiividual exception
	bits match exactly in purpose and order the exception bits found in the
	exception summary quadword that is pushed for arithmetic traps. However,
	for each instruction, these exception bits arse set independent of the
	trapping mode specified for the instruction. Therefore, even though
	trapping may be disabled for a certain exceptional condition, the fact
	that the exceptional condition was encountered by an instruction is
	still recorded in the FPCR.
[end quote]
And yes, on actual hardware both ADDS/SU and ADDS/SUI set FPCR.INE the same
way - verified by direct experiment.

While we are at it, I'm not quite sure what plain ADDS will leave in FPCR.INE
if it traps on overflow.  It's probably entirely academical, but it might be
worth checking if ELF ABI for Alpha has anything to say about the state seen
by fetestexcept() in SIGFPE handler...

Not sure what's the decent way to fix that - we could, of course, follow that
    tcg_gen_ld8u_i32(exc, cpu_env,
                     offsetof(CPUAlphaState, fp_status.float_exception_flags));
with generating code that would do |= into ->fpcr_exc_status, but I don't
know if we'd blow the footprint to hell by doing so.  Alternatively, we could
do that in helpers called before we start raising exceptions, and I really
wonder what happens with plain CVTQS - do we get FPCR.INE set there anyway?
If so, we really have to do it at least in that helper...  Comments?

	The gcc one comes from the fact that we never set EXC_M_SWC,
whether we come from helper_fp_exc_raise() or from helper_fp_exc_raise_s(),
so kernel-side do_entArith() skips
        if (summary & 1) {
                /* Software-completion summary bit is set, so try to
                   emulate the instruction.  If the processor supports
                   precise exceptions, we don't have to search.  */
                if (!amask(AMASK_PRECISE_TRAP))
                        si_code = alpha_fp_emul(regs->pc - 4);
                else
                        si_code = alpha_fp_emul_imprecise(regs, write_mask);
                if (si_code == 0)
                        return;
        }  
and buggers off to raise SIGFPE.  That's easy to fix, but... we also
get trap PC pointing to offending instruction, not the next one after it,
as 4.7.7.3 would require and as the kernel expects.

	I'm not sure what to do with trap PC - bumping env->pc by 4 after
cpu_restore_state() in dynamic_excp() seems to work, but I'm not at all sure
it's correct; I don't know qemu well enough to tell.

	Anyway, delta that seems to fix the gcc one (gcc.dg/pr28796-2.c from
gcc-4.3 and later) follows.  Again, I'm not at all sure if handling of
env->pc in there is safe from qemu POV and I'd like like to get comments on
that from somebody more familiar with qemu guts.

Comments

Al Viro June 24, 2014, 4:52 p.m. UTC | #1
On Tue, Jun 24, 2014 at 05:34:23AM +0100, Al Viro wrote:
> 	First of all, kudos - with current qemu tree qemu-alpha-system is
> working pretty well - debian install and a *lot* of builds work just fine.
> As in, getting from lenny to pretty complete squeeze toolchain, including gcj,
> openjdk6 and a lot of crap needed to satisfy build-deps of those, plus all
> priority:required and most of priority:important ones.  It's a _lot_ of
> beating and the damn thing survives - the problems had been with debian
> packages themselves (fstatat() bug in lenny libc, epically buggered build-deps
> in gcc-defaults, etc.).  As it is, one core of 6-way 3.3GHz phenom II is quite
> capable of running a home-grown autobuilder.  Feels like ~250-300MHz alpha
> with a very fast disk...
> 
> 	Remaining problems, AFAICS, are around floating point traps.
> I've found one in glibc testsuite (math/tests-misc.c; overflow in
> ADDS/SU ends up with wrong results from fetestexcept() - only FE_OVERFLOW is
> set, while the sucker expects FE_INEXACT as well and actual hardware sets both)
> and another in gcc one (with -funsafe-math-optimizations CVTST/S on denorms
> triggers SIGFPE/FPE_FLTINV).
> 
> 	The libc one is a bug in gen_fp_exc_raise_ignore() - the difference
> between ADDS/SU and ADDS/SUI is only in trapping, not storing results in
> FPCR.INE and friends.  Both will have the same effect on those and
>     if (ignore) {
>         tcg_gen_andi_i32(exc, exc, ~ignore);
>     }
> in gen_fp_exc_raise_ignore() leads to exc & ignore not reaching the
> update of env->fpcr_exc_status in helper_fp_exc_raise_s().  See 4.7.8:
> [quote]
> 	In addition, the FPCR gives a summary of each exception type for the 
> 	exception conditions detected by all IEEE floating-point operates thus
> 	far, as well as an overall summary bit that indicates whether any of
> 	these exception conditions has been detected. The indiividual exception
> 	bits match exactly in purpose and order the exception bits found in the
> 	exception summary quadword that is pushed for arithmetic traps. However,
> 	for each instruction, these exception bits arse set independent of the
> 	trapping mode specified for the instruction. Therefore, even though
> 	trapping may be disabled for a certain exceptional condition, the fact
> 	that the exceptional condition was encountered by an instruction is
> 	still recorded in the FPCR.
> [end quote]
> And yes, on actual hardware both ADDS/SU and ADDS/SUI set FPCR.INE the same
> way - verified by direct experiment.

BTW, here's another testcase:
nclude <stdio.h>

unsigned long __attribute__((noinline)) f(double x)
{
        return (unsigned long)x;        // SVCTQ/SVC
}

main()
{
        unsigned long x;
        extern unsigned long __ieee_get_fp_control(void);
        printf("before:%lx\n", __ieee_get_fp_control());
        x = f(1ULL<<63);
        printf("after:%lx\n", __ieee_get_fp_control());
        printf("result:%lx\n", x);
}

On actual hardware:
before:0
after:20000
result:8000000000000000

On qemu:
before:0
after:0
result:8000000000000000


IOW, gen_fcvttq() is also affected, not only gen_fp_exc_raise().

Can't we simply have separate helpers for various trap suffices, with
all this work on getting exc, etc. taken inside them?  It's not as if
we distinguished many variants, after all...  Right now we have:
	plain, /U, /V
	/S, /SU
	/SUI
	/SV
	/SVI
and /SU should probably be separated from /S - we do want to suppress underflow
traps on those (again, FPCR.UND should be set regardless).  That's what, 5 or 6
helpers?  Might want to separate /V and /U from plain - AFAICS, we get it
wrong with things like ADDS/U vs. ADDS (it's just that normally underflow
traps are disabled by FPCR.DUND).  I hadn't experimented with those yet, but
even if it turns out that they *are* different - 8 helpers instead of the 2 we
currently have, sharing most of the actual source...

Another thing: shouldn't arithmetics on denorms without /S raise EXC_M_INV,
rather than EXC_M_UNF?
Richard Henderson June 24, 2014, 6:23 p.m. UTC | #2
On 06/23/2014 09:34 PM, Al Viro wrote:
> 	Anyway, delta that seems to fix the gcc one (gcc.dg/pr28796-2.c from
> gcc-4.3 and later) follows.  Again, I'm not at all sure if handling of
> env->pc in there is safe from qemu POV and I'd like like to get comments on
> that from somebody more familiar with qemu guts.

Thanks for the diagnosis on the gcc test case.  I've been meaning to
investigate some of these edge cases for quite a while and never quite
got there.

>  static inline void inline_fp_exc_raise(CPUAlphaState *env, uintptr_t retaddr,
> -                                       uint32_t exc, uint32_t regno)
> +                                       uint32_t exc, uint32_t regno, uint32_t sw)
>  {
>      if (exc) {
> -        uint32_t hw_exc = 0;
> +        uint32_t hw_exc = sw;
>  
>          if (exc & float_flag_invalid) {
>              hw_exc |= EXC_M_INV;
> @@ -75,7 +75,7 @@ static inline void inline_fp_exc_raise(CPUAlphaState *env, uintptr_t retaddr,
>     doesn't apply.  */
>  void helper_fp_exc_raise(CPUAlphaState *env, uint32_t exc, uint32_t regno)
>  {
> -    inline_fp_exc_raise(env, GETPC(), exc, regno);
> +    inline_fp_exc_raise(env, GETPC(), exc, regno, 0);
>  }
>  
>  /* Raise exceptions for ieee fp insns with software completion.  */
> @@ -84,7 +84,7 @@ void helper_fp_exc_raise_s(CPUAlphaState *env, uint32_t exc, uint32_t regno)
>      if (exc) {
>          env->fpcr_exc_status |= exc;
>          exc &= ~env->fpcr_exc_mask;
> -        inline_fp_exc_raise(env, GETPC(), exc, regno);
> +        inline_fp_exc_raise(env, GETPC(), exc, regno, EXC_M_SWC);
>      }
>  }

This part looks good.

> diff --git a/target-alpha/helper.c b/target-alpha/helper.c
> index 7c053a3..538c6b2 100644
> --- a/target-alpha/helper.c
> +++ b/target-alpha/helper.c
> @@ -527,6 +527,7 @@ void QEMU_NORETURN dynamic_excp(CPUAlphaState *env, uintptr_t retaddr,
>      env->error_code = error;
>      if (retaddr) {
>          cpu_restore_state(cs, retaddr);
> +	env->pc += 4;

This one needs a different fix, since dynamic_excp is also used from
alpha_cpu_unassigned_access, and I'm pretty sure the mchk should have the
address of the memory insn.  But that should be easy to fix up.


r~
Richard Henderson June 24, 2014, 6:33 p.m. UTC | #3
On 06/24/2014 09:52 AM, Al Viro wrote:
> unsigned long __attribute__((noinline)) f(double x)
> {
>         return (unsigned long)x;        // SVCTQ/SVC
> }
> 
> main()
> {
>         unsigned long x;
>         extern unsigned long __ieee_get_fp_control(void);
>         printf("before:%lx\n", __ieee_get_fp_control());
>         x = f(1ULL<<63);
>         printf("after:%lx\n", __ieee_get_fp_control());
>         printf("result:%lx\n", x);
> }
> 
> On actual hardware:
> before:0
> after:20000
> result:8000000000000000
> 
> On qemu:
> before:0
> after:0
> result:8000000000000000
> 
> 
> IOW, gen_fcvttq() is also affected, not only gen_fp_exc_raise().

Clearly a gross misunderstanding of what bits are actually computed, never mind
what gets signaled.

Thanks for the test.  I've not had working hardware for a couple of years to
validate what's supposed to get set and what isn't.

> Can't we simply have separate helpers for various trap suffices, with
> all this work on getting exc, etc. taken inside them?  It's not as if
> we distinguished many variants, after all...  Right now we have:
> 	plain, /U, /V
> 	/S, /SU
> 	/SUI
> 	/SV
> 	/SVI

We used to have separate helpers... at least for the modes that had been
implemented at the time.  The combinatorial explosion ugly though -- 4
different versions of add, sub, etc.  I thought the partial inlining was a
decent solution, as far as maintainability, but it's not unreasonable to disagree.

> Another thing: shouldn't arithmetics on denorms without /S raise EXC_M_INV,
> rather than EXC_M_UNF?

No idea.  Should they?


r~
Al Viro June 24, 2014, 8:32 p.m. UTC | #4
On Tue, Jun 24, 2014 at 11:33:32AM -0700, Richard Henderson wrote:
> >         return (unsigned long)x;        // SVCTQ/SVC
					       CVTTQ/SVC, of course...
> Clearly a gross misunderstanding of what bits are actually computed, never mind
> what gets signaled.
> 
> Thanks for the test.  I've not had working hardware for a couple of years to
> validate what's supposed to get set and what isn't.

If you have any ideas for testing, I do have working hw (the box that is
currently alive is ev45, though; I _can_ try to boot a UP1000 one, but
I make no promises regarding its fans, both in PSU and in CPU module ;-/)

> > Can't we simply have separate helpers for various trap suffices, with
> > all this work on getting exc, etc. taken inside them?  It's not as if
> > we distinguished many variants, after all...  Right now we have:
> > 	plain, /U, /V
> > 	/S, /SU
> > 	/SUI
> > 	/SV
> > 	/SVI
> 
> We used to have separate helpers... at least for the modes that had been
> implemented at the time.  The combinatorial explosion ugly though -- 4
> different versions of add, sub, etc.  I thought the partial inlining was a
> decent solution, as far as maintainability, but it's not unreasonable to disagree.

Um?  No, I mean having gen_fp_exc_raise() generate a call of one of the 8
helpers; gen_ieee_arith3() and friends would remain as-is.  It's just that
instead of generating load to exc, andi, call of helper_fp_exc_raise_s or
call of helper_fp_exc_raise we would generate a call of one of the
helper_fp_exc_raise{,_u,_v,_s,_su,_sui,_sv,_svi} and let that sucker deal
with loading exc, updating ->fpcr_exc_status and generating traps.

> > Another thing: shouldn't arithmetics on denorms without /S raise EXC_M_INV,
> > rather than EXC_M_UNF?
> 
> No idea.  Should they?

They seem to - both from the arch.manual and from direct experiment...  And
they do set FPCR.INV at the same time, not just trigger the trap.
Al Viro June 24, 2014, 8:47 p.m. UTC | #5
On Tue, Jun 24, 2014 at 11:23:01AM -0700, Richard Henderson wrote:
> >      env->error_code = error;
> >      if (retaddr) {
> >          cpu_restore_state(cs, retaddr);
> > +	env->pc += 4;
> 
> This one needs a different fix, since dynamic_excp is also used from
> alpha_cpu_unassigned_access, and I'm pretty sure the mchk should have the
> address of the memory insn.  But that should be easy to fix up.

That's not a problem, actually - there we have
    dynamic_excp(env, 0, EXCP_MCHK, 0);
so retaddr is going to be 0 and that env->pc += 4 won't be reached at all...
Richard Henderson June 24, 2014, 8:57 p.m. UTC | #6
On 06/24/2014 01:32 PM, Al Viro wrote:
> If you have any ideas for testing, I do have working hw (the box that is
> currently alive is ev45, though; I _can_ try to boot a UP1000 one, but
> I make no promises regarding its fans, both in PSU and in CPU module ;-/)

Ah.  Gotta be careful with ev4/45... half of the fpu is unimplemented, and so
if you're not careful all you're testing is the kernel emulation behaviour.

> Um?  No, I mean having gen_fp_exc_raise() generate a call of one of the 8
> helpers; gen_ieee_arith3() and friends would remain as-is.  It's just that
> instead of generating load to exc, andi, call of helper_fp_exc_raise_s or
> call of helper_fp_exc_raise we would generate a call of one of the
> helper_fp_exc_raise{,_u,_v,_s,_su,_sui,_sv,_svi} and let that sucker deal
> with loading exc, updating ->fpcr_exc_status and generating traps.

Ah, I getcha.  Yes, that makes sense.

>>> Another thing: shouldn't arithmetics on denorms without /S raise EXC_M_INV,
>>> rather than EXC_M_UNF?
>>
>> No idea.  Should they?
> 
> They seem to - both from the arch.manual and from direct experiment...  And
> they do set FPCR.INV at the same time, not just trigger the trap.

Ok.  I'll try to make time to fix up some of this stuff this weekend.


r~
Al Viro June 24, 2014, 9:24 p.m. UTC | #7
On Tue, Jun 24, 2014 at 01:57:52PM -0700, Richard Henderson wrote:
> On 06/24/2014 01:32 PM, Al Viro wrote:
> > If you have any ideas for testing, I do have working hw (the box that is
> > currently alive is ev45, though; I _can_ try to boot a UP1000 one, but
> > I make no promises regarding its fans, both in PSU and in CPU module ;-/)
> 
> Ah.  Gotta be careful with ev4/45... half of the fpu is unimplemented, and so
> if you're not careful all you're testing is the kernel emulation behaviour.

*nod*

> > Um?  No, I mean having gen_fp_exc_raise() generate a call of one of the 8
> > helpers; gen_ieee_arith3() and friends would remain as-is.  It's just that
> > instead of generating load to exc, andi, call of helper_fp_exc_raise_s or
> > call of helper_fp_exc_raise we would generate a call of one of the
> > helper_fp_exc_raise{,_u,_v,_s,_su,_sui,_sv,_svi} and let that sucker deal
> > with loading exc, updating ->fpcr_exc_status and generating traps.
> 
> Ah, I getcha.  Yes, that makes sense.

FWIW, the crudest version of that is simply
+    env->fpcr_exc_status |= (uint8_t)env->fp_status.float_exception_flags;
in the very beginning of helper_fp_exc_raise_s().  And yes, it recovers
math/tests-misc.c, even though it's obviously not a good final fix.

Al, off to figure out the black magic TCG is using to generate calls...
Richard Henderson June 24, 2014, 9:32 p.m. UTC | #8
On 06/24/2014 02:24 PM, Al Viro wrote:
> Al, off to figure out the black magic TCG is using to generate calls...

If you've a helper

DEF_HELPER_1(halt, void, i64)

then

  gen_helper_halt(...)

will generate the tcg ops that result in the call.


r~
Al Viro June 25, 2014, 7:01 a.m. UTC | #9
On Tue, Jun 24, 2014 at 02:32:46PM -0700, Richard Henderson wrote:
> On 06/24/2014 02:24 PM, Al Viro wrote:
> > Al, off to figure out the black magic TCG is using to generate calls...
> 
> If you've a helper
> 
> DEF_HELPER_1(halt, void, i64)
> 
> then
> 
>   gen_helper_halt(...)
> 
> will generate the tcg ops that result in the call.

Another fun issue:

CVTTQ is both underreporting the overflow *AND* reports the wrong kind - FOV
instead of IOV.

	* it misses reporting overflows for case when it *knows* that
	  overflow will happen - the need to shift up by more than 63 bits.
	  Trivially fixed, of course.  There overflow cases leave wrong
	  result as well - should be 0.
	* it also misses reporting overflows for case when value is in
	  ranges 2^63..2^64-1 and -2^64+1..-2^63-1.  And yes, it's
	  asymmetric - 2^63 is an overflow, -2^63 isn't.
	* overflow is reported by float_raise(float_flag_overflow, &FP_STATUS).
	  Wrong flag - it should be IOV, not FOV.  And it should be set
	  in FPCR regardless of the trap modifier (IOV, this VI thing is
	  wrong - we should deal with that only when we generate a trap).
	* All overflow cases should raise INE as well.

Could we steal bit 1 in float_exception_flags for IOV?  It is (currently?)
unused -
enum {
    float_flag_invalid   =  1,
    float_flag_divbyzero =  4,
    float_flag_overflow  =  8,
    float_flag_underflow = 16,
    float_flag_inexact   = 32,
    float_flag_input_denormal = 64,
    float_flag_output_denormal = 128
};

That would allow to deal with that crap nicely - we could have it raise
the new flag, then have helper_fp_exc_raise_... for default trap mode
mask it out (and yes, we need to set FPCR flags in default mode, as well
as /U and /V - confirmed by direct experiment *and* by TFM).

If we can't... well, we could put that flag separately, but it would be
more unpleasant.  Folks?
Peter Maydell June 25, 2014, 9:27 a.m. UTC | #10
On 25 June 2014 08:01, Al Viro <viro@zeniv.linux.org.uk> wrote:
> Could we steal bit 1 in float_exception_flags for IOV?  It is (currently?)
> unused -
> enum {
>     float_flag_invalid   =  1,
>     float_flag_divbyzero =  4,
>     float_flag_overflow  =  8,
>     float_flag_underflow = 16,
>     float_flag_inexact   = 32,
>     float_flag_input_denormal = 64,
>     float_flag_output_denormal = 128
> };
>
> That would allow to deal with that crap nicely - we could have it raise
> the new flag, then have helper_fp_exc_raise_... for default trap mode
> mask it out (and yes, we need to set FPCR flags in default mode, as well
> as /U and /V - confirmed by direct experiment *and* by TFM).
>
> If we can't... well, we could put that flag separately, but it would be
> more unpleasant.  Folks?

I think it's OK to put extra float_flags in, provided you can define
their semantics in terms that make sense for more than one
architecture (even if only one arch actually happens to need them).
The input_denormal/output_denormal flags only get used for ARM,
for instance. However if you wanted to split overflow from integer
overflow you'd need to fix up all the other targets which expect
them to generate just one exception flag...

(Note that any patch touching softfloat files needs to come with
a statement that you're happy to license it under either the
softfloat-2a or softfloat-2b licenses, because we're currently
midway through the tedious process of trying to relicense it.)

thanks
-- PMM
Al Viro June 25, 2014, 2:26 p.m. UTC | #11
On Wed, Jun 25, 2014 at 10:27:11AM +0100, Peter Maydell wrote:

> I think it's OK to put extra float_flags in, provided you can define
> their semantics in terms that make sense for more than one
> architecture (even if only one arch actually happens to need them).
> The input_denormal/output_denormal flags only get used for ARM,
> for instance. However if you wanted to split overflow from integer
> overflow you'd need to fix up all the other targets which expect
> them to generate just one exception flag...

Hmm...  On alpha it's generated only by the following: CVTTQ, CVTGQ,
CVTQL.  I.e. conversions to integer formats that can be held in FPU
registers (double -> s64, VAX double -> s64 and s64 -> s32).  Does
softfloat even have anything similar?  As it is, it's all in alpha-specific
code; double -> s64 might have a chance to be generic (semantics:
	* denorms -> 0, raise "inexact", provided that they survived to
that point and hadn't buggered off with "invalid"
	* exact integers in range -2^63 .. 2^63-1 -> equivalent 64bit
integer
	* values outside of that range (all with zero fractional part,
since the weight of LSB of significand will be considerably greater than 1
by that point) -> lower 64 bits of value, raise "integer overflow" and
"inexact".
	* values with non-zero fractional part -> rounded according to
rounding mode, raise "inexact".
), but existing float64_to_int64() isn't it - very different behaviour
on overflows.  Incidentally, VAX double to s64 is buggered in that area -
it *does* try to use float64_to_int64() and, on top of getting INV instead
of IOV, gets the wrong result in case of overflow (MAX_LONG/MIN_LONG instead
of value in -2^63..2^63-1 comparable modulo 2^64 with exact value taken
as element of $\Bbb Z$).

And s64->s32 is just plain weird - not in the part that has IOV raised on
values outside of -2^31..2^31-1, but in the bit shuffling it's doing if
the test passes; alpha FPU stores s32 value in bits 63-62/58-29, with the
rest filled with zeroes.

In any case, it's not splitting float_overflow_flag; similar cases in
softfloat.c raise float_invalid_flag.  I don't know if it would make
sense to try and teach float64_to_int64() about this kind of return
value on overflow...

> (Note that any patch touching softfloat files needs to come with
> a statement that you're happy to license it under either the
> softfloat-2a or softfloat-2b licenses, because we're currently
> midway through the tedious process of trying to relicense it.)

Wouldn't be a problem, but I doubt that it would be particulary useful to touch
softfloat.c due to the reasons above, anyway.
Peter Maydell June 25, 2014, 5:41 p.m. UTC | #12
On 25 June 2014 15:26, Al Viro <viro@zeniv.linux.org.uk> wrote:
> Hmm...  On alpha it's generated only by the following: CVTTQ, CVTGQ,
> CVTQL.  I.e. conversions to integer formats that can be held in FPU
> registers (double -> s64, VAX double -> s64 and s64 -> s32).  Does
> softfloat even have anything similar?

Well, VAX doubles are a bit out of scope for an IEEE emulation
library :-)

>  As it is, it's all in alpha-specific code;

It does sound like that's the best place for it. In that case, you
don't want to add a flag to the softfloat float_flags -- they are
specifically for indicating softfloat's status/exceptions. Flags
handled purely in CPU-specific code should be stored in the
CPU specific state struct somewhere.

thanks
-- PMM
diff mbox

Patch

diff --git a/target-alpha/fpu_helper.c b/target-alpha/fpu_helper.c
index d2d776c..a24535b 100644
--- a/target-alpha/fpu_helper.c
+++ b/target-alpha/fpu_helper.c
@@ -45,10 +45,10 @@  uint32_t helper_fp_exc_get(CPUAlphaState *env)
 }
 
 static inline void inline_fp_exc_raise(CPUAlphaState *env, uintptr_t retaddr,
-                                       uint32_t exc, uint32_t regno)
+                                       uint32_t exc, uint32_t regno, uint32_t sw)
 {
     if (exc) {
-        uint32_t hw_exc = 0;
+        uint32_t hw_exc = sw;
 
         if (exc & float_flag_invalid) {
             hw_exc |= EXC_M_INV;
@@ -75,7 +75,7 @@  static inline void inline_fp_exc_raise(CPUAlphaState *env, uintptr_t retaddr,
    doesn't apply.  */
 void helper_fp_exc_raise(CPUAlphaState *env, uint32_t exc, uint32_t regno)
 {
-    inline_fp_exc_raise(env, GETPC(), exc, regno);
+    inline_fp_exc_raise(env, GETPC(), exc, regno, 0);
 }
 
 /* Raise exceptions for ieee fp insns with software completion.  */
@@ -84,7 +84,7 @@  void helper_fp_exc_raise_s(CPUAlphaState *env, uint32_t exc, uint32_t regno)
     if (exc) {
         env->fpcr_exc_status |= exc;
         exc &= ~env->fpcr_exc_mask;
-        inline_fp_exc_raise(env, GETPC(), exc, regno);
+        inline_fp_exc_raise(env, GETPC(), exc, regno, EXC_M_SWC);
     }
 }
 
diff --git a/target-alpha/helper.c b/target-alpha/helper.c
index 7c053a3..538c6b2 100644
--- a/target-alpha/helper.c
+++ b/target-alpha/helper.c
@@ -527,6 +527,7 @@  void QEMU_NORETURN dynamic_excp(CPUAlphaState *env, uintptr_t retaddr,
     env->error_code = error;
     if (retaddr) {
         cpu_restore_state(cs, retaddr);
+	env->pc += 4;
     }
     cpu_loop_exit(cs);
 }