[RFC] alpha qemu arithmetic exceptions

First of all, kudos - with current qemu tree qemu-alpha-system is
working pretty well - debian install and a *lot* of builds work just fine.
As in, getting from lenny to pretty complete squeeze toolchain, including gcj,
openjdk6 and a lot of crap needed to satisfy build-deps of those, plus all
priority:required and most of priority:important ones.  It's a _lot_ of
beating and the damn thing survives - the problems had been with debian
packages themselves (fstatat() bug in lenny libc, epically buggered build-deps
in gcc-defaults, etc.).  As it is, one core of 6-way 3.3GHz phenom II is quite
capable of running a home-grown autobuilder.  Feels like ~250-300MHz alpha
with a very fast disk...

	Remaining problems, AFAICS, are around floating point traps.
I've found one in glibc testsuite (math/tests-misc.c; overflow in
ADDS/SU ends up with wrong results from fetestexcept() - only FE_OVERFLOW is
set, while the sucker expects FE_INEXACT as well and actual hardware sets both)
and another in gcc one (with -funsafe-math-optimizations CVTST/S on denorms
triggers SIGFPE/FPE_FLTINV).

	The libc one is a bug in gen_fp_exc_raise_ignore() - the difference
between ADDS/SU and ADDS/SUI is only in trapping, not storing results in
FPCR.INE and friends.  Both will have the same effect on those and
    if (ignore) {
        tcg_gen_andi_i32(exc, exc, ~ignore);
    }
in gen_fp_exc_raise_ignore() leads to exc & ignore not reaching the
update of env->fpcr_exc_status in helper_fp_exc_raise_s().  See 4.7.8:
[quote]
	In addition, the FPCR gives a summary of each exception type for the 
	exception conditions detected by all IEEE floating-point operates thus
	far, as well as an overall summary bit that indicates whether any of
	these exception conditions has been detected. The individual exception
	bits match exactly in purpose and order the exception bits found in the
	exception summary quadword that is pushed for arithmetic traps. However,
	for each instruction, these exception bits are set independent of the
	trapping mode specified for the instruction. Therefore, even though
	trapping may be disabled for a certain exceptional condition, the fact
	that the exceptional condition was encountered by an instruction is
	still recorded in the FPCR.
[end quote]
And yes, on actual hardware both ADDS/SU and ADDS/SUI set FPCR.INE the same
way - verified by direct experiment.
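
To make the ordering problem concrete, here is a toy model in plain C. It is only a sketch, not qemu code: struct toy_fpu, the X_* bits and both function names are made up for illustration, and the bit values are not the real FPCR or softfloat encodings. The first function masks before recording, which is the behaviour described above; the second records everything and masks only the set that may trap, which is what 4.7.8 asks for.

```c
#include <stdint.h>

/* Hypothetical flag bits for this sketch; not the real qemu or Alpha values. */
enum {
    X_OVERFLOW = 1u << 0,
    X_INEXACT  = 1u << 1,
};

/* Toy FPU state: sticky status bits plus the exceptions that would trap. */
struct toy_fpu {
    uint32_t fpcr_status;   /* stands in for env->fpcr_exc_status */
    uint32_t trapped;       /* exceptions handed to the trap path */
};

/* Mask first: ignored bits are dropped before the status update,
   so they never become visible to fetestexcept(). */
void raise_masked_first(struct toy_fpu *f, uint32_t exc, uint32_t ignore)
{
    exc &= ~ignore;
    f->fpcr_status |= exc;
    f->trapped |= exc;
}

/* Record first: every detected exception lands in the status bits,
   and the ignore mask only limits what is allowed to trap. */
void raise_status_first(struct toy_fpu *f, uint32_t exc, uint32_t ignore)
{
    f->fpcr_status |= exc;
    f->trapped |= exc & ~ignore;
}
```

Feeding both functions exc = X_OVERFLOW | X_INEXACT with ignore = X_INEXACT gives the same trapped set, but only the second one leaves X_INEXACT in fpcr_status, which matches what the glibc test expects to see.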

While we are at it, I'm not quite sure what plain ADDS will leave in FPCR.INE
if it traps on overflow.  It's probably entirely academic, but it might be
worth checking if ELF ABI for Alpha has anything to say about the state seen
by fetestexcept() in SIGFPE handler...

Not sure what's the decent way to fix that - we could, of course, follow that
    tcg_gen_ld8u_i32(exc, cpu_env,
                     offsetof(CPUAlphaState, fp_status.float_exception_flags));
with generating code that would do |= into ->fpcr_exc_status, but I don't
know if we'd blow the footprint to hell by doing so.  Alternatively, we could
do that in helpers called before we start raising exceptions, and I really
wonder what happens with plain CVTQS - do we get FPCR.INE set there anyway?
If so, we really have to do it at least in that helper...  Comments?

	The gcc one comes from the fact that we never set EXC_M_SWC,
whether we come from helper_fp_exc_raise() or from helper_fp_exc_raise_s(),
so kernel-side do_entArith() skips
        if (summary & 1) {
                /* Software-completion summary bit is set, so try to
                   emulate the instruction.  If the processor supports
                   precise exceptions, we don't have to search.  */
                if (!amask(AMASK_PRECISE_TRAP))
                        si_code = alpha_fp_emul(regs->pc - 4);
                else
                        si_code = alpha_fp_emul_imprecise(regs, write_mask);
                if (si_code == 0)
                        return;
        }  
and buggers off to raise SIGFPE.  That's easy to fix, but... we also
get trap PC pointing to offending instruction, not the next one after it,
as 4.7.7.3 would require and as the kernel expects.

	I'm not sure what to do with trap PC - bumping env->pc by 4 after
cpu_restore_state() in dynamic_excp() seems to work, but I'm not at all sure
it's correct; I don't know qemu well enough to tell.
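
For illustration, the two kernel-side expectations above can be modelled in a few lines of standalone C. This is a sketch under assumptions, not kernel or qemu code: every toy_* name is hypothetical, and the only facts taken from the snippet above are the "summary & 1" test and the "regs->pc - 4" argument passed to alpha_fp_emul().

```c
#include <stdint.h>

/* Bit 0 of the exception summary is the software-completion bit,
   as tested by "summary & 1" in the kernel snippet. */
#define TOY_SUMMARY_SWC 1u

enum toy_action { TOY_EMULATE, TOY_SIGFPE };

/* What a do_entArith()-like handler does with a given summary:
   without the SWC bit it never tries to emulate. */
enum toy_action toy_arith_action(uint64_t summary)
{
    return (summary & TOY_SUMMARY_SWC) ? TOY_EMULATE : TOY_SIGFPE;
}

/* Address handed to the emulator on the precise-trap path. */
uint64_t toy_emul_target(uint64_t trap_pc)
{
    return trap_pc - 4;
}

/* Trap PC the kernel expects: the instruction after the faulting
   one (Alpha instructions are 4 bytes long). */
uint64_t toy_trap_pc(uint64_t faulting_insn)
{
    return faulting_insn + 4;
}
```

With the trap PC left on the offending instruction, toy_emul_target() lands one instruction too early, so setting the summary bit alone would make the kernel try to emulate the wrong instruction; that is why the two changes have to go together.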

	Anyway, delta that seems to fix the gcc one (gcc.dg/pr28796-2.c from
gcc-4.3 and later) follows.  Again, I'm not at all sure if handling of
env->pc in there is safe from qemu POV and I'd like to get comments on
that from somebody more familiar with qemu guts.
