Message ID | CAFULd4ZzARNW3Z-XEVbpG9KS6-PY0nzzfzFs3iOHJ9fADsWU0w@mail.gmail.com |
---|---|
State | New |
Headers | show |
Series | [alpha] : Fix sysdeps/alpha/remqu.S clobbering $f3 reg | expand |
On Fri, Jan 18, 2019 at 2:06 PM Uros Bizjak <ubizjak@gmail.com> wrote:Please also find the missing ChangeLog entry. Please find missing ChangeLog entry in this message. Also, please note I don't have write access to glibc repository. Uros. > Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via > $y_is_neg path. There was missing restore of $f3 before the return > from the function. > > The patch also reorders insns a bit, so it becomes similar as much as > possible to divqu.S. > > Without the patch, math/big testcase from Go-1.11 testsuite (that > includes lots of corner cases that exercise remqu) FAIL, with patched > function, the testcase PASSes without problems. 2019-18-01 Uroš Bizjak <ubizjak@gmail.com> * sysdeps/alpha/remqu.S (__remqu): Add missing restore of $f3 register on $y_is_neg path. Reorder instructions similar to divqu.S order. > Signed-off-by: Uros Bizjak <ubizjak@gmail.com> > > Uros.
On 1/18/19 5:06 AM, Uros Bizjak wrote: > Hello! > > Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via > $y_is_neg path. There was missing restore of $f3 before the return > from the function. > > The patch also reorders insns a bit, so it becomes similar as much as > possible to divqu.S. > > Without the patch, math/big testcase from Go-1.11 testsuite (that > includes lots of corner cases that exercise remqu) FAIL, with patched > function, the testcase PASSes without problems. > +++ b/sysdeps/alpha/remqu.S > @@ -59,20 +59,19 @@ __remqu: > subq Y, 1, AT > stt $f0, 0(sp) > and Y, AT, AT > + excb > + beq AT, $powerof2 > > stt $f1, 8(sp) > - excb Why are you moving the excb above the powerof2 branch? The path at powerof2 does not touch fpcr or issue fp insns. > @@ -94,12 +93,12 @@ __remqu: > mulq AT, Y, AT > ldt $f0, 0(sp) > ldt $f3, 48(sp) > - lda sp, FRAME(sp) > cfi_remember_state > cfi_restore ($f0) > cfi_restore ($f1) > cfi_restore ($f3) > cfi_def_cfa_offset (0) > + lda sp, FRAME(sp) This change is actively wrong wrt the unwind info. > @@ -246,12 +247,16 @@ $y_is_neg: > quotient must be either 0 or 1, so the remainder must be X > or X-Y, so just compute it directly. */ > cmpule Y, X, AT > + excb > + mt_fpcr $f3 > subq X, Y, RV > ldt $f0, 0(sp) > + ldt $f3, 48(sp) > cmoveq AT, X, RV > > lda sp, FRAME(sp) > cfi_restore ($f0) > + cfi_restore ($f3) > cfi_def_cfa_offset (0) > ret $31, (RA), 1 This appears to be the only change required to fix the bug. Can you walk me through why the other changes? r~
On Thu, Jan 24, 2019 at 9:23 AM Richard Henderson <rth@twiddle.net> wrote: > > On 1/18/19 5:06 AM, Uros Bizjak wrote: > > Hello! > > > > Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via > > $y_is_neg path. There was missing restore of $f3 before the return > > from the function. > > > > The patch also reorders insns a bit, so it becomes similar as much as > > possible to divqu.S. > > > > Without the patch, math/big testcase from Go-1.11 testsuite (that > > includes lots of corner cases that exercise remqu) FAIL, with patched > > function, the testcase PASSes without problems. > > > > +++ b/sysdeps/alpha/remqu.S > > @@ -59,20 +59,19 @@ __remqu: > > subq Y, 1, AT > > stt $f0, 0(sp) > > and Y, AT, AT > > + excb > > + beq AT, $powerof2 > > > > stt $f1, 8(sp) > > - excb > > Why are you moving the excb above the powerof2 branch? > The path at powerof2 does not touch fpcr or issue fp insns. This was meant to unify the flow with the __divqu assembly, which does the above before calling DIVBYZERO. The idea was that __divqu is used much more than __remqu, so the later should do the same as the former. > > @@ -94,12 +93,12 @@ __remqu: > > mulq AT, Y, AT > > ldt $f0, 0(sp) > > ldt $f3, 48(sp) > > - lda sp, FRAME(sp) > > cfi_remember_state > > cfi_restore ($f0) > > cfi_restore ($f1) > > cfi_restore ($f3) > > cfi_def_cfa_offset (0) > > + lda sp, FRAME(sp) > > This change is actively wrong wrt the unwind info. Again, this will match __divqu assembly. It looks that __divqu needs to be fixed then. > > @@ -246,12 +247,16 @@ $y_is_neg: > > quotient must be either 0 or 1, so the remainder must be X > > or X-Y, so just compute it directly. */ > > cmpule Y, X, AT > > + excb > > + mt_fpcr $f3 > > subq X, Y, RV > > ldt $f0, 0(sp) > > + ldt $f3, 48(sp) > > cmoveq AT, X, RV > > > > lda sp, FRAME(sp) > > cfi_restore ($f0) > > + cfi_restore ($f3) > > cfi_def_cfa_offset (0) > > ret $31, (RA), 1 > > This appears to be the only change required to fix the bug. That is true. This part is the problematic part and clobbers $f3. Should I resend the patch only with this part fixed? > Can you walk me through why the other changes? As said above, I was trying to make __remqu like __divqu, but it looks that __divqu should be fixed in some places. Thanks, Uros.
On 1/24/19 12:35 AM, Uros Bizjak wrote: > That is true. This part is the problematic part and clobbers $f3. > Should I resend the patch only with this part fixed? Yes please. We're in the final stages of release and any changes should be most minimal. We can fix the div/rem matchup and unwind errors for the next release. r~
diff --git a/sysdeps/alpha/remqu.S b/sysdeps/alpha/remqu.S index c2c5caf3c20..7210628f973 100644 --- a/sysdeps/alpha/remqu.S +++ b/sysdeps/alpha/remqu.S @@ -59,20 +59,19 @@ __remqu: subq Y, 1, AT stt $f0, 0(sp) and Y, AT, AT + excb + beq AT, $powerof2 stt $f1, 8(sp) - excb stt $f3, 48(sp) - beq AT, $powerof2 cfi_rel_offset ($f0, 0) cfi_rel_offset ($f1, 8) cfi_rel_offset ($f3, 48) + mf_fpcr $f3 _ITOFT2 X, $f0, 16, Y, $f1, 24 - mf_fpcr $f3 cvtqt $f0, $f0 cvtqt $f1, $f1 - blt X, $x_is_neg divt/c $f0, $f1, $f0 @@ -94,12 +93,12 @@ __remqu: mulq AT, Y, AT ldt $f0, 0(sp) ldt $f3, 48(sp) - lda sp, FRAME(sp) cfi_remember_state cfi_restore ($f0) cfi_restore ($f1) cfi_restore ($f3) cfi_def_cfa_offset (0) + lda sp, FRAME(sp) .align 4 subq X, AT, RV @@ -116,11 +115,13 @@ $x_is_neg: cfi_rel_offset ($f2, 24) _ITOFS AT, $f2, 16 + .align 4 addt $f0, $f2, $f0 + unop divt/c $f0, $f1, $f0 + unop /* Ok, we've now the divide issued. Continue with other checks. */ - .align 4 ldt $f1, 8(sp) unop ldt $f2, 24(sp) @@ -246,12 +247,16 @@ $y_is_neg: quotient must be either 0 or 1, so the remainder must be X or X-Y, so just compute it directly. */ cmpule Y, X, AT + excb + mt_fpcr $f3 subq X, Y, RV ldt $f0, 0(sp) + ldt $f3, 48(sp) cmoveq AT, X, RV lda sp, FRAME(sp) cfi_restore ($f0) + cfi_restore ($f3) cfi_def_cfa_offset (0) ret $31, (RA), 1
Hello! Attached patch fixes sysdeps/alpha/remqu.S clobbering $f3 register via $y_is_neg path. There was missing restore of $f3 before the return from the function. The patch also reorders insns a bit, so it becomes similar as much as possible to divqu.S. Without the patch, math/big testcase from Go-1.11 testsuite (that includes lots of corner cases that exercise remqu) FAIL, with patched function, the testcase PASSes without problems. Signed-off-by: Uros Bizjak <ubizjak@gmail.com> Uros.