
[v3] soft-fp: Add the lack of implementation for 128 bit self-contained.

Message ID 1532610973-22705-1-git-send-email-zong@andestech.com
State New

Commit Message

Zong Li July 26, 2018, 1:16 p.m. UTC
This patch adds only the macros that are missing when building the
RISC-V 32-bit port.

These macros are needed when the following three conditions hold at the
same time: soft-fp fma, ldbl-128 and a 32-bit _FP_W_TYPE_SIZE. The RISC-V
32-bit port is the first port that uses all three together.

The build reaches the missing macros as follows:
soft-fp/s_fmal.c uses FP_FMA_Q in __fmal.
sysdeps/riscv/sfp-machine.h defines _FP_W_TYPE_SIZE as 32, so
soft-fp/quad.h defines FP_FMA_Q as _FP_FMA (Q, 4, 8, R, X, Y, Z).

The relevant part of soft-fp/quad.h:
 #if _FP_W_TYPE_SIZE < 64
    # define FP_FMA_Q(R, X, Y, Z)    _FP_FMA (Q, 4, 8, R, X, Y, Z)
 #else
    # define FP_FMA_Q(R, X, Y, Z)    _FP_FMA (Q, 2, 4, R, X, Y, Z)
 #endif

Finally, _FP_FMA (fs, wc, dwc, R, X, Y, Z) uses the
_FP_FRAC_HIGHBIT_DW_##dwc macro, which expands to
_FP_FRAC_HIGHBIT_DW_8, but _FP_FRAC_HIGHBIT_DW_8 is not implemented
in soft-fp/op-8.h. The soft-fp/op-*.h headers provide only
_FP_FRAC_HIGHBIT_DW_1, _FP_FRAC_HIGHBIT_DW_2 and _FP_FRAC_HIGHBIT_DW_4.
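
The expansion chain described above can be illustrated with a small
standalone sketch. All DEMO_* names and demo_high_word are hypothetical
stand-ins, not the real soft-fp macros; the sketch only shows how pasting
the double-word count onto a macro name selects a per-size helper, and why
a missing helper for 8 words goes unnoticed until a port requests it.

```c
#include <assert.h>

/* Assumed word size for this sketch; a 32-bit port would define the
   real _FP_W_TYPE_SIZE as 32.  */
#define DEMO_W_TYPE_SIZE 32

/* Hypothetical per-size helpers: each returns the most significant
   word of a fraction stored least significant word first.  */
#define DEMO_FRAC_HIGH_DW_2(X) ((X)[1])
#define DEMO_FRAC_HIGH_DW_4(X) ((X)[3])
/* Without this definition, DEMO_FMA (8, x) would fail to build.  */
#define DEMO_FRAC_HIGH_DW_8(X) ((X)[7])

/* Token pasting picks the helper from the double-word count.  */
#define DEMO_FMA(dwc, X) DEMO_FRAC_HIGH_DW_##dwc (X)

#if DEMO_W_TYPE_SIZE < 64
# define DEMO_FMA_Q(X) DEMO_FMA (8, X)
#else
# define DEMO_FMA_Q(X) DEMO_FMA (4, X)
#endif

/* Returns the high word chosen by the size-dependent dispatch.  */
unsigned int
demo_high_word (const unsigned int *x)
{
  assert (x != 0);
  return DEMO_FMA_Q (x);
}
```

With DEMO_W_TYPE_SIZE set to 32, demo_high_word reads word 7 of an
eight-word array, mirroring the Q, 4, 8 case above.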

	* soft-fp/op-8.h: Add macros.
---
 ChangeLog      |   4 +++
 soft-fp/op-8.h | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 111 insertions(+)

Comments

Joseph Myers July 26, 2018, 1:41 p.m. UTC | #1
When submitting a patch, please always include a description of the 
testing done on that version of the patch.  For example, if you tested 
your 32-bit RISC-V port with this version of the patch and got clean math/ 
test results, then say so.
Zong Li July 26, 2018, 2:04 p.m. UTC | #2
Joseph Myers <joseph@codesourcery.com> wrote on Thu, Jul 26, 2018 at 9:41 PM:
>
> When submitting a patch, please always include a description of the
> testing done on that version of the patch.  For example, if you tested
> your 32-bit RISC-V port with this version of the patch and got clean math/
> test results, then say so.
>
OK, I got it. Shall I submit the next version of this patch?
Thanks.
Joseph Myers July 26, 2018, 3:07 p.m. UTC | #3
On Thu, 26 Jul 2018, Zong Li wrote:

> Joseph Myers <joseph@codesourcery.com> wrote on Thu, Jul 26, 2018 at 9:41 PM:
> >
> > When submitting a patch, please always include a description of the
> > testing done on that version of the patch.  For example, if you tested
> > your 32-bit RISC-V port with this version of the patch and got clean math/
> > test results, then say so.
> >
> OK, I got it. Shall I submit next version of this patch?

*Once you have clean math test results for 32-bit RISC-V* (or if you 
arrange for another configuration to use the new code for fmal / fmaf128 
for testing purposes and get clean results there), yes, resubmit with an 
updated proposed commit message that reflects those results.  Until you 
have such clean results, we can't have much confidence in the correctness 
of the patch and so I don't think there's much point resubmitting unless 
someone finds further problems with it based on reading the code.
Richard Henderson July 26, 2018, 4:59 p.m. UTC | #4
On 07/26/2018 06:16 AM, Zong Li wrote:
> +#define _FP_FRAC_CLZ_8(R, X)                    \
> +  do                                            \
> +    {                                           \
> +      if (X##_f[7])                             \
> +        __FP_CLZ ((R), X##_f[7]);               \
> +      else if (X##_f[6])                        \
> +        {                                       \
> +          __FP_CLZ ((R), X##_f[6]);             \
> +          (R) += _FP_W_TYPE_SIZE;               \
> +        }                                       \
> +      else if (X##_f[5])                        \
...

Perhaps better as

#define _FP_FRAC_CLZ_8(R, X)                   \
  do                                           \
    {                                          \
      int fs8_i;                               \
      for (fs8_i = 7; fs8_i > 0; fs8_i--)      \
        if (X##_f[fs8_i])                      \
          break;                               \
      __FP_CLZ ((R), X##_f[fs8_i]);            \
      (R) += _FP_W_TYPE_SIZE * (7 - fs8_i);    \
    }                                          \
  while (0)


r~
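
To compare the two forms side by side, here is a minimal standalone
sketch written as plain functions rather than macros. It assumes 32-bit
words, and the names demo_clz32, demo_clz8_chain and demo_clz8_loop are
hypothetical; the real macros use __FP_CLZ and _FP_W_TYPE_SIZE. The sketch
also assumes at least one word is nonzero.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_WORD_BITS 32

/* Portable count of leading zero bits in a nonzero 32-bit word.  */
static int
demo_clz32 (uint32_t w)
{
  int n = 0;
  assert (w != 0);
  while ((w & 0x80000000u) == 0)
    {
      w <<= 1;
      n++;
    }
  return n;
}

/* If-chain form: test each word from most to least significant.  */
static int
demo_clz8_chain (const uint32_t f[8])
{
  if (f[7]) return demo_clz32 (f[7]);
  if (f[6]) return demo_clz32 (f[6]) + DEMO_WORD_BITS;
  if (f[5]) return demo_clz32 (f[5]) + DEMO_WORD_BITS * 2;
  if (f[4]) return demo_clz32 (f[4]) + DEMO_WORD_BITS * 3;
  if (f[3]) return demo_clz32 (f[3]) + DEMO_WORD_BITS * 4;
  if (f[2]) return demo_clz32 (f[2]) + DEMO_WORD_BITS * 5;
  if (f[1]) return demo_clz32 (f[1]) + DEMO_WORD_BITS * 6;
  return demo_clz32 (f[0]) + DEMO_WORD_BITS * 7;
}

/* Loop form: find the highest nonzero word, then count once.  */
static int
demo_clz8_loop (const uint32_t f[8])
{
  int i;
  for (i = 7; i > 0; i--)
    if (f[i])
      break;
  return demo_clz32 (f[i]) + DEMO_WORD_BITS * (7 - i);
}
```

Both functions should return the same count for any input with a nonzero
word; which one generates better code depends on the compiler and target.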
Zong Li July 27, 2018, 1:02 a.m. UTC | #5
Richard Henderson <rth@twiddle.net> wrote on Fri, Jul 27, 2018 at 12:59 AM:
>
> On 07/26/2018 06:16 AM, Zong Li wrote:
> > +#define _FP_FRAC_CLZ_8(R, X)                    \
> > +  do                                            \
> > +    {                                           \
> > +      if (X##_f[7])                             \
> > +        __FP_CLZ ((R), X##_f[7]);               \
> > +      else if (X##_f[6])                        \
> > +        {                                       \
> > +          __FP_CLZ ((R), X##_f[6]);             \
> > +          (R) += _FP_W_TYPE_SIZE;               \
> > +        }                                       \
> > +      else if (X##_f[5])                        \
> ...
>
> Perhaps better as
>
> #define _FP_FRAC_CLZ_8(R, X)                   \
>   do                                           \
>     {                                          \
>       int fs8_i;                               \
>       for (fs8_i = 7; fs8_i > 0; fs8_i--)      \
>         if (X##_f[fs8_i])                      \
>           break;                               \
>       __FP_CLZ ((R), X##_f[fs8_i]);            \
>       (R) += _FP_W_TYPE_SIZE * (7 - fs8_i);    \
>     }                                          \
>   while (0)
>
>
Yes, I used a for loop at one point before I submitted this version of
the patch. But I am a little worried about the performance of this macro:
unlike the ADD and SUB macros, which were changed to use loops, looping
here adds more branches and instructions.
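
For reference, the word-by-word carry and borrow handling used by looped
ADD and SUB macros of this kind can be sketched as standalone functions.
This assumes 32-bit words stored least significant first; demo_add8 and
demo_sub8 are hypothetical names, not part of soft-fp. One deliberate
difference is an assumption of the sketch: each input word is saved
before the result is stored, so the functions also work when the result
array is the same as the first operand.

```c
#include <assert.h>
#include <stdint.h>

/* r = x + y over eight 32-bit words, propagating the carry upward.  */
static void
demo_add8 (uint32_t r[8], const uint32_t x[8], const uint32_t y[8])
{
  uint32_t carry = 0;
  for (int i = 0; i < 8; ++i)
    {
      uint32_t xi = x[i];	/* Saved in case r aliases x.  */
      r[i] = xi + y[i] + carry;
      /* With a carry in, a wrapped sum can equal xi; without one,
	 a wrapped sum is strictly smaller than xi.  */
      carry = carry ? r[i] <= xi : r[i] < xi;
    }
}

/* r = x - y over eight 32-bit words, propagating the borrow upward.  */
static void
demo_sub8 (uint32_t r[8], const uint32_t x[8], const uint32_t y[8])
{
  uint32_t borrow = 0;
  for (int i = 0; i < 8; ++i)
    {
      uint32_t xi = x[i];	/* Saved in case r aliases x.  */
      r[i] = xi - y[i] - borrow;
      /* Mirror image of the carry test in demo_add8.  */
      borrow = borrow ? r[i] >= xi : r[i] > xi;
    }
}
```

Adding 1 to a value whose two low words are all ones clears both words
and carries into the third; subtracting 1 again restores the original.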

Patch

diff --git a/ChangeLog b/ChangeLog
index 8b509d4..a8daea6 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@ 
+2018-07-25  Zong Li  <zong@andestech.com>
+
+	* soft-fp/op-8.h: Add macros.
+
 2018-07-25  Carlos O'Donell <carlos@redhat.com>
 
 	[BZ #23393]
diff --git a/soft-fp/op-8.h b/soft-fp/op-8.h
index ffed258..45b019e 100644
--- a/soft-fp/op-8.h
+++ b/soft-fp/op-8.h
@@ -35,6 +35,7 @@ 
 /* We need just a few things from here for op-4, if we ever need some
    other macros, they can be added.  */
 #define _FP_FRAC_DECL_8(X)	_FP_W_TYPE X##_f[8]
+#define _FP_FRAC_SET_8(X, I)    __FP_FRAC_SET_8 (X, I)
 #define _FP_FRAC_HIGH_8(X)	(X##_f[7])
 #define _FP_FRAC_LOW_8(X)	(X##_f[0])
 #define _FP_FRAC_WORD_8(X, w)	(X##_f[w])
@@ -147,4 +148,110 @@ 
     }									\
   while (0)
 
+#define _FP_FRAC_ADD_8(R, X, Y)                                         \
+  do                                                                    \
+    {                                                                   \
+      _FP_W_TYPE fa8_c = 0;                                             \
+      for (int fa8_i = 0; fa8_i < 8; ++fa8_i)                           \
+        {                                                               \
+          R##_f[fa8_i] = X##_f[fa8_i] + Y##_f[fa8_i] + fa8_c;           \
+          fa8_c = (fa8_c                                                \
+                   ? R##_f[fa8_i] <= X##_f[fa8_i]                       \
+                   : R##_f[fa8_i] < X##_f[fa8_i]);                      \
+        }                                                               \
+    }                                                                   \
+  while (0)
+
+#define _FP_FRAC_SUB_8(R, X, Y)                                         \
+  do                                                                    \
+    {                                                                   \
+      _FP_W_TYPE fs8_c = 0;                                             \
+      for (int fs8_i = 0; fs8_i < 8; ++fs8_i)                           \
+        {                                                               \
+          R##_f[fs8_i] = X##_f[fs8_i] - Y##_f[fs8_i] - fs8_c;           \
+          fs8_c = (fs8_c                                                \
+                   ? R##_f[fs8_i] >= X##_f[fs8_i]                       \
+                   : R##_f[fs8_i] > X##_f[fs8_i]);                      \
+        }                                                               \
+    }                                                                   \
+  while (0)
+
+#define _FP_FRAC_CLZ_8(R, X)                    \
+  do                                            \
+    {                                           \
+      if (X##_f[7])                             \
+        __FP_CLZ ((R), X##_f[7]);               \
+      else if (X##_f[6])                        \
+        {                                       \
+          __FP_CLZ ((R), X##_f[6]);             \
+          (R) += _FP_W_TYPE_SIZE;               \
+        }                                       \
+      else if (X##_f[5])                        \
+        {                                       \
+          __FP_CLZ ((R), X##_f[5]);             \
+          (R) += _FP_W_TYPE_SIZE * 2;           \
+        }                                       \
+      else if (X##_f[4])                        \
+        {                                       \
+          __FP_CLZ ((R), X##_f[4]);             \
+          (R) += _FP_W_TYPE_SIZE * 3;           \
+        }                                       \
+      else if (X##_f[3])                        \
+        {                                       \
+          __FP_CLZ ((R), X##_f[3]);             \
+          (R) += _FP_W_TYPE_SIZE * 4;           \
+        }                                       \
+      else if (X##_f[2])                        \
+        {                                       \
+          __FP_CLZ ((R), X##_f[2]);             \
+          (R) += _FP_W_TYPE_SIZE * 5;           \
+        }                                       \
+      else if (X##_f[1])                        \
+        {                                       \
+          __FP_CLZ ((R), X##_f[1]);             \
+          (R) += _FP_W_TYPE_SIZE * 6;           \
+        }                                       \
+      else                                      \
+        {                                       \
+          __FP_CLZ ((R), X##_f[0]);             \
+          (R) += _FP_W_TYPE_SIZE * 7;           \
+        }                                       \
+    }                                           \
+  while (0)
+
+#define _FP_MINFRAC_8   0, 0, 0, 0, 0, 0, 0, 1
+
+#define _FP_FRAC_NEGP_8(X)      ((_FP_WS_TYPE) X##_f[7] < 0)
+#define _FP_FRAC_ZEROP_8(X)                                             \
+  ((X##_f[0] | X##_f[1] | X##_f[2] | X##_f[3]                           \
+    | X##_f[4] | X##_f[5] | X##_f[6] | X##_f[7]) == 0)
+#define _FP_FRAC_HIGHBIT_DW_8(fs, X)                                    \
+  (_FP_FRAC_HIGH_DW_##fs (X) & _FP_HIGHBIT_DW_##fs)
+
+
+#define _FP_FRAC_COPY_4_8(D, S)                           \
+  do                                                      \
+    {                                                     \
+      D##_f[0] = S##_f[0];                                \
+      D##_f[1] = S##_f[1];                                \
+      D##_f[2] = S##_f[2];                                \
+      D##_f[3] = S##_f[3];                                \
+    }                                                     \
+  while (0)
+
+#define _FP_FRAC_COPY_8_4(D, S)                           \
+  do                                                      \
+    {                                                     \
+      D##_f[0] = S##_f[0];                                \
+      D##_f[1] = S##_f[1];                                \
+      D##_f[2] = S##_f[2];                                \
+      D##_f[3] = S##_f[3];                                \
+      D##_f[4] = D##_f[5] = D##_f[6] = D##_f[7] = 0;      \
+    }                                                     \
+  while (0)
+
+#define __FP_FRAC_SET_8(X, I7, I6, I5, I4, I3, I2, I1, I0)             \
+  (X##_f[7] = I7, X##_f[6] = I6, X##_f[5] = I5, X##_f[4] = I4,         \
+   X##_f[3] = I3, X##_f[2] = I2, X##_f[1] = I1, X##_f[0] = I0)
+
 #endif /* !SOFT_FP_OP_8_H */