[committed,RISC-V] Improve floor, ceil & related operations for RISC-V

Message ID bc774cba-b55b-417f-b427-860bac76773d@ventanamicro.com
State New

Commit Message

Jeff Law April 30, 2024, 3:46 p.m. UTC
This is almost exclusively Jivan's work.  His original post:


> https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg336483.html




This patch is primarily meant to improve the code we generate for FP 
rounding such as ceil/floor.  It also addresses some unnecessary sign 
extensions in the same areas.

RISC-V's FP conversions have a bit of undesirable behavior that makes 
them unsuitable as-is for ceil/floor and other related functions. 
These deficiencies are addressed in the Zfa extension, but there's no 
reason not to pick up a nice improvement when we can.

Basically we can still use the basic FP conversions for floor/ceil and 
friends when we don't care about inexact exceptions by checking for the 
special cases first, then emitting the conversion when the special cases 
don't apply.  That's still much faster than calling into glibc.
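
For illustration only, the guarded-conversion idea might be sketched in C 
as below.  This is not the RTL the patch emits: floor_via_convert is a 
hypothetical name, the 2^52 limit is the double-precision coefficient 
mentioned in the riscv-v.cc comment, and the truncate-then-adjust step 
stands in for a single fcvt using the rdn rounding mode.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: floor() for double, built from a float-to-integer
   conversion that is only used when the special cases do not apply.  */
double
floor_via_convert (double x)
{
  /* 2^52: at or above this magnitude every double is already integral.
     The comparison is false for NaN and infinities, so they are
     returned unchanged as well.  */
  const double limit = 4503599627370496.0;

  if (!(fabs (x) < limit))
    return x;

  /* Safe here: |x| < 2^52 always fits in int64_t.  The C cast truncates
     toward zero, so step down by one for negative non-integers.  */
  int64_t i = (int64_t) x;
  double r = (double) i;
  if (r > x)
    r -= 1.0;

  /* Reapply the input's sign so that -0.0 stays -0.0.  */
  return copysign (r, x);
}
```

The final copysign plays the same role as the copysign at the end of the 
expander: the integer round trip loses the sign of a zero result.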

The redundant sign extensions are eliminated using the same trick Jivan 
added last year, just in a few more places ;-)
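
As a rough illustration of why the extension is redundant (a sketch with 
a hypothetical function name, not code from the patch): on rv64 the 
32-bit fcvt.w forms already leave a sign-extended result in the 64-bit 
register, so the widening step below costs nothing extra once the 
compiler knows that.

```c
#include <stdint.h>

/* Hypothetical example: convert to a 32-bit integer, then widen.
   On rv64 the fcvt.w.d result is already sign-extended in the
   destination register, so no separate sext.w is needed for the
   widening.  */
int64_t
convert_and_widen (double d)
{
  int32_t w = (int32_t) d;   /* truncating conversion (rtz) */
  return (int64_t) w;        /* sign extension to 64 bits */
}
```

The expanders in the patch express this by emitting the sign_extend form 
of the insn and marking the SImode lowpart as a promoted, signed subreg.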

This eliminates roughly 10% of the dynamic instruction count for 
imagick.  But more importantly it's about a 17% performance improvement 
for that workload within spec.

This has been bootstrapped as well as regression tested in a cross 
environment.  It has also built and run specint/specfp correctly.

Pushing to the trunk and the coordination branch momentarily.


Jeff
gcc/
	* config/riscv/iterators.md (fix_ops, fix_uns): New iterators.
	(RINT, rint_pattern, rint_rm): Remove unused iterators.
	* config/riscv/riscv-protos.h (get_fp_rounding_coefficient): Prototype.
	* config/riscv/riscv-v.cc (get_fp_rounding_coefficient): Give external
	linkage.
	* config/riscv/riscv.md (UNSPEC_LROUND): Remove.
	(fix_trunc<ANYF:mode><GPR:mode>2): Replace with ...
	(<fix_uns>_trunc<ANYF:mode>si2): New expander & associated insn.
	(<fix_uns>_trunc<ANYF:mode>si2_sext): New insn.
	(<fix_uns>_trunc<ANYF:mode>di2): Likewise.
	(l<rint_pattern><ANYF:mode><GPR:mode>2): Replace with ...
	(lrint<ANYF:mode>si2): New expander and associated insn.
	(lrint<ANYF:mode>si2_sext, lrint<ANYF:mode>di2): New insns.
	(<round_pattern><ANYF:mode>2): Replace with....
	(l<round_pattern><ANYF:mode>si2): New expander and associated insn.
	(l<round_pattern><ANYF:mode>si2_sext): New insn.
	(l<round_pattern><ANYF:mode>di2): Likewise.
	(<round_pattern><ANYF:mode>2): New expander.

gcc/testsuite/
	* gcc.target/riscv/fix.c: New test.
	* gcc.target/riscv/round.c: New test.
	* gcc.target/riscv/round_32.c: New test.
	* gcc.target/riscv/round_64.c: New test.

Comments

Patrick O'Neill May 1, 2024, 6:44 p.m. UTC | #1
Hi Jeff,


It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't 
pass when run with newlib.

It also introduced:

FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution 
test

on rv32gcv newlib/linux.


Precommit with a few targets: 
https://github.com/ewlu/gcc-precommit-ci/issues/1454

Postcommit with all targets: 
https://github.com/patrick-rivos/gcc-postcommit-ci/issues/853

Untrimmed postcommit output: 
https://github.com/patrick-rivos/gcc-postcommit-ci/actions/runs/8899731577/job/24456991643#step:8:132


Thanks,

Patrick


Jeff Law May 1, 2024, 7:07 p.m. UTC | #2
On 5/1/24 12:44 PM, Patrick O'Neill wrote:
> Hi Jeff,
> 
> 
> It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't 
> pass when run with newlib.
> 
> It also introduced:
> 
> FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution 
> test
> 
> on rv32gcv newlib/linux.
> 
> 
> Precommit with a few targets: 
> https://github.com/ewlu/gcc-precommit-ci/issues/1454
> 
> Postcommit with all targets: 
> https://github.com/patrick-rivos/gcc-postcommit-ci/issues/853
> 
> Untrimmed postcommit output: 
> https://github.com/patrick-rivos/gcc-postcommit-ci/actions/runs/8899731577/job/24456991643#step:8:132
Thanks.  I'll dive in.

jeff
Jeff Law May 1, 2024, 7:56 p.m. UTC | #3
On 5/1/24 12:44 PM, Patrick O'Neill wrote:
> Hi Jeff,
> 
> 
> It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't 
> pass when run with newlib.
Looks like a testsuite error as much as anything.  The test relies on 
the gimple optimizers to propagate the input parameters to their use 
points and transform (for example) a ceil into ceilf.  That's not 
happening with newlib.

Pondering the best approach to fix...

Jeff
Jeff Law May 1, 2024, 9:54 p.m. UTC | #4
On 5/1/24 12:44 PM, Patrick O'Neill wrote:
> Hi Jeff,
> 
> 
> It looks like this patch's gcc.target/riscv/round_64.c testcase doesn't 
> pass when run with newlib.
So I expected this would ultimately end up being a case where certain 
builtins aren't enabled when we're using a newlib based C library and 
that's exactly what happens here.

Essentially all the "function_c99_misc" routines are disabled for 
simplifications.  So we're presented with this in forwprop:

> __attribute__((noclone, noinline))
> float convert_float_to_float_round (float N)
> {
>   double _1;
>   double _2;
>   float _4;
> 
> ;;   basic block 2, loop depth 0, maybe hot
> ;;    prev block 0, next block 1, flags: (NEW, VISITED)
> ;;    pred:       ENTRY (FALLTHRU,EXECUTABLE)
>   _1 = (double) N_3(D);
>   _2 = round (_1);
>   _4 = (float) _2;
>   return _4;
> ;;    succ:       EXIT j.c:12:10
> 
> }

The test relies on the optimizer to realize that's just roundf and 
convert it to:

__attribute__((noclone, noinline))
float convert_float_to_float_round (float N)
{
   float _6;

;;   basic block 2, loop depth 0, maybe hot
;;    prev block 0, next block 1, flags: (NEW, VISITED)
;;    pred:       ENTRY (FALLTHRU,EXECUTABLE)
   _6 = __builtin_roundf (N_3(D));
   return _6;
;;    succ:       EXIT (EXECUTABLE) z.c:12:10

}


Failure to do that conversion will result in different code generation 
in the end and thus all those scan-asm failures.  Thankfully we have a 
preexisting way to deal with this in the testsuite.



> 
> It also introduced:
> 
> FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution 
> test
> 
> on rv32gcv newlib/linux.
I'll have to look at this next, but it could well end up being the same 
issue under the hood.

jeff
Jeff Law May 2, 2024, 2:49 a.m. UTC | #5
On 5/1/24 12:44 PM, Patrick O'Neill wrote:
> 
> It also introduced:
> 
> FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution 
> test
> 
> on rv32gcv newlib/linux.
I think I see what's going on here as well.  Need to ponder this one a 
bit longer, but I'm confident I'll be able to sort it out tomorrow.

jeff
Jeff Law May 2, 2024, 10:10 p.m. UTC | #6
On 5/1/24 12:44 PM, Patrick O'Neill wrote:
> 
> FAIL: gcc.target/riscv/rvv/autovec/unop/math-nearbyint-run-2.c execution 
> test
> 
> on rv32gcv newlib/linux.
So the issue here is the code tried to handle DFmode inputs for rv32 by 
converting to a SImode integer.  That's not a good idea on multiple 
levels.  But if we focus just on math-nearbyint-run-2.....

The coefficient test was based on the input mode, DF in this case.  That 
test said, yes it was safe to use the conversion instruction.  The 
conversion instruction was generating an SImode output, so the set of 
values it could correctly handle was limited by the range of an SImode 
object.  The mismatch caused things to go astray.

Pondering this space overnight, what's clear to me is that the 
coefficient test and the output of the conversion must be based on the 
same sized objects, i.e. DF/DI or SF/SI.  Doing something like DF/SI 
isn't right.

This means in the rv32 space, we can't use the conversion instructions 
for a DFmode object.  Pretty simple.
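
A hedged C sketch of the mismatch, using hypothetical helper names: a 
double can pass the DFmode coefficient test (magnitude below 2^52) and 
still be far outside what a 32-bit integer result can hold.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>

/* The DFmode guard: true when the value may still have a fractional
   part, i.e. its magnitude is below 2^52.  */
bool
passes_coefficient_test (double x)
{
  return fabs (x) < 4503599627370496.0;
}

/* True when x lies within the range of a signed 32-bit integer.  */
bool
fits_in_int32 (double x)
{
  return x >= (double) INT32_MIN && x <= (double) INT32_MAX;
}

/* True when x lies within the range of a signed 64-bit integer.  */
bool
fits_in_int64 (double x)
{
  return x >= -0x1p63 && x < 0x1p63;
}
```

For example, 3000000000.5 passes the coefficient test and fits a 64-bit 
integer, but not a 32-bit one, so a DF/SI round trip cannot reproduce it.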

While I was in the code the other issue I saw was HFmode handling.  We 
can handle HFmode when Zfa is enabled, but we should avoid it otherwise 
as we don't have the relevant patterns we'd need.

I'm testing a fix for both of these issues as well as the whitespace 
nits caught by the CI system.

jeff
Patch

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index a7694137685..75e119e407a 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -196,6 +196,13 @@  (define_code_iterator clz_ctz_pcnt [clz ctz popcount])
 
 (define_code_iterator bitmanip_rotate [rotate rotatert])
 
+;; These code iterators allow the signed and unsigned fix operations to use
+;; the same template.
+(define_code_iterator fix_ops [fix unsigned_fix])
+
+(define_code_attr fix_uns [(fix "fix") (unsigned_fix "fixuns")])
+
+
 ;; -------------------------------------------------------------------
 ;; Code Attributes
 ;; -------------------------------------------------------------------
@@ -312,11 +319,6 @@  (define_code_attr bitmanip_insn [(smin "min")
 ;; Int Iterators.
 ;; -------------------------------------------------------------------
 
-;; Iterator and attributes for floating-point rounding instructions.
-(define_int_iterator RINT [UNSPEC_LRINT UNSPEC_LROUND])
-(define_int_attr rint_pattern [(UNSPEC_LRINT "rint") (UNSPEC_LROUND "round")])
-(define_int_attr rint_rm [(UNSPEC_LRINT "dyn") (UNSPEC_LROUND "rmm")])
-
 ;; Iterator and attributes for quiet comparisons.
 (define_int_iterator QUIET_COMPARISON [UNSPEC_FLT_QUIET UNSPEC_FLE_QUIET])
 (define_int_attr quiet_pattern [(UNSPEC_FLT_QUIET "lt") (UNSPEC_FLE_QUIET "le")])
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5d46a29d8b7..e5aebf3fc3d 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -711,6 +711,7 @@  bool gather_scatter_valid_offset_p (machine_mode);
 HOST_WIDE_INT estimated_poly_value (poly_int64, unsigned int);
 bool whole_reg_to_reg_move_p (rtx *, machine_mode, int);
 bool splat_to_scalar_move_p (rtx *);
+rtx get_fp_rounding_coefficient (machine_mode);
 }
 
 /* We classify builtin types into two classes:
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 814c5febabe..c9e0feebca6 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -4494,7 +4494,7 @@  vls_mode_valid_p (machine_mode vls_mode)
       All double floating point will be unchanged for ceil if it is
       greater than and equal to 4503599627370496.
  */
-static rtx
+rtx
 get_fp_rounding_coefficient (machine_mode inner_mode)
 {
   REAL_VALUE_TYPE real;
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 455715ab2f7..8f518fdbe5a 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -64,7 +64,6 @@  (define_c_enum "unspec" [
   UNSPEC_ROUNDEVEN
   UNSPEC_NEARBYINT
   UNSPEC_LRINT
-  UNSPEC_LROUND
   UNSPEC_FMIN
   UNSPEC_FMAX
   UNSPEC_FMINM
@@ -1919,21 +1918,48 @@  (define_insn "*movhf_softfloat_boxing"
 ;;
 ;;  ....................
 
-(define_insn "fix_trunc<ANYF:mode><GPR:mode>2"
-  [(set (match_operand:GPR      0 "register_operand" "=r")
-	(fix:GPR
+(define_expand "<fix_uns>_trunc<ANYF:mode>si2"
+  [(set (match_operand:SI      0 "register_operand" "=r")
+	(fix_ops:SI
 	    (match_operand:ANYF 1 "register_operand" " f")))]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
-  "fcvt.<GPR:ifmt>.<ANYF:fmt> %0,%1,rtz"
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_<fix_uns>_trunc<ANYF:mode>si2_sext (t, operands[1]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
+(define_insn "*<fix_uns>_trunc<ANYF:mode>si2"
+  [(set (match_operand:SI      0 "register_operand" "=r")
+	(fix_ops:SI
+	    (match_operand:ANYF 1 "register_operand" " f")))]
+  "TARGET_HARD_FLOAT || TARGET_ZFINX"
+  "fcvt.w<u>.<ANYF:fmt> %0,%1,rtz"
+  [(set_attr "type" "fcvt_f2i")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_insn "<fix_uns>_trunc<ANYF:mode>si2_sext"
+  [(set (match_operand:DI      0 "register_operand" "=r")
+  (sign_extend:DI (fix_ops:SI
+	    (match_operand:ANYF 1 "register_operand" " f"))))]
+  "TARGET_64BIT && (TARGET_HARD_FLOAT || TARGET_ZFINX)"
+  "fcvt.w<u>.<ANYF:fmt> %0,%1,rtz"
   [(set_attr "type" "fcvt_f2i")
    (set_attr "mode" "<ANYF:MODE>")])
 
-(define_insn "fixuns_trunc<ANYF:mode><GPR:mode>2"
-  [(set (match_operand:GPR      0 "register_operand" "=r")
-	(unsigned_fix:GPR
+(define_insn "<fix_uns>_trunc<ANYF:mode>di2"
+  [(set (match_operand:DI      0 "register_operand" "=r")
+	(fix_ops:DI
 	    (match_operand:ANYF 1 "register_operand" " f")))]
-  "TARGET_HARD_FLOAT  || TARGET_ZFINX"
-  "fcvt.<GPR:ifmt>u.<ANYF:fmt> %0,%1,rtz"
+  "TARGET_64BIT && (TARGET_HARD_FLOAT || TARGET_ZFINX)"
+  "fcvt.l<u>.<ANYF:fmt> %0,%1,rtz"
   [(set_attr "type" "fcvt_f2i")
    (set_attr "mode" "<ANYF:MODE>")])
 
@@ -1955,17 +1981,170 @@  (define_insn "floatuns<GPR:mode><ANYF:mode>2"
   [(set_attr "type" "fcvt_i2f")
    (set_attr "mode" "<ANYF:MODE>")])
 
-(define_insn "l<rint_pattern><ANYF:mode><GPR:mode>2"
-  [(set (match_operand:GPR       0 "register_operand" "=r")
-	(unspec:GPR
+(define_expand "lrint<ANYF:mode>si2"
+  [(set (match_operand:SI       0 "register_operand" "=r")
+	(unspec:SI
 	    [(match_operand:ANYF 1 "register_operand" " f")]
-	    RINT))]
+	    UNSPEC_LRINT))]
   "TARGET_HARD_FLOAT || TARGET_ZFINX"
-  "fcvt.<GPR:ifmt>.<ANYF:fmt> %0,%1,<rint_rm>"
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_lrint<ANYF:mode>si2_sext (t, operands[1]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
+(define_insn "*lrint<ANYF:mode>si2"
+  [(set (match_operand:SI       0 "register_operand" "=r")
+	(unspec:SI
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	    UNSPEC_LRINT))]
+  "TARGET_HARD_FLOAT || TARGET_ZFINX"
+  "fcvt.w.<ANYF:fmt> %0,%1,dyn"
   [(set_attr "type" "fcvt_f2i")
    (set_attr "mode" "<ANYF:MODE>")])
 
-(define_insn "<round_pattern><ANYF:mode>2"
+(define_insn "lrint<ANYF:mode>si2_sext"
+  [(set (match_operand:DI       0 "register_operand" "=r")
+  (sign_extend:DI (unspec:SI
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	    UNSPEC_LRINT)))]
+  "TARGET_64BIT && (TARGET_HARD_FLOAT || TARGET_ZFINX)"
+  "fcvt.w.<ANYF:fmt> %0,%1,dyn"
+  [(set_attr "type" "fcvt_f2i")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_insn "lrint<ANYF:mode>di2"
+  [(set (match_operand:DI       0 "register_operand" "=r")
+	(unspec:DI
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+	    UNSPEC_LRINT))]
+  "TARGET_64BIT && (TARGET_HARD_FLOAT || TARGET_ZFINX)"
+  "fcvt.l.<ANYF:fmt> %0,%1,dyn"
+  [(set_attr "type" "fcvt_f2i")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_expand "l<round_pattern><ANYF:mode>si2"
+  [(set (match_operand:SI       0 "register_operand" "=r")
+	(unspec:SI
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+    ROUND))]
+  "TARGET_HARD_FLOAT || TARGET_ZFINX"
+{
+  if (TARGET_64BIT)
+    {
+      rtx t = gen_reg_rtx (DImode);
+      emit_insn (gen_l<round_pattern><ANYF:mode>si2_sext (t, operands[1]));
+      t = gen_lowpart (SImode, t);
+      SUBREG_PROMOTED_VAR_P (t) = 1;
+      SUBREG_PROMOTED_SET (t, SRP_SIGNED);
+      emit_move_insn (operands[0], t);
+      DONE;
+    }
+})
+
+(define_insn "*l<round_pattern><ANYF:mode>si2"
+  [(set (match_operand:SI       0 "register_operand" "=r")
+	(unspec:SI
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+    ROUND))]
+  "TARGET_HARD_FLOAT || TARGET_ZFINX"
+  "fcvt.w.<ANYF:fmt> %0,%1,<round_rm>"
+  [(set_attr "type" "fcvt_f2i")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_insn "l<round_pattern><ANYF:mode>si2_sext"
+  [(set (match_operand:DI       0 "register_operand" "=r")
+	 (sign_extend:DI (unspec:SI
+	                     [(match_operand:ANYF 1 "register_operand" " f")]
+                      ROUND)))]
+  "TARGET_64BIT && (TARGET_HARD_FLOAT || TARGET_ZFINX)"
+  "fcvt.w.<ANYF:fmt> %0,%1,<round_rm>"
+  [(set_attr "type" "fcvt_f2i")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_insn "l<round_pattern><ANYF:mode>di2"
+  [(set (match_operand:DI       0 "register_operand" "=r")
+	(unspec:DI
+	    [(match_operand:ANYF 1 "register_operand" " f")]
+    ROUND))]
+  "TARGET_64BIT && (TARGET_HARD_FLOAT || TARGET_ZFINX)"
+  "fcvt.l.<ANYF:fmt> %0,%1,<round_rm>"
+  [(set_attr "type" "fcvt_f2i")
+   (set_attr "mode" "<ANYF:MODE>")])
+
+(define_expand "<round_pattern><ANYF:mode>2"
+  [(set (match_operand:ANYF     0 "register_operand" "=f")
+        (unspec:ANYF
+            [(match_operand:ANYF 1 "register_operand" " f")]
+        ROUND))]
+  "TARGET_HARD_FLOAT && (TARGET_ZFA
+                         || flag_fp_int_builtin_inexact || !flag_trapping_math)"
+{
+  if (TARGET_ZFA)
+    emit_insn (gen_<round_pattern><ANYF:mode>_zfa2 (operands[0],
+                                                    operands[1]));
+  else
+    {
+      rtx reg;
+      rtx label = gen_label_rtx ();
+      rtx end_label = gen_label_rtx ();
+      rtx abs_reg = gen_reg_rtx (<ANYF:MODE>mode);
+      rtx coeff_reg = gen_reg_rtx (<ANYF:MODE>mode);
+      rtx tmp_reg = gen_reg_rtx (<ANYF:MODE>mode);
+
+      riscv_emit_move (tmp_reg, operands[1]);
+      riscv_emit_move (coeff_reg,
+                       riscv_vector::get_fp_rounding_coefficient (<ANYF:MODE>mode));
+      emit_insn (gen_abs<ANYF:mode>2 (abs_reg, operands[1]));
+
+      riscv_expand_conditional_branch (label, LT, abs_reg, coeff_reg);
+
+      emit_jump_insn (gen_jump (end_label));
+      emit_barrier ();
+
+      emit_label (label);
+      switch (<ANYF:MODE>mode)
+        {
+        case SFmode:
+          reg = gen_reg_rtx (SImode);
+          emit_insn (gen_l<round_pattern>sfsi2 (reg, operands[1]));
+          emit_insn (gen_floatsisf2 (abs_reg, reg));
+          break;
+        case DFmode:
+          if (TARGET_64BIT)
+            {
+              reg = gen_reg_rtx (DImode);
+              emit_insn (gen_l<round_pattern>dfdi2 (reg, operands[1]));
+              emit_insn (gen_floatdidf2 (abs_reg, reg));
+            }
+          else
+            {
+              reg = gen_reg_rtx (SImode);
+              emit_insn (gen_l<round_pattern>dfsi2 (reg, operands[1]));
+              emit_insn (gen_floatsidf2 (abs_reg, reg));
+            }
+          break;
+        default:
+          gcc_unreachable ();
+        }
+
+      emit_insn (gen_copysign<ANYF:mode>3 (tmp_reg, abs_reg, operands[1]));
+
+      emit_label (end_label);
+      riscv_emit_move (operands[0], tmp_reg);
+    }
+
+  DONE;
+})
+
+(define_insn "<round_pattern><ANYF:mode>_zfa2"
   [(set (match_operand:ANYF     0 "register_operand" "=f")
 	(unspec:ANYF
 	    [(match_operand:ANYF 1 "register_operand" " f")]
diff --git a/gcc/testsuite/gcc.target/riscv/fix.c b/gcc/testsuite/gcc.target/riscv/fix.c
new file mode 100644
index 00000000000..265a7da1fc5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/fix.c
@@ -0,0 +1,34 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* }  { "-O0" } } */
+
+int
+foo (double n)
+{
+  return n;
+}
+
+int
+foo_1 (float n)
+{
+  return n;
+}
+
+unsigned int
+foo_2 (double n)
+{
+  return n;
+}
+
+unsigned int
+foo_3 (float n)
+{
+  return n;
+}
+
+/* { dg-final { scan-assembler-times {\mfcvt.w.d} 1 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.w.s} 1 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.wu.d} 1 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.wu.s} 1 } } */
+/* { dg-final { scan-assembler-not "\\ssext.w\\s" } } */
+
diff --git a/gcc/testsuite/gcc.target/riscv/round.c b/gcc/testsuite/gcc.target/riscv/round.c
new file mode 100644
index 00000000000..decfc82a390
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/round.c
@@ -0,0 +1,144 @@ 
+#include <math.h>
+
+extern void abort (void);
+extern void exit (int);
+
+#define NEQ(a, b) (fabs((a) - (b)) > 0.000001)
+
+#define DECL_FUNC(TYPE1, TYPE2, ROUND)              \
+   __attribute__ ((noinline, noclone)) TYPE2        \
+   convert_##TYPE1##_to_##TYPE2##_##ROUND (TYPE1 N) \
+    {                                               \
+      return ROUND (N);                             \
+    }
+
+#define DECL_ALL_ROUNDS_FOR(ROUND_FUNC) \
+  DECL_FUNC(float, float, ROUND_FUNC)   \
+  DECL_FUNC(double, double, ROUND_FUNC) \
+  DECL_FUNC(double, int, ROUND_FUNC)    \
+  DECL_FUNC(double, long, ROUND_FUNC)   \
+  DECL_FUNC(float, int, ROUND_FUNC)     \
+  DECL_FUNC(float, long, ROUND_FUNC)
+
+
+DECL_ALL_ROUNDS_FOR(round)
+DECL_ALL_ROUNDS_FOR(ceil)
+DECL_ALL_ROUNDS_FOR(floor)
+DECL_ALL_ROUNDS_FOR(trunc)
+DECL_ALL_ROUNDS_FOR(nearbyint)
+
+#define TEST_ROUND(TYPE1, TYPE2, N, N_R, ROUND)      \
+  if (NEQ (convert_##TYPE1##_to_##TYPE2##_##ROUND (N), N_R)) \
+    abort ();
+
+
+int main () {
+
+  /* Round */
+  TEST_ROUND(double, double, -4.8, -5.0, round);
+  TEST_ROUND(double, double, -4.2, -4.0, round);
+  TEST_ROUND(double, double, 4.8, 5.0, round);
+  TEST_ROUND(double, double, 4.2, 4.0, round);
+
+  TEST_ROUND(double, int, -4.8, -5, round);
+  TEST_ROUND(double, int, -4.2, -4, round);
+  TEST_ROUND(double, int, 4.8, 5, round);
+  TEST_ROUND(double, int, 4.2, 4, round);
+
+  TEST_ROUND(double, long, -4.8, -5, round);
+  TEST_ROUND(double, long, -4.2, -4, round);
+  TEST_ROUND(double, long, 4.8, 5, round);
+  TEST_ROUND(double, long, 4.2, 4, round);
+
+  TEST_ROUND(float, long, -4.8, -5, round);
+  TEST_ROUND(float, long, -4.2, -4, round);
+  TEST_ROUND(float, long, 4.8, 5, round);
+  TEST_ROUND(float, long, 4.2, 4, round);
+
+  /* Ceil */
+  TEST_ROUND(double, double, -4.8, -4.0, ceil);
+  TEST_ROUND(double, double, -4.2, -4.0, ceil);
+  TEST_ROUND(double, double, 4.8, 5.0, ceil);
+  TEST_ROUND(double, double, 4.2, 5.0, ceil);
+
+  TEST_ROUND(double, int, -4.8, -4, ceil);
+  TEST_ROUND(double, int, -4.2, -4, ceil);
+  TEST_ROUND(double, int, 4.8, 5, ceil);
+  TEST_ROUND(double, int, 4.2, 5, ceil);
+
+  TEST_ROUND(double, long, -4.8, -4, ceil);
+  TEST_ROUND(double, long, -4.2, -4, ceil);
+  TEST_ROUND(double, long, 4.8, 5, ceil);
+  TEST_ROUND(double, long, 4.2, 5, ceil);
+
+  TEST_ROUND(float, long, -4.8, -4, ceil);
+  TEST_ROUND(float, long, -4.2, -4, ceil);
+  TEST_ROUND(float, long, 4.8, 5, ceil);
+  TEST_ROUND(float, long, 4.2, 5, ceil);
+
+  /* Floor */
+  TEST_ROUND(double, double, -4.8, -5.0, floor);
+  TEST_ROUND(double, double, -4.2, -5.0, floor);
+  TEST_ROUND(double, double, 4.8, 4.0, floor);
+  TEST_ROUND(double, double, 4.2, 4.0, floor);
+
+  TEST_ROUND(double, int, -4.8, -5, floor);
+  TEST_ROUND(double, int, -4.2, -5, floor);
+  TEST_ROUND(double, int, 4.8, 4, floor);
+  TEST_ROUND(double, int, 4.2, 4, floor);
+
+  TEST_ROUND(double, long, -4.8, -5, floor);
+  TEST_ROUND(double, long, -4.2, -5, floor);
+  TEST_ROUND(double, long, 4.8, 4, floor);
+  TEST_ROUND(double, long, 4.2, 4, floor);
+
+  TEST_ROUND(float, long, -4.8, -5, floor);
+  TEST_ROUND(float, long, -4.2, -5, floor);
+  TEST_ROUND(float, long, 4.8, 4, floor);
+  TEST_ROUND(float, long, 4.2, 4, floor);
+
+  /* Trunc */
+  TEST_ROUND(double, double, -4.8, -4.0, trunc);
+  TEST_ROUND(double, double, -4.2, -4.0, trunc);
+  TEST_ROUND(double, double, 4.8, 4.0, trunc);
+  TEST_ROUND(double, double, 4.2, 4.0, trunc);
+
+  TEST_ROUND(double, int, -4.8, -4, trunc);
+  TEST_ROUND(double, int, -4.2, -4, trunc);
+  TEST_ROUND(double, int, 4.8, 4, trunc);
+  TEST_ROUND(double, int, 4.2, 4, trunc);
+
+  TEST_ROUND(double, long, -4.8, -4, trunc);
+  TEST_ROUND(double, long, -4.2, -4, trunc);
+  TEST_ROUND(double, long, 4.8, 4, trunc);
+  TEST_ROUND(double, long, 4.2, 4, trunc);
+
+  TEST_ROUND(float, long, -4.8, -4, trunc);
+  TEST_ROUND(float, long, -4.2, -4, trunc);
+  TEST_ROUND(float, long, 4.8, 4, trunc);
+  TEST_ROUND(float, long, 4.2, 4, trunc);
+
+  /* Nearbyint */
+  TEST_ROUND(double, double, -4.8, -5.0, nearbyint);
+  TEST_ROUND(double, double, -4.2, -4.0, nearbyint);
+  TEST_ROUND(double, double, 4.8, 5.0, nearbyint);
+  TEST_ROUND(double, double, 4.2, 4.0, nearbyint);
+
+  TEST_ROUND(double, int, -4.8, -5, nearbyint);
+  TEST_ROUND(double, int, -4.2, -4, nearbyint);
+  TEST_ROUND(double, int, 4.8, 5, nearbyint);
+  TEST_ROUND(double, int, 4.2, 4, nearbyint);
+
+  TEST_ROUND(double, long, -4.8, -5, nearbyint);
+  TEST_ROUND(double, long, -4.2, -4, nearbyint);
+  TEST_ROUND(double, long, 4.8, 5, nearbyint);
+  TEST_ROUND(double, long, 4.2, 4, nearbyint);
+
+  TEST_ROUND(float, long, -4.8, -5, nearbyint);
+  TEST_ROUND(float, long, -4.2, -4, nearbyint);
+  TEST_ROUND(float, long, 4.8, 5, nearbyint);
+  TEST_ROUND(float, long, 4.2, 4, nearbyint);
+
+  exit(0);
+}
+
diff --git a/gcc/testsuite/gcc.target/riscv/round_32.c b/gcc/testsuite/gcc.target/riscv/round_32.c
new file mode 100644
index 00000000000..f9fea70ad55
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/round_32.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile { target { riscv32*-*-* } } } */
+/* { dg-options "-march=rv32gc -mabi=ilp32d -fno-math-errno -funsafe-math-optimizations -fno-inline" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+#include "round.c"
+
+/* { dg-final { scan-assembler-times {\mfcvt.w.s} 15 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.s.w} 5 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.d.w} 65 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.w.d} 15 } } */
+/* { dg-final { scan-assembler-times {,rup} 6 } } */
+/* { dg-final { scan-assembler-times {,rmm} 6 } } */
+/* { dg-final { scan-assembler-times {,rdn} 6 } } */
+/* { dg-final { scan-assembler-times {,rtz} 6 } } */
+/* { dg-final { scan-assembler-not {\mfcvt.l.d} } } */
+/* { dg-final { scan-assembler-not {\mfcvt.d.l} } } */
+/* { dg-final { scan-assembler-not "\\sceil\\s" } } */
+/* { dg-final { scan-assembler-not "\\sfloor\\s" } } */
+/* { dg-final { scan-assembler-not "\\sround\\s" } } */
+/* { dg-final { scan-assembler-not "\\snearbyint\\s" } } */
+/* { dg-final { scan-assembler-not "\\srint\\s" } } */
+/* { dg-final { scan-assembler-not "\\stail\\s" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/round_64.c b/gcc/testsuite/gcc.target/riscv/round_64.c
new file mode 100644
index 00000000000..e79690979a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/round_64.c
@@ -0,0 +1,23 @@ 
+/* { dg-do compile { target { riscv64*-*-* } } } */
+/* { dg-options "-march=rv64gc -mabi=lp64d -fno-math-errno -funsafe-math-optimizations -fno-inline" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+#include "round.c"
+
+/* { dg-final { scan-assembler-times {\mfcvt.w.s} 10 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.s.w} 5 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.l.d} 10 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.d.l} 45 } } */
+/* { dg-final { scan-assembler-times {\mfcvt.w.d} 5 } } */
+/* { dg-final { scan-assembler-times {,rup} 6 } } */
+/* { dg-final { scan-assembler-times {,rmm} 6 } } */
+/* { dg-final { scan-assembler-times {,rdn} 6 } } */
+/* { dg-final { scan-assembler-times {,rtz} 6 } } */
+/* { dg-final { scan-assembler-not "\\sceil\\s" } } */
+/* { dg-final { scan-assembler-not "\\sfloor\\s" } } */
+/* { dg-final { scan-assembler-not "\\sround\\s" } } */
+/* { dg-final { scan-assembler-not "\\snearbyint\\s" } } */
+/* { dg-final { scan-assembler-not "\\srint\\s" } } */
+/* { dg-final { scan-assembler-not "\\stail\\s" } } */
+/* { dg-final { scan-assembler-not "\\ssext.w\\s" } } */
+