
Allow libcalls for complex memcpy when optimizing for size.

Message ID 20191031234153.28246-1-jimw@sifive.com
State New
Series Allow libcalls for complex memcpy when optimizing for size.

Commit Message

Jim Wilson Oct. 31, 2019, 11:41 p.m. UTC
The RISC-V backend wants to use a libcall when optimizing for size if
more than 6 instructions are needed.  Emit_move_complex asks for no
libcalls.  This case requires 8 insns for rv64 and 16 insns for rv32,
so we get fallback code that emits a loop.  Commit_one_edge_insertion
doesn't allow code inserted for a phi node on an edge to end with a
branch, and so this triggers an assertion.  This problem goes away if
we allow libcalls when optimizing for size, which gives the code the
RISC-V backend wants, and avoids triggering the assert.
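
The decision described above can be sketched as a small standalone C
model.  This is not GCC source: the names model_block_op,
choose_block_op, expand_for_size and check_message_cases are
hypothetical, and the three-way outcome is a simplification of what
emit_block_move and the RISC-V backend actually do.  Only the limit of
6 instructions and the 8/16 insn counts come from the message above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for BLOCK_OP_NORMAL and BLOCK_OP_NO_LIBCALL.  */
enum model_block_op { MODEL_OP_NORMAL, MODEL_OP_NO_LIBCALL };

/* Simplified outcomes of expanding a block move.  */
enum model_expansion { EXPAND_INLINE, EXPAND_LIBCALL, EXPAND_LOOP };

/* Mirrors the patched condition: forbid the libcall only when
   optimizing for speed.  */
static enum model_block_op
choose_block_op (bool optimize_for_speed)
{
  return optimize_for_speed ? MODEL_OP_NO_LIBCALL : MODEL_OP_NORMAL;
}

/* Toy model of the size-optimizing policy: expand inline up to 6
   insns, otherwise use a libcall if one is allowed, otherwise fall
   back to a loop.  */
static enum model_expansion
expand_for_size (int insns_needed, enum model_block_op op)
{
  const int max_inline_insns = 6;  /* Limit quoted in the message.  */

  if (insns_needed <= max_inline_insns)
    return EXPAND_INLINE;
  return op == MODEL_OP_NORMAL ? EXPAND_LIBCALL : EXPAND_LOOP;
}

/* Walks through the rv64 (8 insns) and rv32 (16 insns) cases.  */
static void
check_message_cases (void)
{
  /* Before the patch the no-libcall request was unconditional, so
     the model falls back to a loop.  */
  assert (expand_for_size (8, MODEL_OP_NO_LIBCALL) == EXPAND_LOOP);
  assert (expand_for_size (16, MODEL_OP_NO_LIBCALL) == EXPAND_LOOP);

  /* With the patch, optimizing for size permits the libcall.  */
  assert (expand_for_size (8, choose_block_op (false)) == EXPAND_LIBCALL);
  assert (expand_for_size (16, choose_block_op (false)) == EXPAND_LIBCALL);
}
```

In this model the loop fallback is the outcome that ends with a branch,
which is what the edge insertion code rejects; the libcall outcome does
not have that shape.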

	gcc/
	PR middle-end/92263
	* expr.c (emit_move_complex): Only use BLOCK_OP_NO_LIBCALL when
	optimize_insn_for_speed_p is true.

	gcc/testsuite/
	PR middle-end/92263
	* gcc.dg/pr92263.c: New.
---
 gcc/expr.c                     |  6 ++++--
 gcc/testsuite/gcc.dg/pr92263.c | 28 ++++++++++++++++++++++++++++
 2 files changed, 32 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr92263.c

Comments

Jim Wilson Oct. 31, 2019, 11:44 p.m. UTC | #1
On Thu, Oct 31, 2019 at 4:41 PM Jim Wilson <jimw@sifive.com> wrote:
>         gcc/
>         PR middle-end/92263
>         * expr.c (emit_move_complex): Only use BLOCK_OP_NO_LIBCALL when
>         optimize_insn_for_speed_p is true.
>
>         gcc/testsuite/
>         PR middle-end/92263
>         * gcc.dg/pr92263.c: New.

Tested with rv32-newlib and rv64-linux cross compiler builds and make
checks.  There were no regressions.  The new testcase fails for both
rv32 and rv64 without the patch, and works for both with the patch.

Jim
Jeff Law Nov. 4, 2019, 11:57 p.m. UTC | #2
On 10/31/19 5:41 PM, Jim Wilson wrote:
> The RISC-V backend wants to use a libcall when optimizing for size if
> more than 6 instructions are needed.  Emit_move_complex asks for no
> libcalls.  This case requires 8 insns for rv64 and 16 insns for rv32,
> so we get fallback code that emits a loop.  Commit_one_edge_insertion
> doesn't allow code inserted for a phi node on an edge to end with a
> branch, and so this triggers an assertion.  This problem goes away if
> we allow libcalls when optimizing for size, which gives the code the
> RISC-V backend wants, and avoids triggering the assert.
> 
> 	gcc/
> 	PR middle-end/92263
> 	* expr.c (emit_move_complex): Only use BLOCK_OP_NO_LIBCALL when
> 	optimize_insn_for_speed_p is true.
> 
> 	gcc/testsuite/
> 	PR middle-end/92263
> 	* gcc.dg/pr92263.c: New.
OK.

jeff

Patch

diff --git a/gcc/expr.c b/gcc/expr.c
index 476c6865f20..0c2e03dd32d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -3571,11 +3571,13 @@  emit_move_complex (machine_mode mode, rtx x, rtx y)
       rtx_insn *ret;
 
       /* For memory to memory moves, optimal behavior can be had with the
-	 existing block move logic.  */
+	 existing block move logic.  But use normal expansion if optimizing
+	 for size.  */
       if (MEM_P (x) && MEM_P (y))
 	{
 	  emit_block_move (x, y, gen_int_mode (GET_MODE_SIZE (mode), Pmode),
-			   BLOCK_OP_NO_LIBCALL);
+			   (optimize_insn_for_speed_p()
+			    ? BLOCK_OP_NO_LIBCALL : BLOCK_OP_NORMAL));
 	  return get_last_insn ();
 	}
 
diff --git a/gcc/testsuite/gcc.dg/pr92263.c b/gcc/testsuite/gcc.dg/pr92263.c
new file mode 100644
index 00000000000..a79dfd1e351
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr92263.c
@@ -0,0 +1,28 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fno-tree-dce -fno-tree-forwprop -Os -ffloat-store" } */
+
+extern long double cabsl (_Complex long double);
+
+typedef struct {
+  int nsant, nvqd;
+  _Complex long double *vqd;
+} vsorc_t;
+vsorc_t vsorc;
+
+void foo(int next_job, int ain_num, int iped, long t) {
+  long double zpnorm;
+
+  while (!next_job)
+    if (ain_num)
+    {
+      if (iped == 1)
+        zpnorm = 0.0;
+      int indx = vsorc.nvqd-1;
+      vsorc.vqd[indx] = t*1.0fj;
+      if (cabsl(vsorc.vqd[indx]) < 1.e-20)
+        vsorc.vqd[indx] = 0.0fj;
+      zpnorm = t;
+      if (zpnorm > 0.0)
+        iped = vsorc.nsant;
+    }
+}