{"id":814015,"url":"http://patchwork.ozlabs.org/api/patches/814015/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/patch/CAELXzTMOO-E0KEvGKTTWhqTUF2gMM1322Y-2UyV39J2YqBnAuQ@mail.gmail.com/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null,"list_archive_url":"","list_archive_url_format":"","commit_url_format":""},"msgid":"<CAELXzTMOO-E0KEvGKTTWhqTUF2gMM1322Y-2UyV39J2YqBnAuQ@mail.gmail.com>","list_archive_url":null,"date":"2017-09-15T01:30:15","name":"[RFC,PACH,3/5] Prevent tree unroller from completely unrolling inner loops if that results in excessive strided-loads in outer loop","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"9e426fab9abfc872a841abfac4d05d661e784392","submitter":{"id":25768,"url":"http://patchwork.ozlabs.org/api/people/25768/?format=json","name":"Kugan Vivekanandarajah","email":"kugan.vivekanandarajah@linaro.org"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/gcc/patch/CAELXzTMOO-E0KEvGKTTWhqTUF2gMM1322Y-2UyV39J2YqBnAuQ@mail.gmail.com/mbox/","series":[{"id":3186,"url":"http://patchwork.ozlabs.org/api/series/3186/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=3186","date":"2017-09-15T01:24:36","name":"Loop unrolling and memory load streams","version":1,"mbox":"http://patchwork.ozlabs.org/series/3186/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/814015/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/814015/checks/","tags":{},"related":[],"headers":{"Return-Path":"<gcc-patches-return-462191-incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":"incoming@patchwork.ozlabs.org","Delivered-To":["patchwork-incoming@bilbo.ozlabs.org","mailing list gcc-patches@gcc.gnu.org"],"Authentication-Results":["ozlabs.org;\n\tspf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org\n\t(client-ip=209.132.180.131; helo=sourceware.org;\n\tenvelope-from=gcc-patches-return-462191-incoming=patchwork.ozlabs.org@gcc.gnu.org;\n\treceiver=<UNKNOWN>)","ozlabs.org; dkim=pass (1024-bit key;\n\tunprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org\n\theader.b=\"V80peqDB\"; dkim-atps=neutral","sourceware.org; auth=none"],"Received":["from sourceware.org (server1.sourceware.org [209.132.180.131])\n\t(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256\n\tbits)) (No client certificate requested)\n\tby ozlabs.org (Postfix) with ESMTPS id 3xtdBs19MYz9t2V\n\tfor <incoming@patchwork.ozlabs.org>;\n\tFri, 15 Sep 2017 11:30:43 +1000 (AEST)","(qmail 111920 invoked by alias); 15 Sep 2017 01:30:36 -0000","(qmail 111882 invoked by uid 89); 15 Sep 2017 01:30:28 -0000","from mail-qt0-f175.google.com (HELO mail-qt0-f175.google.com)\n\t(209.85.216.175) by sourceware.org\n\t(qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;\n\tFri, 15 Sep 2017 01:30:25 +0000","by mail-qt0-f175.google.com with SMTP id l25so975881qtf.13 for\n\t<gcc-patches@gcc.gnu.org>; Thu, 14 Sep 2017 18:30:17 -0700 (PDT)","by 10.237.37.211 with HTTP; Thu, 14 Sep 2017 18:30:15 -0700 (PDT)"],"DomainKey-Signature":"a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender\n\t:mime-version:from:date:message-id:subject:to:content-type; q=\n\tdns; s=default; b=QI5+QwVpqGQs5+05ZFh0ModfGgjSSoemwbRc7IW26E+tyX\n\tVc9pUK+r8kw+bRPD9Pe36jbsHBKML4J5zM2I90HeXyMTNtgrVMECiofbugLAGuqV\n\tquEBv9ugd7vOvq0qzlpsHgxZeCXyUTjqnfgutrEZHg0uQk01HsiBqQyz/lL5c=","DKIM-Signature":"v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id\n\t:list-unsubscribe:list-archive:list-post:list-help:sender\n\t:mime-version:from:date:message-id:subject:to:content-type; s=\n\tdefault; bh=3lnh/Z4V+SKia1Eh/ZpkA4kPzhI=; b=V80peqDB+IOYixsSiuMJ\n\tsmz8eipatZH8KG5XdQJ8UYf+eSbxhVXPQKnbSyQKK01hIhXfF9ZTEfYMfsACXFPb\n\tWYm02xpWTal0m5p57JjEgkOFO5kmS21XRwbxv/onzzT++5Bi6vP9OHp1n/3g1hv6\n\tLx8wLOGXXSWOK9RuSLO3uaQ=","Mailing-List":"contact gcc-patches-help@gcc.gnu.org; run by ezmlm","Precedence":"bulk","List-Id":"<gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>","List-Archive":"<http://gcc.gnu.org/ml/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-help@gcc.gnu.org>","Sender":"gcc-patches-owner@gcc.gnu.org","X-Virus-Found":"No","X-Spam-SWARE-Status":"No, score=-24.2 required=5.0 tests=AWL, BAYES_00,\n\tGIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3,\n\tRCVD_IN_DNSWL_NONE, RCVD_IN_SORBS_SPAM,\n\tSPF_PASS autolearn=ham version=3.3.2 spammy=","X-HELO":"mail-qt0-f175.google.com","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net;\n\ts=20161025;\n\th=x-gm-message-state:mime-version:from:date:message-id:subject:to;\n\tbh=lH0+0BUjguHxQYt8KNrvUrkghtMFvqo6sYYuiD9xmKE=;\n\tb=BYxJrAt7TqAuptuN9ZElAWijzLcQVxwazZOW5vd+0aWKXtD/sLPFdzFoC7v98ZUfz0\n\tlKbo4tJUy0YxcDnbyorLzcyRPhpg8VTQB8k7Mf/hH6F3v/FZBIYGn41IZfmfe4wZsBU9\n\teMMZPeHWF9+JXaLZd2ng/fOIE9vnOQV4K1l2KkFc0nRVNPx/RgbCkpgYqnjAEvzyJGAW\n\tsuQUjQWrYUTnZcwruzaJpa1zU28WM+yGL+Yao5JF59SBr2zqVgPlYXEkKWpkUdWQYB+Y\n\toidHa6+ypsQyQzJ/zp8oGCS7qno3jwXzZ3vp+YkT2LGdG2yqV6isGzdMtwsYgpv4IB6+\n\t12Mw==","X-Gm-Message-State":"AHPjjUiWz3uFtA9xV1cnK/iWdHUt4RUa4pHTJX7heGZ3lEl2HPDdTqVI\tjDlr7mUyCmRe4otS1UeTKuSg8AwcLCJ0yBzvTPeZDIuCrAI=","X-Google-Smtp-Source":"AOwi7QCyjYb9NjCsA2a1X8P0UQRKDXtXDr+ja+wAjAxOYvSmhhzU99Zx0L1AvguE3Oaa1mHspscklxgjPpBC72GhsB4=","X-Received":"by 10.237.35.35 with SMTP id h32mr32149413qtc.47.1505439016050;\n\tThu, 14 Sep 2017 18:30:16 -0700 (PDT)","MIME-Version":"1.0","From":"Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>","Date":"Fri, 15 Sep 2017 11:30:15 +1000","Message-ID":"<CAELXzTMOO-E0KEvGKTTWhqTUF2gMM1322Y-2UyV39J2YqBnAuQ@mail.gmail.com>","Subject":"[RFC][PACH 3/5] Prevent tree unroller from completely unrolling\n\tinner loops if that results in excessive strided-loads in outer loop","To":"\"gcc-patches@gcc.gnu.org\" <gcc-patches@gcc.gnu.org>","Content-Type":"multipart/mixed; boundary=\"001a1146fe04c891a505593055cf\"","X-IsSubscribed":"yes"},"content":"This patch prevent tree unroller from completely unrolling inner loops if that\nresults in excessive strided-loads in outer loop.\n\nThanks,\nKugan\n\ngcc/ChangeLog:\n\n2017-09-12  Kugan Vivekanandarajah  <kuganv@linaro.org>\n\n    * config/aarch64/aarch64.c (count_mem_load_streams): New.\n    (aarch64_ok_to_unroll): New.\n    * doc/tm.texi (ok_to_unroll): Define new target hook.\n    * doc/tm.texi.in (ok_to_unroll): Likewise.\n    * target.def (ok_to_unroll): Likewise.\n    * tree-ssa-loop-ivcanon.c (try_unroll_loop_completely): Use\n      ok_to_unroll while unrolling.","diff":"From 5de245bbf6ba1768e8206a61feb0f42c106a1d94 Mon Sep 17 00:00:00 2001\nFrom: Kugan Vivekanandarajah <kugan.vivekanandarajah@linaro.org>\nDate: Fri, 18 Aug 2017 16:41:13 +1000\nSubject: [PATCH 3/5] tree unroller limit strided loads\n\n---\n gcc/config/aarch64/aarch64.c | 70 ++++++++++++++++++++++++++++++++++++++++++++\n gcc/doc/tm.texi              |  4 +++\n gcc/doc/tm.texi.in           |  2 ++\n gcc/target.def               |  8 +++++\n gcc/tree-ssa-loop-ivcanon.c  |  8 +++++\n 5 files changed, 92 insertions(+)\n\ndiff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c\nindex 7d1ee70..e88bb6c 100644\n--- a/gcc/config/aarch64/aarch64.c\n+++ b/gcc/config/aarch64/aarch64.c\n@@ -64,6 +64,7 @@\n #include \"sched-int.h\"\n #include \"target-globals.h\"\n #include \"common/common-target.h\"\n+#include \"tree-scalar-evolution.h\"\n #include \"selftest.h\"\n #include \"selftest-rtl.h\"\n \n@@ -15122,6 +15123,72 @@ aarch64_sched_can_speculate_insn (rtx_insn *insn)\n     }\n }\n \n+/* Count the strided loads in the LOOP with respect to OUT_LOOP.\n+   If the strided loads are larger (compared to MAX_STRIDED_LOADS),\n+   we dont need to compute all of them.  */\n+\n+static unsigned\n+count_mem_load_streams (struct loop *out_loop,\n+\t\t\tstruct loop *loop,\n+\t\t\tunsigned max_strided_loads)\n+{\n+  basic_block *bbs = get_loop_body (loop);\n+  unsigned nbbs = loop->num_nodes;\n+  gimple_stmt_iterator gsi;\n+  unsigned count = 0;\n+\n+  for (unsigned i = 0; i < nbbs; i++)\n+    {\n+      bool ok;\n+      basic_block bb = bbs[i];\n+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);\n+\t   gsi_next (&gsi))\n+\t{\n+\t  gimple *stmt = gsi_stmt (gsi);\n+\t  if (!is_gimple_assign (stmt)\n+\t      || !gimple_vuse (stmt))\n+\t    continue;\n+\t  tree op = gimple_assign_rhs1 (stmt);\n+\t  if (!INDIRECT_REF_P (op)\n+\t      && TREE_CODE (op) != MEM_REF\n+\t      && TREE_CODE (op) != TARGET_MEM_REF)\n+\t    continue;\n+\t  op = TREE_OPERAND (op, 0);\n+\t  tree ev = analyze_scalar_evolution (out_loop, op);\n+\t  ev = instantiate_parameters (loop, ev);\n+\t  if (no_evolution_in_loop_p (ev, out_loop->num, &ok) && !ok)\n+\t    count++;\n+\t  if (count >= max_strided_loads)\n+\t    return count;\n+\t}\n+    }\n+  return count;\n+}\n+\n+/* Target hook that prevents complete loop unrolling if this would make\n+   the outer loop's prefetch strems more than hardware can handle.  */\n+\n+static bool\n+aarch64_ok_to_unroll (struct loop *loop, unsigned HOST_WIDE_INT nunroll)\n+{\n+  struct loop *loop_father;\n+  unsigned loads;\n+  unsigned outter_loads;\n+\n+  if (aarch64_tune_params.prefetch->hw_prefetchers_avail == -1)\n+    return true;\n+\n+  if ((loop_father = loop_outer (loop)))\n+    {\n+      unsigned max_strided_loads = aarch64_tune_params.prefetch->hw_prefetchers_avail;\n+      loads = count_mem_load_streams (loop_father, loop, max_strided_loads);\n+      outter_loads = count_mem_load_streams (loop_father, loop_father, max_strided_loads);\n+      if ((outter_loads + (nunroll - 1) * loads) > max_strided_loads)\n+\treturn false;\n+    }\n+  return true;\n+}\n+\n /* Target-specific selftests.  */\n \n #if CHECKING_P\n@@ -15550,6 +15617,9 @@ aarch64_libgcc_floating_mode_supported_p\n #undef TARGET_CUSTOM_FUNCTION_DESCRIPTORS\n #define TARGET_CUSTOM_FUNCTION_DESCRIPTORS 4\n \n+#undef TARGET_OK_TO_UNROLL\n+#define TARGET_OK_TO_UNROLL aarch64_ok_to_unroll\n+\n #if CHECKING_P\n #undef TARGET_RUN_TARGET_SELFTESTS\n #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests\ndiff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi\nindex 795e492..45cea4c 100644\n--- a/gcc/doc/tm.texi\n+++ b/gcc/doc/tm.texi\n@@ -11617,6 +11617,10 @@ is required only when the target has special constraints like maximum\n number of memory accesses.\n @end deftypefn\n \n+@deftypefn {Target Hook} bool TARGET_OK_TO_UNROLL (struct loop *@var{loop_info}, unsigned HOST_WIDE_INT @var{nunroll})\n+This hook should return false if target prefers loop should not be unrolled\n+@end deftypefn\n+\n @defmac POWI_MAX_MULTS\n If defined, this macro is interpreted as a signed integer C expression\n that specifies the maximum number of floating point multiplications\ndiff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in\nindex 98f2e6b..64dfa51 100644\n--- a/gcc/doc/tm.texi.in\n+++ b/gcc/doc/tm.texi.in\n@@ -8155,6 +8155,8 @@ build_type_attribute_variant (@var{mdecl},\n \n @hook TARGET_LOOP_UNROLL_ADJUST\n \n+@hook TARGET_OK_TO_UNROLL\n+\n @defmac POWI_MAX_MULTS\n If defined, this macro is interpreted as a signed integer C expression\n that specifies the maximum number of floating point multiplications\ndiff --git a/gcc/target.def b/gcc/target.def\nindex bbd9c01..2f62328 100644\n--- a/gcc/target.def\n+++ b/gcc/target.def\n@@ -5120,6 +5120,14 @@ hardware divmod insn but defines target-specific divmod libfuncs.\",\n  void, (rtx libfunc, machine_mode mode, rtx op0, rtx op1, rtx *quot, rtx *rem),\n  NULL)\n \n+/* Target function to check complete unrolling of loop is  profitable for loop.  */\n+DEFHOOK\n+(ok_to_unroll,\n+ \"This hook should return false if target prefers loop should not be unrolled\",\n+ bool,\n+ (struct loop *loop_info, unsigned HOST_WIDE_INT nunroll),\n+ NULL)\n+\n /* Return the class for a secondary reload, and fill in extra information.  */\n DEFHOOK\n (secondary_reload,\ndiff --git a/gcc/tree-ssa-loop-ivcanon.c b/gcc/tree-ssa-loop-ivcanon.c\nindex efb199a..c2016458 100644\n--- a/gcc/tree-ssa-loop-ivcanon.c\n+++ b/gcc/tree-ssa-loop-ivcanon.c\n@@ -63,6 +63,7 @@ along with GCC; see the file COPYING3.  If not see\n #include \"tree-inline.h\"\n #include \"tree-cfgcleanup.h\"\n #include \"builtins.h\"\n+#include \"target.h\"\n \n /* Specifies types of loops that may be unrolled.  */\n \n@@ -855,6 +856,13 @@ try_unroll_loop_completely (struct loop *loop,\n \t\t     loop->num);\n \t  return false;\n \t}\n+\n+      if (targetm.ok_to_unroll\n+\t  && !targetm.ok_to_unroll (loop, n_unroll))\n+\t{\n+\t  return false;\n+\t}\n+\n       if (!n_unroll)\n         dump_printf_loc (report_flags, locus,\n                          \"loop turned into non-loop; it never loops.\\n\");\n-- \n2.7.4\n\n","prefixes":["RFC","PACH","3/5"]}