{"id":2229521,"url":"http://patchwork.ozlabs.org/api/1.1/patches/2229521/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/patch/20260428114731.3C21B4BB3BB5@sourceware.org/","project":{"id":17,"url":"http://patchwork.ozlabs.org/api/1.1/projects/17/?format=json","name":"GNU Compiler Collection","link_name":"gcc","list_id":"gcc-patches.gcc.gnu.org","list_email":"gcc-patches@gcc.gnu.org","web_url":null,"scm_url":null,"webscm_url":null},"msgid":"<20260428114731.3C21B4BB3BB5@sourceware.org>","date":"2026-04-28T11:47:03","name":"[x86] override vector_costs::better_main_loop_than_p","commit_ref":null,"pull_url":null,"state":"new","archived":false,"hash":"c85075d30a910d8a9b6200921501f9e7e5dda5d1","submitter":{"id":4338,"url":"http://patchwork.ozlabs.org/api/1.1/people/4338/?format=json","name":"Richard Biener","email":"rguenther@suse.de"},"delegate":null,"mbox":"http://patchwork.ozlabs.org/project/gcc/patch/20260428114731.3C21B4BB3BB5@sourceware.org/mbox/","series":[{"id":501832,"url":"http://patchwork.ozlabs.org/api/1.1/series/501832/?format=json","web_url":"http://patchwork.ozlabs.org/project/gcc/list/?series=501832","date":"2026-04-28T11:47:03","name":"[x86] override vector_costs::better_main_loop_than_p","version":1,"mbox":"http://patchwork.ozlabs.org/series/501832/mbox/"}],"comments":"http://patchwork.ozlabs.org/api/patches/2229521/comments/","check":"pending","checks":"http://patchwork.ozlabs.org/api/patches/2229521/checks/","tags":{},"headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=RHtRe+YJ;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n 
header.s=susede2_ed25519 header.b=7yhwxFD7;\n\tdkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de\n header.a=rsa-sha256 header.s=susede2_rsa header.b=RHtRe+YJ;\n\tdkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=7yhwxFD7;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=suse.de header.i=@suse.de header.a=rsa-sha256\n header.s=susede2_rsa header.b=RHtRe+YJ;\n\tdkim=pass header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=7yhwxFD7;\n\tdkim=pass (1024-bit key) header.d=suse.de header.i=@suse.de\n header.a=rsa-sha256 header.s=susede2_rsa header.b=RHtRe+YJ;\n\tdkim=neutral header.d=suse.de header.i=@suse.de header.a=ed25519-sha256\n header.s=susede2_ed25519 header.b=7yhwxFD7","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=suse.de","sourceware.org; spf=pass smtp.mailfrom=suse.de","server2.sourceware.org;\n arc=none smtp.remote-ip=195.135.223.131","smtp-out2.suse.de;\n\tnone"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g4dw639xGz1xrS\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 28 Apr 2026 21:47:33 +1000 (AEST)","from vm01.sourceware.org (localhost [127.0.0.1])\n\tby sourceware.org (Postfix) with ESMTP id 3C21B4BB3BB5\n\tfor <incoming@patchwork.ozlabs.org>; Tue, 28 Apr 2026 11:47:31 +0000 (GMT)","from smtp-out2.suse.de (smtp-out2.suse.de [195.135.223.131])\n by sourceware.org 
(Postfix) with ESMTPS id C36724BA2E04\n for <gcc-patches@gcc.gnu.org>; Tue, 28 Apr 2026 11:47:04 +0000 (GMT)","from murzim.nue2.suse.org (unknown [10.168.4.243])\n (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest\n SHA256)\n (No client certificate requested)\n by smtp-out2.suse.de (Postfix) with ESMTPS id 9C9CA5BCD7;\n Tue, 28 Apr 2026 11:47:03 +0000 (UTC)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org 3C21B4BB3BB5","OpenDKIM Filter v2.11.0 sourceware.org C36724BA2E04"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org C36724BA2E04","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org C36724BA2E04","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1777376824; cv=none;\n b=ZoikT+qqKUrfD3LUr1PzJdTGIoZLrGxLeyoVYK3gzlKdh5HkX38E/KvpCVYMxfUKmRMtGXvUZ3czBsMQbX2c21hvHxZpqmNH9myGrzHa3IZVx84ua886sBE6FQikUJk2yGWlFlxIQkRKI6M6TpNWQgGjH1XhiMBwBefbqdpDOQM=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1777376824; c=relaxed/simple;\n bh=qda/KVzWf7P66KFyRz5kCR7rDJoN/vhpcfkK/ipFjPI=;\n h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date:\n From:To:Subject:MIME-Version;\n b=pZyOCcIMfbPYKAfES6sVMa69oHpX+YzvQnWpBPs0B1S8I9WlNq5+kFMURhe4kUMY0uebccoYFgd3ywZvj+9sGbzHBahRrVWKah+ZK7RUjtBNS9IbeFYZSnkoV6IxjQG/8eBEhdgu4BgP8JlYwGf7NdW+ALAloB5Rr628VmcxPA0=","ARC-Authentication-Results":"i=1; server2.sourceware.org","DKIM-Signature":["v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1777376823;\n h=from:from:reply-to:date:date:to:to:cc:cc:mime-version:mime-version:\n content-type:content-type; bh=j6r6h6vJvC71P9PXva7Yh6aRYsh2OvF0N5BpYTyz2mE=;\n b=RHtRe+YJ2EM7d+bFFRfcn0aUHuNr/kviIONFNStQwRomjI0uGiRjyd4tVrwgVW9/gLzw9I\n gYiL2DrjJ4iDO2Dz2tbBFfjiHAGAPS1ZcDR2ABB1C6L/k+9Az+hJ7b8mX7ShDyOD/cNdZ6\n 78OxhhUPg2fJ3C2HHJI2yb+SZCTXUpo=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1777376823;\n 
h=from:from:reply-to:date:date:to:to:cc:cc:mime-version:mime-version:\n content-type:content-type; bh=j6r6h6vJvC71P9PXva7Yh6aRYsh2OvF0N5BpYTyz2mE=;\n b=7yhwxFD7e1DZMIQXbZuAXS0/J6zWkSfTAP4FqAGNuJyHbkJlqnqYcQ8vYEU5O2AA5nHFCN\n SzTYX7K9ZnlrnZCA==","v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_rsa;\n t=1777376823;\n h=from:from:reply-to:date:date:to:to:cc:cc:mime-version:mime-version:\n content-type:content-type; bh=j6r6h6vJvC71P9PXva7Yh6aRYsh2OvF0N5BpYTyz2mE=;\n b=RHtRe+YJ2EM7d+bFFRfcn0aUHuNr/kviIONFNStQwRomjI0uGiRjyd4tVrwgVW9/gLzw9I\n gYiL2DrjJ4iDO2Dz2tbBFfjiHAGAPS1ZcDR2ABB1C6L/k+9Az+hJ7b8mX7ShDyOD/cNdZ6\n 78OxhhUPg2fJ3C2HHJI2yb+SZCTXUpo=","v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.de;\n s=susede2_ed25519; t=1777376823;\n h=from:from:reply-to:date:date:to:to:cc:cc:mime-version:mime-version:\n content-type:content-type; bh=j6r6h6vJvC71P9PXva7Yh6aRYsh2OvF0N5BpYTyz2mE=;\n b=7yhwxFD7e1DZMIQXbZuAXS0/J6zWkSfTAP4FqAGNuJyHbkJlqnqYcQ8vYEU5O2AA5nHFCN\n SzTYX7K9ZnlrnZCA=="],"Date":"Tue, 28 Apr 2026 13:47:03 +0200 (CEST)","From":"Richard Biener <rguenther@suse.de>","To":"gcc-patches@gcc.gnu.org","cc":"hongtao.liu@intel.com","Subject":"[PATCH] [x86] override vector_costs::better_main_loop_than_p","MIME-Version":"1.0","Content-Type":"text/plain; charset=US-ASCII","X-Spamd-Result":"default: False [-1.80 / 50.00]; BAYES_HAM(-3.00)[100.00%];\n MISSING_MID(2.50)[]; NEURAL_HAM_LONG(-1.00)[-1.000];\n NEURAL_HAM_SHORT(-0.20)[-0.996]; MIME_GOOD(-0.10)[text/plain];\n ARC_NA(0.00)[]; FROM_HAS_DN(0.00)[];\n FUZZY_RATELIMITED(0.00)[rspamd.com]; MISSING_XM_UA(0.00)[];\n RCVD_COUNT_ZERO(0.00)[0]; RCPT_COUNT_TWO(0.00)[2];\n DKIM_SIGNED(0.00)[suse.de:s=susede2_rsa,suse.de:s=susede2_ed25519];\n FROM_EQ_ENVFROM(0.00)[]; MIME_TRACE(0.00)[0:+];\n TO_DN_NONE(0.00)[]; TO_MATCH_ENVRCPT_ALL(0.00)[];\n 
DBL_BLOCKED_OPENRESOLVER(0.00)[gcc.target:url,murzim.nue2.suse.org:helo]","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org","Message-Id":"<20260428114731.3C21B4BB3BB5@sourceware.org>"},"content":"This overrides vector_costs::better_main_loop_than_p to avoid\nregressing gcc.target/i386/vect-partial-vectors-2.c with\n--param ix86-vect-compare-costs=1.  As the user (or a tuning model)\nasks for masked epilogs, the vectorizer considers masking the\nmain loop in case it effectively works as a standalone vector epilog\ndue to a known small number of iterations of the loop.  While the\ngeneric cost compare rightly figures that masking with AVX is more\nexpensive than not masking with SSE, it does not consider the cost of\nthe epilog.\n\nThis compensates with an x86-specific heuristic that prefers the\nmasked loop if the loop cannot be vectorized with a non-masked\nmain loop and at most a single vector epilog plus a single scalar\nepilog iteration.  This is a reasonable heuristic for x86 and\na small number of iterations as icache footprint matters here,\nso considering the possibility of 3 vector epilogs and 1 scalar\niteration does not look profitable.  
Unless testcases will prove\nto us otherwise.\n\nI'm not sure if it makes sense to preserve --param ix86-vect-compare-costs=0\nin the end; if people think so, I'll duplicate the testcase with\nboth modes explicitly specified.\n\nBootstrapped and tested on x86_64-unknown-linux-gnu (on top of the\npatch enabling vector cost compare by default).\n\nAny feedback?  I'm looking to override better_epilogue_loop_than_p\nto mitigate the rest of the fallout.\n\nOK?\n\nThanks,\nRichard.\n\n\t* tree-vectorizer.h (vector_costs::vinfo): New accessor.\n\t* config/i386/i386.cc (ix86_vector_costs::better_main_loop_than_p):\n\tPrefer a masked main loop if we can elide enough of (vector)\n\tepilog loop iterations.\n---\n gcc/config/i386/i386.cc | 23 +++++++++++++++++++++++\n gcc/tree-vectorizer.h   |  3 +++\n 2 files changed, 26 insertions(+)","diff":"diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc\nindex cb37318ed24..5af670a90bd 100644\n--- a/gcc/config/i386/i386.cc\n+++ b/gcc/config/i386/i386.cc\n@@ -26124,6 +26124,7 @@ public:\n \t\t\t      tree vectype, int misalign,\n \t\t\t      vect_cost_model_location where) override;\n   void finish_cost (const vector_costs *) override;\n+  bool better_main_loop_than_p (const vector_costs *) const override;\n \n private:\n \n@@ -26983,6 +26984,28 @@ ix86_vector_costs::finish_cost (const vector_costs *scalar_costs)\n   vector_costs::finish_cost (scalar_costs);\n }\n \n+/* Return true if THIS should be preferred over OTHER as main vector loop.  */\n+\n+bool\n+ix86_vector_costs::better_main_loop_than_p (const vector_costs *other) const\n+{\n+  loop_vec_info this_loop_vinfo = as_a<loop_vec_info> (this->vinfo ());\n+  loop_vec_info other_loop_vinfo = as_a<loop_vec_info> (other->vinfo ());\n+\n+  /* If the other loop is masked it does not need an epilog.  Prefer that\n+     if the current loop cannot be vectorized fully with a vector\n+     epilogs with at most one scalar iteration left.  
*/\n+  if (LOOP_VINFO_NITERS_KNOWN_P (this_loop_vinfo)\n+      && LOOP_VINFO_USING_PARTIAL_VECTORS_P (other_loop_vinfo)\n+      && known_gt (LOOP_VINFO_VECT_FACTOR (other_loop_vinfo),\n+\t\t   LOOP_VINFO_INT_NITERS (this_loop_vinfo))\n+      && (popcount_hwi (LOOP_VINFO_INT_NITERS (this_loop_vinfo) & ~1)\n+\t  > (param_vect_epilogues_nomask != 0)))\n+    return false;\n+\n+  return vector_costs::better_main_loop_than_p (other);\n+}\n+\n /* Validate target specific memory model bits in VAL. */\n \n static unsigned HOST_WIDE_INT\ndiff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h\nindex 1d20725c69e..9a78cd33619 100644\n--- a/gcc/tree-vectorizer.h\n+++ b/gcc/tree-vectorizer.h\n@@ -1799,8 +1799,11 @@ public:\n   unsigned int epilogue_cost () const;\n   unsigned int outside_cost () const;\n   unsigned int total_cost () const;\n+\n   unsigned int suggested_unroll_factor () const;\n   machine_mode suggested_epilogue_mode (int &masked) const;\n+\n+  vec_info *vinfo () const { return m_vinfo; }\n   bool costing_for_scalar () const { return m_costing_for_scalar; }\n \n protected:\n","prefixes":["x86"]}