[{"id":3686987,"web_url":"http://patchwork.ozlabs.org/comment/3686987/","msgid":"<DIBF1XI69RQC.O9F9EZG729IO@gmail.com>","list_archive_url":null,"date":"2026-05-06T07:30:29","subject":"Re: [PATCH] RISC-V: Add per-type reduction costs to the vector cost\n model","submitter":{"id":86205,"url":"http://patchwork.ozlabs.org/api/people/86205/","name":"Robin Dapp","email":"rdapp.gcc@gmail.com"},"content":"Hi Wang Yaduo,\n\n> Add per-type reduction costs (i8/i16/i32/i64/f16/f32/f64) to the RISC-V\n> vector cost model, distinguishing between ordered (fold-left) and\n> unordered (tree) floating-point reductions.  When a reduction is\n> detected, the per-type cost replaces the default vec_to_scalar_cost,\n> similar to AArch64.  This causes _Float16 n=4 ordered reductions to no\n> longer be vectorized in VLS mode due to the higher cost.\n>\n> gcc/ChangeLog:\n>\n> \t* config/riscv/riscv-protos.h (common_vector_cost): Add per-type\n> \treduction cost fields: reduc_i8_cost, reduc_i16_cost,\n> \treduc_i32_cost, reduc_i64_cost, reduc_f16_cost, reduc_f32_cost,\n> \treduc_f64_cost for unordered reductions, and reduc_f16_ordered_cost,\n> \treduc_f32_ordered_cost, reduc_f64_ordered_cost for ordered\n> \t(fold-left) reductions.\n> \t* config/riscv/riscv.cc (rvv_vla_vector_cost): Initialize reduction\n> \tcost fields with default values.\n> \t(rvv_vls_vector_cost): Likewise.\n> \t* config/riscv/riscv-vector-costs.cc (costs::adjust_stmt_cost): Add\n> \treduction detection in the vec_to_scalar case.  
When a reduction is\n> \tdetected, replace the default vec_to_scalar_cost with the\n> \tappropriate per-type reduction cost based on element mode and\n> \treduction kind (ordered vs unordered).\n>\n> gcc/testsuite/ChangeLog:\n>\n> \t* gcc.target/riscv/rvv/autovec/reduc/reduc_cost-1.c: New test for\n> \tVLA unordered reduction costs.\n> \t* gcc.target/riscv/rvv/autovec/reduc/reduc_cost-2.c: New test for\n> \tVLA ordered reduction costs.\n> \t* gcc.target/riscv/rvv/autovec/vls/reduc_cost-1.c: New test for\n> \tVLS reduction costs.\n> \t* gcc.target/riscv/rvv/autovec/vls/reduc-19.c: Update expected\n> \tvfredosum count from 9 to 8.\n> \t* gcc.target/riscv/rvv/autovec/vls/wred-3.c: Update expected\n> \tvfwredosum count from 17 to 16.\n>\n> Signed-off-by: Wang Yaduo <wangyaduo@linux.alibaba.com>\n> ---\n>  gcc/config/riscv/riscv-protos.h               | 20 +++++-\n>  gcc/config/riscv/riscv-vector-costs.cc        | 68 ++++++++++++++++++-\n>  gcc/config/riscv/riscv.cc                     | 20 ++++++\n>  .../riscv/rvv/autovec/reduc/reduc_cost-1.c    | 34 ++++++++++\n>  .../riscv/rvv/autovec/reduc/reduc_cost-2.c    | 34 ++++++++++\n>  .../riscv/rvv/autovec/vls/reduc-19.c          |  4 +-\n>  .../riscv/rvv/autovec/vls/reduc_cost-1.c      | 41 +++++++++++\n>  .../gcc.target/riscv/rvv/autovec/vls/wred-3.c |  4 +-\n>  8 files changed, 219 insertions(+), 6 deletions(-)\n>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_cost-1.c\n>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_cost-2.c\n>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/reduc_cost-1.c\n>\n> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h\n> index dd029c704..5da5a6a21 100644\n> --- a/gcc/config/riscv/riscv-protos.h\n> +++ b/gcc/config/riscv/riscv-protos.h\n> @@ -279,6 +279,24 @@ struct common_vector_cost\n>  \n>    /* Cost of an unaligned vector store.  
*/\n>    const int unalign_store_cost;\n> +\n> +  /* Cost of vector reduction operations (unordered / tree reduction).\n> +     Indexed by element type.  */\n> +  const int reduc_i8_cost;\n> +  const int reduc_i16_cost;\n> +  const int reduc_i32_cost;\n> +  const int reduc_i64_cost;\n> +  const int reduc_f16_cost;\n> +  const int reduc_f32_cost;\n> +  const int reduc_f64_cost;\n\nDo we need all of those?  I'm not sure but given that they are supposed to be \nimplemented as tree reductions, the latency should not vary too much WRT the \nelement size?\n\n> +\n> +  /* Cost of ordered (fold-left / strict) floating-point reductions.\n> +     These are significantly more expensive than unordered (tree) reductions\n> +     because RVV ordered reduction instructions (e.g. vfredosum) process\n> +     elements sequentially.  */\n> +  const int reduc_f16_ordered_cost;\n> +  const int reduc_f32_ordered_cost;\n> +  const int reduc_f64_ordered_cost;\n\nSame here, I'm not entirely sure and uarchs might vary (wildly) but generally \nthese should scale linearly with the number of elements so perhaps one factor \nis enough?  Open for debate, though.\n\n>  /* scalable vectorization (VLA) specific cost.  */\n> @@ -289,7 +307,7 @@ struct scalable_vector_cost : common_vector_cost\n>    {}\n>  \n>    /* TODO: We will need more other kinds of vector cost for VLA.\n> -     E.g. fold_left reduction cost, lanes load/store cost, ..., etc.  
*/\n>  };\n\nWe have lane cost, so this comment can be removed.\n\n> --- a/gcc/config/riscv/riscv.cc\n> +++ b/gcc/config/riscv/riscv.cc\n> @@ -415,6 +415,16 @@ static const common_vector_cost rvv_vls_vector_cost = {\n>    1, /* align_store_cost  */\n>    2, /* unalign_load_cost  */\n>    2, /* unalign_store_cost  */\n> +  2, /* reduc_i8_cost  */\n> +  2, /* reduc_i16_cost  */\n> +  2, /* reduc_i32_cost  */\n> +  2, /* reduc_i64_cost  */\n> +  2, /* reduc_f16_cost  */\n> +  2, /* reduc_f32_cost  */\n> +  2, /* reduc_f64_cost  */\n> +  6, /* reduc_f16_ordered_cost  */\n> +  4, /* reduc_f32_ordered_cost  */\n> +  2, /* reduc_f64_ordered_cost  */\n>  };\n\nAny reason why the scaling is not *2 but rather +2?  I'd have expected twice \nthe work (and thus, latency) for 2x elements.  Also, even 2-6 seem rather low \ncompared to regular reductions?  Looking at the published Ascalon X numbers, \nit's more like 5, 10, 20.\n\n\n> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_cost-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_cost-1.c\n\nDistinct cost-model tests are better put into the costmodel sub directory.\n\n>  #include \"wred-2.c\"\n\n> -/* { dg-final { scan-assembler-times {vfwredosum\\.vs} 17 } } */\n> +/* The _Float16->float n=4 case is not vectorized because the ordered\n> +   reduction cost makes it unprofitable for small trip counts.  
*/\n> +/* { dg-final { scan-assembler-times {vfwredosum\\.vs} 16 } } */\n\nThis is supposed to test functionality so I'd rather keep the expectation and \nadd -fno-vect-cost-model.","headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (2048-bit key;\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20251104 header.b=Q6/S67LO;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (2048-bit key,\n unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256\n header.s=20251104 header.b=Q6/S67LO","sourceware.org;\n dmarc=pass (p=none dis=none) header.from=gmail.com","sourceware.org; spf=pass smtp.mailfrom=gmail.com","sourceware.org;\n arc=none smtp.remote-ip=2a00:1450:4864:20::42b"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g9Xdf0BCtz1yJV\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 06 May 2026 21:07:02 +1000 (AEST)","from vm01.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id E94A04BA23FA\n\tfor <incoming@patchwork.ozlabs.org>; Wed,  6 May 2026 11:06:59 +0000 (GMT)","from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com\n [IPv6:2a00:1450:4864:20::42b])\n by sourceware.org (Postfix) with ESMTPS id 27A0C4BA7986\n 
for <gcc-patches@gcc.gnu.org>; Wed,  6 May 2026 07:30:32 +0000 (GMT)","by mail-wr1-x42b.google.com with SMTP id\n ffacd0b85a97d-43d75312379so356654f8f.1\n for <gcc-patches@gcc.gnu.org>; Wed, 06 May 2026 00:30:32 -0700 (PDT)","from localhost (ip-085-216-098-084.um25.pools.vodafone-ip.de.\n [85.216.98.84]) by smtp.gmail.com with ESMTPSA id\n ffacd0b85a97d-45054b02f76sm10010974f8f.23.2026.05.06.00.30.29\n (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128);\n Wed, 06 May 2026 00:30:30 -0700 (PDT)"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org E94A04BA23FA","OpenDKIM Filter v2.11.0 sourceware.org 27A0C4BA7986"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org 27A0C4BA7986","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org 27A0C4BA7986","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1778052632; cv=none;\n b=GpmqeIz7+C7/2qltYWW76RWgocO55d0wETUnKI5gVhpfteutfB4U6uEV6uK9wwFSbpnQEJdIXigFgYv+HYBoPRRqxdlXgI2A8oHFXebZ654S6F0SNkGO3WZnKxmsQ5YG/njU9r1uW3aCRWYSWSdyfY3/iGD/zIfkUdC06CZkMkI=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1778052632; c=relaxed/simple;\n bh=mAUzGzx/x2X/P8Drv0qWQMP3M2hACW5iCaWRJS32wMo=;\n h=DKIM-Signature:Mime-Version:Date:Message-Id:Subject:To:From;\n b=UGqDGaH8cQpzs2suJQNvwUljKeNmqHomKZT50r5pjmIpAUYKjkRNmbj0ejWSwzq2JFWll1qZ+BCtO8xWtJw41vJ4sMb+mE+k79fyBH9GHeg0MspyDGnLPTnr2bGZ4IRs4HUEfGpXtm7EKiJLgzevWKt34ildA6N0s2uKQKFkg3I=","ARC-Authentication-Results":"i=1; sourceware.org;\n dkim=pass (2048-bit key, unprotected)\n header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20251104\n header.b=Q6/S67LO","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=gmail.com; s=20251104; t=1778052631; x=1778657431; darn=gcc.gnu.org;\n h=in-reply-to:references:from:to:cc:subject:message-id:date\n :content-transfer-encoding:mime-version:from:to:cc:subject:date\n :message-id:reply-to;\n bh=U/v+uoRHBJeSLVcK1hHMeE99MsaKLRdnjRTU34WoN+Y=;\n 
b=Q6/S67LOLZ49e7FBvrQOgXmsR6dqkjUsfh5Cjg7P6AC3hILsEwkbv0RhZlx+TZg1Cq\n XhZhnpDokIm9ng13dAVIfvfLGGpZgYtzWbTL/fm0dsgBgkLaxgfaXDtW5aSqAS2dZWIa\n IHo3RmVTUOLe+GzqpJQeUPBCLIgw3HkYHaG0vyfonfP3DbPWPGEBC5FQM/3zl3COUvv4\n RFIBDvRjVlRTVv8aFGCngX24YEyKti9I5NzYuORXmT0jiF7qr0cJjvAEH4DNoEM9+Zg6\n z7Ytj76D/2Z4/cYNt46Nkad1EpM9pWENHF6EBvd8FAaA0lugjKL+zWEPYxKQtnVRZ/q9\n zLoQ==","X-Google-DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=1e100.net; s=20251104; t=1778052631; x=1778657431;\n h=in-reply-to:references:from:to:cc:subject:message-id:date\n :content-transfer-encoding:mime-version:x-gm-gg:x-gm-message-state\n :from:to:cc:subject:date:message-id:reply-to;\n bh=U/v+uoRHBJeSLVcK1hHMeE99MsaKLRdnjRTU34WoN+Y=;\n b=Nr2UDn6NRh4byg6QxjmdjbOeACzpuTgffesCqz0ghCb+zmxW5+ZaSzEs0LnX455pq/\n BuAr8gutZKc/Ki3OgrGg4xbQLCHnQHyZT/DYqATQ3Dr4JWIO3x6x5Fzi3vtBZogrKE+L\n 303RVGMMgDdl+Fjo6xYZnJ3kYniW97b8c4H9kSN63boPO+9gySzSgmdTX2yBMgw+EVoG\n aC40GmmtszaD89T4emohNY+GxHTvxVTLBFy/XF7AhNFRSifscvEt2tKzhmD59PnAcjih\n u64oT+CLuDdYUhZR1EUPLfZeiQYfn0WmlaOfY96stc9h8OhVglnZq3ty0icP/CzzSt+V\n oKwg==","X-Forwarded-Encrypted":"i=1;\n AFNElJ+J5pgLa+Wsy2ZqZzM0lAaSZN3mypZnN416R3sTBXO8LCqTydIutRCfeCsYhGqIxg1TQAMPTIfdT6iKaQ==@gcc.gnu.org","X-Gm-Message-State":"AOJu0Yy2uUscVyhpUeTKYOuke5oD0T9r+6aXDO13NFqwu/quDCCbI/Xa\n hRO/lxeaZ4/W82uDRZL+UH+WhUtPBOhiuXEyE0Ab8bmsShrHghNZ5J7A","X-Gm-Gg":"AeBDiesURJU0MMjZWGjZadHP4iOjfjQGLvDrMnAFSnM1HiMDcMPB36io8Txu32dtI29\n U64opvW7zfWa4lUajZ98A6J5yKmPSIwSpfi2bxpv90a58HnZUoI8fNK5F0Ao3nRB0Gp44phKy84\n 2+i7Astqprgd1usg0jZMvy4snOoozuzSumb3FubabYZyLjCQnfHXUffIq2CAvvm7Dt1VHF/0zpt\n 5Voj+zcUSA14qPJOuGIHMM3cGAwUsA+s4zqcFeRsiM3CXkThzGfP+DiIijYs3DDAk5V0Prgere1\n zE7+MJPD+NOvXtMXyniFBV7eLo9wBKZyhlA5LC7L7PNHn2AvsYGHBLHmqqGUrfy1nk0WIYX3tMg\n P1O/KYGFpq+ozU3R7/co262oJvsLBeuzBdpcLGQIVYiZqVNzZjUg70uOfWGMXea1jj2T6TSOwWt\n rA8tCP7FbJ9FK3s5Qfws7zZqT1PuNL7UHT0kXtXLKG7bOsvJwhXj3H1Voc824yMakQdTvMU0ZAF\n QyNkug=","X-Received":"by 2002:a05:6000:2a82:b0:451:ccce:33a9 with SMTP id\n 
ffacd0b85a97d-451ccce379dmr1044121f8f.27.1778052630709;\n Wed, 06 May 2026 00:30:30 -0700 (PDT)","Mime-Version":"1.0","Content-Transfer-Encoding":"quoted-printable","Content-Type":"text/plain; charset=UTF-8","Date":"Wed, 06 May 2026 09:30:29 +0200","Message-Id":"<DIBF1XI69RQC.O9F9EZG729IO@gmail.com>","Subject":"Re: [PATCH] RISC-V: Add per-type reduction costs to the vector cost\n model","Cc":"<rdapp.gcc@gmail.com>, <kito.cheng@gmail.com>, <juzhe.zhong@rivai.ai>,\n <palmer@dabbelt.com>, <pan2.li@intel.com>, <jeffreyalaw@gmail.com>,\n <chenzhongyao.hit@gmail.com>","To":"\"Wang Yaduo\" <wangyaduo@linux.alibaba.com>, <gcc-patches@gcc.gnu.org>","From":"\"Robin Dapp\" <rdapp.gcc@gmail.com>","References":"<20260506070415.99154-1-wangyaduo@linux.alibaba.com>","In-Reply-To":"<20260506070415.99154-1-wangyaduo@linux.alibaba.com>","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"}},{"id":3686990,"web_url":"http://patchwork.ozlabs.org/comment/3686990/","msgid":"<8ad45887-1139-4ef8-8b52-0e4d7ab4526d.wangyaduo@linux.alibaba.com>","list_archive_url":null,"date":"2026-05-06T08:37:20","subject":"=?utf-8?q?=E5=9B=9E=E5=A4=8D=EF=BC=9A=5BPATCH=5D_RISC-V=3A_Add_per-?=\n\t=?utf-8?q?type_reduction_costs_to_the_vector_cost_model?=","submitter":{"id":93342,"url":"http://patchwork.ozlabs.org/api/people/93342/","name":"Wang Yaduo","email":"wangyaduo@linux.alibaba.com"},"content":"Hi 
Robin,\n Thanks for your reply.\n> > +\n> > + /* Cost of vector reduction operations (unordered / tree reduction).\n> > + Indexed by element type. */\n> > + const int reduc_i8_cost;\n> > + const int reduc_i16_cost;\n> > + const int reduc_i32_cost;\n> > + const int reduc_i64_cost;\n> > + const int reduc_f16_cost;\n> > + const int reduc_f32_cost;\n> > + const int reduc_f64_cost;\n> Do we need all of those? I'm not sure but given that they are supposed to be \n> implemented as tree reductions, the latency should not vary too much WRT the \n> element size?\nThis is inspired by `sve_vec_cost` of aarch64. It might not be necessary for generic or rocket, but some uarchs may implement reductions differently, so I think it's fine to keep all of these types to help downstream users tune for their uarchs.\n> > +\n> > + /* Cost of ordered (fold-left / strict) floating-point reductions.\n> > + These are significantly more expensive than unordered (tree) reductions\n> > + because RVV ordered reduction instructions (e.g. vfredosum) process\n> > + elements sequentially. */\n> > + const int reduc_f16_ordered_cost;\n> > + const int reduc_f32_ordered_cost;\n> > + const int reduc_f64_ordered_cost;\n> Same here, I'm not entirely sure and uarchs might vary (wildly) but generally \n> these should scale linearly with the number of elements so perhaps one factor \n> is enough? Open for debate, though.\nThe ordered cost can differ depending on the element size. As far as I can see on the XuanTie C950 (the mcpu and mtune will be committed in a later patch), it does not scale linearly.\n> > /* scalable vectorization (VLA) specific cost. */\n> > @@ -289,7 +307,7 @@ struct scalable_vector_cost : common_vector_cost\n> > {}\n> > \n> > /* TODO: We will need more other kinds of vector cost for VLA.\n> > - E.g. fold_left reduction cost, lanes load/store cost, ..., etc. */\n> > + E.g. lanes load/store cost, ..., etc. */\n> > };\n> We have lane cost, so this comment can be removed. 
\nGot it, I will fix this once all decisions are made.\n> > --- a/gcc/config/riscv/riscv.cc\n> > +++ b/gcc/config/riscv/riscv.cc\n> > @@ -415,6 +415,16 @@ static const common_vector_cost rvv_vls_vector_cost = {\n> > 1, /* align_store_cost */\n> > 2, /* unalign_load_cost */\n> > 2, /* unalign_store_cost */\n> > + 2, /* reduc_i8_cost */\n> > + 2, /* reduc_i16_cost */\n> > + 2, /* reduc_i32_cost */\n> > + 2, /* reduc_i64_cost */\n> > + 2, /* reduc_f16_cost */\n> > + 2, /* reduc_f32_cost */\n> > + 2, /* reduc_f64_cost */\n> > + 6, /* reduc_f16_ordered_cost */\n> > + 4, /* reduc_f32_ordered_cost */\n> > + 2, /* reduc_f64_ordered_cost */\n> > };\n> Any reason why the scaling is not *2 but rather +2? I'd have expected twice \n> the work (and thus, latency) for 2x elements. Also, even 2-6 seem rather low \n> compared to regular reductions? Looking at the published Ascalon X numbers, \n> it's more like 5, 10, 20.\nYes, I agree with you. Since this is a common cost, I set these costs low so that they would not have too much effect. On our uarchs, it might be 10 ~ 20 depending on the element size. Would 20 for f16, 10 for f32 and 5 for f64 be acceptable to you, or does anyone have another opinion?\n> > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_cost-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_cost-1.c\n> Distinct cost-model tests are better put into the costmodel sub directory.\n> > #include \"wred-2.c\"\n> > -/* { dg-final { scan-assembler-times {vfwredosum\\.vs} 17 } } */\n> > +/* The _Float16->float n=4 case is not vectorized because the ordered\n> > + reduction cost makes it unprofitable for small trip counts. 
*/\n> > +/* { dg-final { scan-assembler-times {vfwredosum\\.vs} 16 } } */\n> This is supposed to test functionality so I'd rather keep the expectation and \n> add -fno-vect-cost-model.\nI will resolve these later.","headers":{"Return-Path":"<gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>","X-Original-To":["incoming@patchwork.ozlabs.org","gcc-patches@gcc.gnu.org"],"Delivered-To":["patchwork-incoming@legolas.ozlabs.org","gcc-patches@gcc.gnu.org"],"Authentication-Results":["legolas.ozlabs.org;\n\tdkim=pass (1024-bit key;\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=LZARbHS/;\n\tdkim-atps=neutral","legolas.ozlabs.org;\n spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org\n (client-ip=2620:52:6:3111::32; helo=vm01.sourceware.org;\n envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;\n receiver=patchwork.ozlabs.org)","sourceware.org;\n\tdkim=pass (1024-bit key,\n unprotected) header.d=linux.alibaba.com header.i=@linux.alibaba.com\n header.a=rsa-sha256 header.s=default header.b=LZARbHS/","sourceware.org; dmarc=pass (p=none dis=none)\n header.from=linux.alibaba.com","sourceware.org;\n spf=pass smtp.mailfrom=linux.alibaba.com","sourceware.org; arc=none smtp.remote-ip=115.124.30.124"],"Received":["from vm01.sourceware.org (vm01.sourceware.org\n [IPv6:2620:52:6:3111::32])\n\t(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)\n\t key-exchange x25519 server-signature ECDSA (secp384r1) server-digest SHA384)\n\t(No client certificate requested)\n\tby legolas.ozlabs.org (Postfix) with ESMTPS id 4g9Xj80cRxz1yJV\n\tfor <incoming@patchwork.ozlabs.org>; Wed, 06 May 2026 21:10:04 +1000 (AEST)","from vm01.sourceware.org (localhost [IPv6:::1])\n\tby sourceware.org (Postfix) with ESMTP id D9A0D4BA23C6\n\tfor <incoming@patchwork.ozlabs.org>; Wed,  6 May 2026 11:10:01 +0000 (GMT)","from out30-124.freemail.mail.aliyun.com\n (out30-124.freemail.mail.aliyun.com 
[115.124.30.124])\n by sourceware.org (Postfix) with ESMTPS id EAC6F4BA23EC\n for <gcc-patches@gcc.gnu.org>; Wed,  6 May 2026 08:37:31 +0000 (GMT)","from WS-web\n (wangyaduo@linux.alibaba.com[W4_0.2.3_v5ForWebDing_2125130F_1778053945070_o7001c903]\n cluster:ay36) at Wed, 06 May 2026 16:37:20 +0800"],"DKIM-Filter":["OpenDKIM Filter v2.11.0 sourceware.org D9A0D4BA23C6","OpenDKIM Filter v2.11.0 sourceware.org EAC6F4BA23EC"],"DMARC-Filter":"OpenDMARC Filter v1.4.2 sourceware.org EAC6F4BA23EC","ARC-Filter":"OpenARC Filter v1.0.0 sourceware.org EAC6F4BA23EC","ARC-Seal":"i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1778056652; cv=none;\n b=O17obUrjMibT8NdBsJcrvsmvGHCtJ7ncBJspUyXqT6v0TT43VOS+n8NFwjQsMjOuDWYVX4jiWiGzQLgoyKuNViOc86oenPL6i3wlY+yGn+W+GlBkhDUENNODOVjpIRYMFDe7x6RdDOUzxuhTEvtnaph8PeSfDNmt3c0gkOW+JXI=","ARC-Message-Signature":"i=1; a=rsa-sha256; d=sourceware.org; s=key;\n t=1778056652; c=relaxed/simple;\n bh=E40LFCzPjUmRgrVqhZ3Bcoo5NC7C4Wzbu0X6OzoAc1Q=;\n h=DKIM-Signature:Date:From:To:Message-ID:Subject:MIME-Version;\n b=CCLQVAfON3+gnBf5qTZNwg61t1iDH46NLLBOczK3eq7PPjJmktsVwxjRU6Pobj7b/DR2l7Igwvq2f3dVfPbhFOSHn1RPZcub8fHlfPqbNoqwmxHC3zzwzxWz30g2HhLdn1YZ9GtAEkoDGwyBgCDPU8XQjTxUQB/V1Oa46PPcDOw=","ARC-Authentication-Results":"i=1; sourceware.org;\n dkim=pass (1024-bit key, unprotected)\n header.d=linux.alibaba.com header.i=@linux.alibaba.com header.a=rsa-sha256\n header.s=default header.b=LZARbHS/","DKIM-Signature":"v=1; a=rsa-sha256; c=relaxed/relaxed;\n d=linux.alibaba.com; s=default;\n t=1778056648; h=Date:From:To:Message-ID:Subject:MIME-Version:Content-Type;\n bh=E40LFCzPjUmRgrVqhZ3Bcoo5NC7C4Wzbu0X6OzoAc1Q=;\n b=LZARbHS/3z/GLgqHFekf78jWhoMtzQQmeMMqmbGj20fQ5x3f8qfXzG3ceM0uXAI81cyZgPjgEpsYE3Lx8JmvKiYSq+Jjyrz0PIs9m42lKdeVE1YFtqahcy3g67mlGRvYd2RZpJRsuYb3jHBbWnEdNkLI2lPf1HcKUW3KmfBml4s=","X-Alimail-AntiSpam":"AC=PASS; BC=-1|-1; BR=01201311R271e4; CH=green;\n DM=||false|;\n DS=||; FP=0|-1|-1|-1|0|-1|-1|-1; HT=maildocker-contentspam033037033178;\n 
MF=wangyaduo@linux.alibaba.com; NM=1; PH=DW; RN=8; SR=0;\n TI=W4_0.2.3_v5ForWebDing_2125130F_1778053945070_o7001c903;","Date":"Wed, 06 May 2026 16:37:20 +0800","From":"\"Wang Yaduo\" <wangyaduo@linux.alibaba.com>","To":"\"Robin Dapp\" <rdapp.gcc@gmail.com>,\n \"gcc-patches\" <gcc-patches@gcc.gnu.org>","Cc":"\"kito.cheng\" <kito.cheng@gmail.com>, \"juzhe.zhong\" <juzhe.zhong@rivai.ai>,\n \"palmer\" <palmer@dabbelt.com>, \"pan2.li\" <pan2.li@intel.com>,\n \"jeffreyalaw\" <jeffreyalaw@gmail.com>,\n \"chenzhongyao.hit\" <chenzhongyao.hit@gmail.com>","Message-ID":"<8ad45887-1139-4ef8-8b52-0e4d7ab4526d.wangyaduo@linux.alibaba.com>","Subject":"=?utf-8?q?=E5=9B=9E=E5=A4=8D=EF=BC=9A=5BPATCH=5D_RISC-V=3A_Add_per-?=\n\t=?utf-8?q?type_reduction_costs_to_the_vector_cost_model?=","X-Mailer":"[Alimail-Mailagent revision 745868][W4_0.2.3][v5ForWebDing][Chrome]","MIME-Version":"1.0","x-aliyun-im-through":"{\"version\":\"v1.0\"}","References":"<20260506070415.99154-1-wangyaduo@linux.alibaba.com>,\n <DIBF1XI69RQC.O9F9EZG729IO@gmail.com>","x-aliyun-mail-creator":"\n W4_0.2.3_v5ForWebDing_QvNTW96aWxsYS81LjAgKE1hY2ludG9zaDsgSW50ZWwgTWFjIE9TIFggMTBfMTVfNykgQXBwbGVXZWJLaXQvNTM3LjM2IChLSFRNTCwgbGlrZSBHZWNrbykgQ2hyb21lLzE0Ny4wLjAuMCBTYWZhcmkvNTM3LjM2La","In-Reply-To":"<DIBF1XI69RQC.O9F9EZG729IO@gmail.com>","x-aliyun-mailtrack":"{\"foreign-track\":\"0\"}","Content-Type":"multipart/alternative;\n boundary=\"----=ALIBOUNDARY_2493_7fbc3cec2700_69fafdc0_873\"","X-BeenThere":"gcc-patches@gcc.gnu.org","X-Mailman-Version":"2.1.30","Precedence":"list","List-Id":"Gcc-patches mailing list <gcc-patches.gcc.gnu.org>","List-Unsubscribe":"<https://gcc.gnu.org/mailman/options/gcc-patches>,\n <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>","List-Archive":"<https://gcc.gnu.org/pipermail/gcc-patches/>","List-Post":"<mailto:gcc-patches@gcc.gnu.org>","List-Help":"<mailto:gcc-patches-request@gcc.gnu.org?subject=help>","List-Subscribe":"<https://gcc.gnu.org/mailman/listinfo/gcc-patches>,\n 
<mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>","Reply-To":"Wang Yaduo <wangyaduo@linux.alibaba.com>","Errors-To":"gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org"}}]