Message ID | CAK=A3=3_9_i_xPNPDxOHVc-+r4cnCu4tBSFPou0YYD=QNoBULQ@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Thu, 21 Nov 2013, Cong Hou wrote: > While I added the new define_insn_and_split for vec_merge, a bug is > exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] > only takes one input, but the corresponding builtin functions have two > inputs, which are shown in i386.c: > > { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, > "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, > (int)MULTI_ARG_2_SF }, > { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, > "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, > (int)MULTI_ARG_2_DF }, > > In consequence, the ix86_expand_multi_arg_builtin() function tries to > check two args but based on the define_expand of xop_vmfrcz<mode>2, > the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be > incorrect (because it only needs one input). > > The patch below fixed this issue. > > Bootstrapped and tested on ax x86-64 machine. Note that this patch > should be applied before the one I sent earlier (sorry for sending > them in wrong order). This is PR 56788. Your patch seems strange to me and I don't think it fixes the real issue, but I'll let more knowledgeable people answer.
On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Thu, 21 Nov 2013, Cong Hou wrote: > >> While I added the new define_insn_and_split for vec_merge, a bug is >> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >> only takes one input, but the corresponding builtin functions have two >> inputs, which are shown in i386.c: >> >> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >> (int)MULTI_ARG_2_SF }, >> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >> (int)MULTI_ARG_2_DF }, >> >> In consequence, the ix86_expand_multi_arg_builtin() function tries to >> check two args but based on the define_expand of xop_vmfrcz<mode>2, >> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >> incorrect (because it only needs one input). >> >> The patch below fixed this issue. >> >> Bootstrapped and tested on ax x86-64 machine. Note that this patch >> should be applied before the one I sent earlier (sorry for sending >> them in wrong order). > > > This is PR 56788. Your patch seems strange to me and I don't think it > fixes the real issue, but I'll let more knowledgeable people answer. Thank you for pointing out the bug report. This patch is not intended to fix PR56788. For your function: #include <x86intrin.h> __m128d f(__m128d x, __m128d y){ return _mm_frcz_sd(x,y); } Note that the second parameter is ignored intentionally, but the prototype of this function contains two parameters. My fix is explicitly telling GCC that the optab xop_vmfrczv4sf3 should have three operands instead of two, to let it have the correct information in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to match the type of the second parameter in the builtin function in ix86_expand_multi_arg_builtin(). thanks, Cong > > -- > Marc Glisse
On Thu, 21 Nov 2013, Cong Hou wrote: > On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >> On Thu, 21 Nov 2013, Cong Hou wrote: >> >>> While I added the new define_insn_and_split for vec_merge, a bug is >>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >>> only takes one input, but the corresponding builtin functions have two >>> inputs, which are shown in i386.c: >>> >>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >>> (int)MULTI_ARG_2_SF }, >>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >>> (int)MULTI_ARG_2_DF }, >>> >>> In consequence, the ix86_expand_multi_arg_builtin() function tries to >>> check two args but based on the define_expand of xop_vmfrcz<mode>2, >>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >>> incorrect (because it only needs one input). >>> >>> The patch below fixed this issue. >>> >>> Bootstrapped and tested on ax x86-64 machine. Note that this patch >>> should be applied before the one I sent earlier (sorry for sending >>> them in wrong order). >> >> >> This is PR 56788. Your patch seems strange to me and I don't think it >> fixes the real issue, but I'll let more knowledgeable people answer. > > > Thank you for pointing out the bug report. This patch is not intended > to fix PR56788. IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the associated builtin, which would solve your issue as well. > For your function: > > #include <x86intrin.h> > __m128d f(__m128d x, __m128d y){ > return _mm_frcz_sd(x,y); > } > > Note that the second parameter is ignored intentionally, but the > prototype of this function contains two parameters. My fix is > explicitly telling GCC that the optab xop_vmfrczv4sf3 should have > three operands instead of two, to let it have the correct information > in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to > match the type of the second parameter in the builtin function in > ix86_expand_multi_arg_builtin(). I disagree that this is intentional, it is a bug. AFAIK there is no AMD documentation that could be used as a reference for what _mm_frcz_sd is supposed to do. The only existing documentations are by Microsoft (which does *not* ignore the second argument) and by LLVM (which has a single argument). Whatever we chose for _mm_frcz_sd, the builtin should take a single argument, and if necessary we'll use 2 builtins to implement _mm_frcz_sd.
On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote: > On Thu, 21 Nov 2013, Cong Hou wrote: > >> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> >>> On Thu, 21 Nov 2013, Cong Hou wrote: >>> >>>> While I added the new define_insn_and_split for vec_merge, a bug is >>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >>>> only takes one input, but the corresponding builtin functions have two >>>> inputs, which are shown in i386.c: >>>> >>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >>>> (int)MULTI_ARG_2_SF }, >>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >>>> (int)MULTI_ARG_2_DF }, >>>> >>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to >>>> check two args but based on the define_expand of xop_vmfrcz<mode>2, >>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >>>> incorrect (because it only needs one input). >>>> >>>> The patch below fixed this issue. >>>> >>>> Bootstrapped and tested on ax x86-64 machine. Note that this patch >>>> should be applied before the one I sent earlier (sorry for sending >>>> them in wrong order). >>> >>> >>> >>> This is PR 56788. Your patch seems strange to me and I don't think it >>> fixes the real issue, but I'll let more knowledgeable people answer. >> >> >> >> Thank you for pointing out the bug report. This patch is not intended >> to fix PR56788. > > > IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 > doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the > associated builtin, which would solve your issue as well. I agree. Then I will wait until your patch is merged to the trunk, otherwise my patch could not pass the test. > > >> For your function: >> >> #include <x86intrin.h> >> __m128d f(__m128d x, __m128d y){ >> return _mm_frcz_sd(x,y); >> } >> >> Note that the second parameter is ignored intentionally, but the >> prototype of this function contains two parameters. My fix is >> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have >> three operands instead of two, to let it have the correct information >> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to >> match the type of the second parameter in the builtin function in >> ix86_expand_multi_arg_builtin(). > > > I disagree that this is intentional, it is a bug. AFAIK there is no AMD > documentation that could be used as a reference for what _mm_frcz_sd is > supposed to do. The only existing documentations are by Microsoft (which > does *not* ignore the second argument) and by LLVM (which has a single > argument). Whatever we chose for _mm_frcz_sd, the builtin should take a > single argument, and if necessary we'll use 2 builtins to implement > _mm_frcz_sd. > I also only found the one by Microsoft.. If the second argument is ignored, we could just remove it, as long as there is no "standard" that requires two arguments. Hopefully it won't break current projects using _mm_frcz_sd. Thank you for your comments! Cong > -- > Marc Glisse
On Fri, 22 Nov 2013, Cong Hou wrote: >> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 >> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the >> associated builtin, which would solve your issue as well. > > I agree. Then I will wait until your patch is merged to the trunk, > otherwise my patch could not pass the test. It seems that I was wrong in the PR (I can't remember where I read that misleading info at the time), but Uros seems to be handling it just fine :-)
Any comment on this patch? thanks, Cong On Fri, Nov 22, 2013 at 11:40 AM, Cong Hou <congh@google.com> wrote: > On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >> On Thu, 21 Nov 2013, Cong Hou wrote: >> >>> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>> >>>> On Thu, 21 Nov 2013, Cong Hou wrote: >>>> >>>>> While I added the new define_insn_and_split for vec_merge, a bug is >>>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >>>>> only takes one input, but the corresponding builtin functions have two >>>>> inputs, which are shown in i386.c: >>>>> >>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >>>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >>>>> (int)MULTI_ARG_2_SF }, >>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >>>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >>>>> (int)MULTI_ARG_2_DF }, >>>>> >>>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to >>>>> check two args but based on the define_expand of xop_vmfrcz<mode>2, >>>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >>>>> incorrect (because it only needs one input). >>>>> >>>>> The patch below fixed this issue. >>>>> >>>>> Bootstrapped and tested on ax x86-64 machine. Note that this patch >>>>> should be applied before the one I sent earlier (sorry for sending >>>>> them in wrong order). >>>> >>>> >>>> >>>> This is PR 56788. Your patch seems strange to me and I don't think it >>>> fixes the real issue, but I'll let more knowledgeable people answer. >>> >>> >>> >>> Thank you for pointing out the bug report. This patch is not intended >>> to fix PR56788. >> >> >> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 >> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the >> associated builtin, which would solve your issue as well. > > > I agree. Then I will wait until your patch is merged to the trunk, > otherwise my patch could not pass the test. > > >> >> >>> For your function: >>> >>> #include <x86intrin.h> >>> __m128d f(__m128d x, __m128d y){ >>> return _mm_frcz_sd(x,y); >>> } >>> >>> Note that the second parameter is ignored intentionally, but the >>> prototype of this function contains two parameters. My fix is >>> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have >>> three operands instead of two, to let it have the correct information >>> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to >>> match the type of the second parameter in the builtin function in >>> ix86_expand_multi_arg_builtin(). >> >> >> I disagree that this is intentional, it is a bug. AFAIK there is no AMD >> documentation that could be used as a reference for what _mm_frcz_sd is >> supposed to do. The only existing documentations are by Microsoft (which >> does *not* ignore the second argument) and by LLVM (which has a single >> argument). Whatever we chose for _mm_frcz_sd, the builtin should take a >> single argument, and if necessary we'll use 2 builtins to implement >> _mm_frcz_sd. >> > > > I also only found the one by Microsoft.. If the second argument is > ignored, we could just remove it, as long as there is no "standard" > that requires two arguments. Hopefully it won't break current projects > using _mm_frcz_sd. > > Thank you for your comments! > > > Cong > > >> -- >> Marc Glisse
Ping? thanks, Cong On Mon, Dec 2, 2013 at 5:02 PM, Cong Hou <congh@google.com> wrote: > Any comment on this patch? > > > thanks, > Cong > > > On Fri, Nov 22, 2013 at 11:40 AM, Cong Hou <congh@google.com> wrote: >> On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>> On Thu, 21 Nov 2013, Cong Hou wrote: >>> >>>> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>>> >>>>> On Thu, 21 Nov 2013, Cong Hou wrote: >>>>> >>>>>> While I added the new define_insn_and_split for vec_merge, a bug is >>>>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >>>>>> only takes one input, but the corresponding builtin functions have two >>>>>> inputs, which are shown in i386.c: >>>>>> >>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >>>>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >>>>>> (int)MULTI_ARG_2_SF }, >>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >>>>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >>>>>> (int)MULTI_ARG_2_DF }, >>>>>> >>>>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to >>>>>> check two args but based on the define_expand of xop_vmfrcz<mode>2, >>>>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >>>>>> incorrect (because it only needs one input). >>>>>> >>>>>> The patch below fixed this issue. >>>>>> >>>>>> Bootstrapped and tested on ax x86-64 machine. Note that this patch >>>>>> should be applied before the one I sent earlier (sorry for sending >>>>>> them in wrong order). >>>>> >>>>> >>>>> >>>>> This is PR 56788. Your patch seems strange to me and I don't think it >>>>> fixes the real issue, but I'll let more knowledgeable people answer. >>>> >>>> >>>> >>>> Thank you for pointing out the bug report. This patch is not intended >>>> to fix PR56788. >>> >>> >>> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 >>> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the >>> associated builtin, which would solve your issue as well. >> >> >> I agree. Then I will wait until your patch is merged to the trunk, >> otherwise my patch could not pass the test. >> >> >>> >>> >>>> For your function: >>>> >>>> #include <x86intrin.h> >>>> __m128d f(__m128d x, __m128d y){ >>>> return _mm_frcz_sd(x,y); >>>> } >>>> >>>> Note that the second parameter is ignored intentionally, but the >>>> prototype of this function contains two parameters. My fix is >>>> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have >>>> three operands instead of two, to let it have the correct information >>>> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to >>>> match the type of the second parameter in the builtin function in >>>> ix86_expand_multi_arg_builtin(). >>> >>> >>> I disagree that this is intentional, it is a bug. AFAIK there is no AMD >>> documentation that could be used as a reference for what _mm_frcz_sd is >>> supposed to do. The only existing documentations are by Microsoft (which >>> does *not* ignore the second argument) and by LLVM (which has a single >>> argument). Whatever we chose for _mm_frcz_sd, the builtin should take a >>> single argument, and if necessary we'll use 2 builtins to implement >>> _mm_frcz_sd. >>> >> >> >> I also only found the one by Microsoft.. If the second argument is >> ignored, we could just remove it, as long as there is no "standard" >> that requires two arguments. Hopefully it won't break current projects >> using _mm_frcz_sd. >> >> Thank you for your comments! >> >> >> Cong >> >> >>> -- >>> Marc Glisse
Cong, can you ping this patch again? There does not seem to be pending comments left. David On Tue, Dec 17, 2013 at 10:05 AM, Cong Hou <congh@google.com> wrote: > Ping? > > > thanks, > Cong > > > On Mon, Dec 2, 2013 at 5:02 PM, Cong Hou <congh@google.com> wrote: >> Any comment on this patch? >> >> >> thanks, >> Cong >> >> >> On Fri, Nov 22, 2013 at 11:40 AM, Cong Hou <congh@google.com> wrote: >>> On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>> On Thu, 21 Nov 2013, Cong Hou wrote: >>>> >>>>> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>>>> >>>>>> On Thu, 21 Nov 2013, Cong Hou wrote: >>>>>> >>>>>>> While I added the new define_insn_and_split for vec_merge, a bug is >>>>>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >>>>>>> only takes one input, but the corresponding builtin functions have two >>>>>>> inputs, which are shown in i386.c: >>>>>>> >>>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >>>>>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >>>>>>> (int)MULTI_ARG_2_SF }, >>>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >>>>>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >>>>>>> (int)MULTI_ARG_2_DF }, >>>>>>> >>>>>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to >>>>>>> check two args but based on the define_expand of xop_vmfrcz<mode>2, >>>>>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >>>>>>> incorrect (because it only needs one input). >>>>>>> >>>>>>> The patch below fixed this issue. >>>>>>> >>>>>>> Bootstrapped and tested on ax x86-64 machine. Note that this patch >>>>>>> should be applied before the one I sent earlier (sorry for sending >>>>>>> them in wrong order). >>>>>> >>>>>> >>>>>> >>>>>> This is PR 56788. Your patch seems strange to me and I don't think it >>>>>> fixes the real issue, but I'll let more knowledgeable people answer. >>>>> >>>>> >>>>> >>>>> Thank you for pointing out the bug report. This patch is not intended >>>>> to fix PR56788. >>>> >>>> >>>> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 >>>> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the >>>> associated builtin, which would solve your issue as well. >>> >>> >>> I agree. Then I will wait until your patch is merged to the trunk, >>> otherwise my patch could not pass the test. >>> >>> >>>> >>>> >>>>> For your function: >>>>> >>>>> #include <x86intrin.h> >>>>> __m128d f(__m128d x, __m128d y){ >>>>> return _mm_frcz_sd(x,y); >>>>> } >>>>> >>>>> Note that the second parameter is ignored intentionally, but the >>>>> prototype of this function contains two parameters. My fix is >>>>> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have >>>>> three operands instead of two, to let it have the correct information >>>>> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to >>>>> match the type of the second parameter in the builtin function in >>>>> ix86_expand_multi_arg_builtin(). >>>> >>>> >>>> I disagree that this is intentional, it is a bug. AFAIK there is no AMD >>>> documentation that could be used as a reference for what _mm_frcz_sd is >>>> supposed to do. The only existing documentations are by Microsoft (which >>>> does *not* ignore the second argument) and by LLVM (which has a single >>>> argument). Whatever we chose for _mm_frcz_sd, the builtin should take a >>>> single argument, and if necessary we'll use 2 builtins to implement >>>> _mm_frcz_sd. >>>> >>> >>> >>> I also only found the one by Microsoft.. If the second argument is >>> ignored, we could just remove it, as long as there is no "standard" >>> that requires two arguments. Hopefully it won't break current projects >>> using _mm_frcz_sd. >>> >>> Thank you for your comments! >>> >>> >>> Cong >>> >>> >>>> -- >>>> Marc Glisse
Ping? thanks, Cong On Tue, Jul 8, 2014 at 8:23 PM, Xinliang David Li <davidxl@google.com> wrote: > Cong, can you ping this patch again? There does not seem to be > pending comments left. > > David > > On Tue, Dec 17, 2013 at 10:05 AM, Cong Hou <congh@google.com> wrote: >> Ping? >> >> >> thanks, >> Cong >> >> >> On Mon, Dec 2, 2013 at 5:02 PM, Cong Hou <congh@google.com> wrote: >>> Any comment on this patch? >>> >>> >>> thanks, >>> Cong >>> >>> >>> On Fri, Nov 22, 2013 at 11:40 AM, Cong Hou <congh@google.com> wrote: >>>> On Fri, Nov 22, 2013 at 3:57 AM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>>> On Thu, 21 Nov 2013, Cong Hou wrote: >>>>> >>>>>> On Thu, Nov 21, 2013 at 4:39 PM, Marc Glisse <marc.glisse@inria.fr> wrote: >>>>>>> >>>>>>> On Thu, 21 Nov 2013, Cong Hou wrote: >>>>>>> >>>>>>>> While I added the new define_insn_and_split for vec_merge, a bug is >>>>>>>> exposed: in config/i386/sse.md, [ define_expand "xop_vmfrcz<mode>2" ] >>>>>>>> only takes one input, but the corresponding builtin functions have two >>>>>>>> inputs, which are shown in i386.c: >>>>>>>> >>>>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, >>>>>>>> "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, >>>>>>>> (int)MULTI_ARG_2_SF }, >>>>>>>> { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, >>>>>>>> "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, >>>>>>>> (int)MULTI_ARG_2_DF }, >>>>>>>> >>>>>>>> In consequence, the ix86_expand_multi_arg_builtin() function tries to >>>>>>>> check two args but based on the define_expand of xop_vmfrcz<mode>2, >>>>>>>> the content of insn_data[CODE_FOR_xop_vmfrczv4sf2].operand[2] may be >>>>>>>> incorrect (because it only needs one input). >>>>>>>> >>>>>>>> The patch below fixed this issue. >>>>>>>> >>>>>>>> Bootstrapped and tested on ax x86-64 machine. Note that this patch >>>>>>>> should be applied before the one I sent earlier (sorry for sending >>>>>>>> them in wrong order). >>>>>>> >>>>>>> >>>>>>> >>>>>>> This is PR 56788. Your patch seems strange to me and I don't think it >>>>>>> fixes the real issue, but I'll let more knowledgeable people answer. >>>>>> >>>>>> >>>>>> >>>>>> Thank you for pointing out the bug report. This patch is not intended >>>>>> to fix PR56788. >>>>> >>>>> >>>>> IMHO, if PR56788 was fixed, you wouldn't have this issue, and if PR56788 >>>>> doesn't get fixed, I'll post a patch to remove _mm_frcz_sd and the >>>>> associated builtin, which would solve your issue as well. >>>> >>>> >>>> I agree. Then I will wait until your patch is merged to the trunk, >>>> otherwise my patch could not pass the test. >>>> >>>> >>>>> >>>>> >>>>>> For your function: >>>>>> >>>>>> #include <x86intrin.h> >>>>>> __m128d f(__m128d x, __m128d y){ >>>>>> return _mm_frcz_sd(x,y); >>>>>> } >>>>>> >>>>>> Note that the second parameter is ignored intentionally, but the >>>>>> prototype of this function contains two parameters. My fix is >>>>>> explicitly telling GCC that the optab xop_vmfrczv4sf3 should have >>>>>> three operands instead of two, to let it have the correct information >>>>>> in insn_data[CODE_FOR_xop_vmfrczv4sf3].operand[2] which is used to >>>>>> match the type of the second parameter in the builtin function in >>>>>> ix86_expand_multi_arg_builtin(). >>>>> >>>>> >>>>> I disagree that this is intentional, it is a bug. AFAIK there is no AMD >>>>> documentation that could be used as a reference for what _mm_frcz_sd is >>>>> supposed to do. The only existing documentations are by Microsoft (which >>>>> does *not* ignore the second argument) and by LLVM (which has a single >>>>> argument). Whatever we chose for _mm_frcz_sd, the builtin should take a >>>>> single argument, and if necessary we'll use 2 builtins to implement >>>>> _mm_frcz_sd. >>>>> >>>> >>>> >>>> I also only found the one by Microsoft.. If the second argument is >>>> ignored, we could just remove it, as long as there is no "standard" >>>> that requires two arguments. Hopefully it won't break current projects >>>> using _mm_frcz_sd. >>>> >>>> Thank you for your comments! >>>> >>>> >>>> Cong >>>> >>>> >>>>> -- >>>>> Marc Glisse
diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 2c0554b..42903c2 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,10 @@ +2013-11-21 Cong Hou <congh@google.com> + + * config/i386/sse.md (xop_vmfrcz<mode>3): Add the second input, and + rename the optab accordingly. + * config/i386/i386.c (CODE_FOR_xop_vmfrczv4sf3, + CODE_FOR_xop_vmfrczv2df3): Rename. + 2013-11-12 Jeff Law <law@redhat.com> * tree-ssa-threadedge.c (thread_around_empty_blocks): New diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index 5287b49..ea004dd 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -29079,8 +29079,8 @@ static const struct builtin_description bdesc_multi_arg[] = { OPTION_MASK_ISA_XOP, CODE_FOR_xop_shlv8hi3, "__builtin_ia32_vpshlw", IX86_BUILTIN_VPSHLW, UNKNOWN, (int)MULTI_ARG_2_HI }, { OPTION_MASK_ISA_XOP, CODE_FOR_xop_shlv16qi3, "__builtin_ia32_vpshlb", IX86_BUILTIN_VPSHLB, UNKNOWN, (int)MULTI_ARG_2_QI }, - { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf2, "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, (int)MULTI_ARG_2_SF }, - { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df2, "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, (int)MULTI_ARG_2_DF }, + { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv4sf3, "__builtin_ia32_vfrczss", IX86_BUILTIN_VFRCZSS, UNKNOWN, (int)MULTI_ARG_2_SF }, + { OPTION_MASK_ISA_XOP, CODE_FOR_xop_vmfrczv2df3, "__builtin_ia32_vfrczsd", IX86_BUILTIN_VFRCZSD, UNKNOWN, (int)MULTI_ARG_2_DF }, { OPTION_MASK_ISA_XOP, CODE_FOR_xop_frczv4sf2, "__builtin_ia32_vfrczps", IX86_BUILTIN_VFRCZPS, UNKNOWN, (int)MULTI_ARG_1_SF }, { OPTION_MASK_ISA_XOP, CODE_FOR_xop_frczv2df2, "__builtin_ia32_vfrczpd", IX86_BUILTIN_VFRCZPD, UNKNOWN, (int)MULTI_ARG_1_DF }, { OPTION_MASK_ISA_XOP, CODE_FOR_xop_frczv8sf2, "__builtin_ia32_vfrczps256", IX86_BUILTIN_VFRCZPS256, UNKNOWN, (int)MULTI_ARG_1_SF2 }, diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 7bb2d77..189e18a 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -12371,17 +12371,17 @@ (set_attr "mode" "<MODE>")]) ;; scalar insns -(define_expand "xop_vmfrcz<mode>2" +(define_expand "xop_vmfrcz<mode>3" [(set (match_operand:VF_128 0 "register_operand") (vec_merge:VF_128 (unspec:VF_128 [(match_operand:VF_128 1 "nonimmediate_operand")] UNSPEC_FRCZ) - (match_dup 3) + (match_operand:VF_128 2 "register_operand") (const_int 1)))] "TARGET_XOP" { - operands[3] = CONST0_RTX (<MODE>mode); + operands[2] = CONST0_RTX (<MODE>mode); }) (define_insn "*xop_vmfrcz_<mode>"