Message ID | 20180107225904.11535-3-hjl.tools@gmail.com |
---|---|
State | New |
Headers | show |
Series | x86: CVE-2017-5715, aka Spectre | expand |
* H. J. Lu: > Add -mindirect-branch-loop= option to control loop filler in call and > return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" > as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > as loop filler. The default is 'lfence'. Why is the loop needed? Doesn't ud2 or cpuid stop speculative execution?
On Mon, Jan 8, 2018 at 12:20 AM, Florian Weimer <fw@deneb.enyo.de> wrote: > * H. J. Lu: > >> Add -mindirect-branch-loop= option to control loop filler in call and >> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >> as loop filler. The default is 'lfence'. > > Why is the loop needed? Doesn't ud2 or cpuid stop speculative > execution? My understanding is that a loop works better.
* H. J. Lu: > On Mon, Jan 8, 2018 at 12:20 AM, Florian Weimer <fw@deneb.enyo.de> wrote: >> * H. J. Lu: >> >>> Add -mindirect-branch-loop= option to control loop filler in call and >>> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >>> as loop filler. The default is 'lfence'. >> >> Why is the loop needed? Doesn't ud2 or cpuid stop speculative >> execution? > > My understanding is that a loop works better. Better how? What about far jumps? I think they prevent some forms of prefetch on i386, so perhaps long mode is similar in that regard?
On Mon, Jan 8, 2018 at 10:32 AM, Florian Weimer <fw@deneb.enyo.de> wrote: > * H. J. Lu: > >> On Mon, Jan 8, 2018 at 12:20 AM, Florian Weimer <fw@deneb.enyo.de> wrote: >>> * H. J. Lu: >>> >>>> Add -mindirect-branch-loop= option to control loop filler in call and >>>> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >>>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >>>> as loop filler. The default is 'lfence'. >>> >>> Why is the loop needed? Doesn't ud2 or cpuid stop speculative >>> execution? >> >> My understanding is that a loop works better. > > Better how? > > What about far jumps? I think they prevent some forms of prefetch on > i386, so perhaps long mode is similar in that regard? These are more expensive and we can't guarantee that they are effective, hence the short loop .
On Mon, 2018-01-08 at 09:20 +0100, Florian Weimer wrote: > * H. J. Lu: > > > Add -mindirect-branch-loop= option to control loop filler in call and > > return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" > > as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > > as loop filler. The default is 'lfence'. > > Why is the loop needed? Doesn't ud2 or cpuid stop speculative > execution? The idea is not to stop it per se, but to capture it. We trick the speculative execution into *thinking* it's going to return back to that endless loop, which prevents it from doing the branch prediction which would otherwise have got into trouble. There has been a fair amount of bikeshedding of precisely what goes in there already, and '1: pause; jmp 1b' is the best option that hasn't been shot down in flames by the CPU architects. HJ, do we still actually need the options for lfence and nop? I thought those were originally just for testing and could possibly be dropped now? Not that I care for Linux since I'm providing my own external thunk anyway...
On Mon, Jan 8, 2018 at 1:00 PM, David Woodhouse <dwmw2@infradead.org> wrote: > On Mon, 2018-01-08 at 09:20 +0100, Florian Weimer wrote: >> * H. J. Lu: >> >> > Add -mindirect-branch-loop= option to control loop filler in call and >> > return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >> > as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >> > as loop filler. The default is 'lfence'. >> >> Why is the loop needed? Doesn't ud2 or cpuid stop speculative >> execution? > > The idea is not to stop it per se, but to capture it. We trick the > speculative execution into *thinking* it's going to return back to that > endless loop, which prevents it from doing the branch prediction which > would otherwise have got into trouble. > > There has been a fair amount of bikeshedding of precisely what goes in > there already, and '1: pause; jmp 1b' is the best option that hasn't > been shot down in flames by the CPU architects. > > HJ, do we still actually need the options for lfence and nop? I thought > those were originally just for testing and could possibly be dropped > now? This is a trial change. It may be useful later. But I can drop it and hardcode it to "pause".
On 01/07/2018 03:59 PM, H.J. Lu wrote: > Add -mindirect-branch-loop= option to control loop filler in call and > return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" > as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > as loop filler. The default is 'lfence'. > > gcc/ > > * config/i386/i386-opts.h (indirect_branch_loop): New. > * config/i386/i386.c (output_indirect_thunk): Support > -mindirect-branch-loop=. > * config/i386/i386.opt (mindirect-branch-loop=): New option. > (indirect_branch_loop): New. > (lfence): Likewise. > (pause): Likewise. > (nop): Likewise. > * doc/invoke.texi: Document -mindirect-branch-loop= option. > > gcc/testsuite/ > > * gcc.target/i386/indirect-thunk-loop-1.c: New test. > * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. > * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. > * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. > * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. I think we should drop the ability to change the filler until such time as we really need it. Just pick one and go with it. I think David suggested that they wanted "pause". I'm obviously fine with that. jeff
On Thu, Jan 11, 2018 at 1:42 PM, Jeff Law <law@redhat.com> wrote: > On 01/07/2018 03:59 PM, H.J. Lu wrote: >> Add -mindirect-branch-loop= option to control loop filler in call and >> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >> as loop filler. The default is 'lfence'. >> >> gcc/ >> >> * config/i386/i386-opts.h (indirect_branch_loop): New. >> * config/i386/i386.c (output_indirect_thunk): Support >> -mindirect-branch-loop=. >> * config/i386/i386.opt (mindirect-branch-loop=): New option. >> (indirect_branch_loop): New. >> (lfence): Likewise. >> (pause): Likewise. >> (nop): Likewise. >> * doc/invoke.texi: Document -mindirect-branch-loop= option. >> >> gcc/testsuite/ >> >> * gcc.target/i386/indirect-thunk-loop-1.c: New test. >> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. >> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. >> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. >> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. > I think we should drop the ability to change the filler until such time > as we really need it. Just pick one and go with it. I think David > suggested that they wanted "pause". I'm obviously fine with that. I will hard code it to "pause".
Hi, On Thu, Jan 11 2018, Jeff Law wrote: > On 01/07/2018 03:59 PM, H.J. Lu wrote: >> Add -mindirect-branch-loop= option to control loop filler in call and >> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >> as loop filler. The default is 'lfence'. >> >> gcc/ >> >> * config/i386/i386-opts.h (indirect_branch_loop): New. >> * config/i386/i386.c (output_indirect_thunk): Support >> -mindirect-branch-loop=. >> * config/i386/i386.opt (mindirect-branch-loop=): New option. >> (indirect_branch_loop): New. >> (lfence): Likewise. >> (pause): Likewise. >> (nop): Likewise. >> * doc/invoke.texi: Document -mindirect-branch-loop= option. >> >> gcc/testsuite/ >> >> * gcc.target/i386/indirect-thunk-loop-1.c: New test. >> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. >> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. >> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. >> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. > I think we should drop the ability to change the filler until such time > as we really need it. Just pick one and go with it. I think David > suggested that they wanted "pause". I'm obviously fine with that. > unless I am mistaken (which is frankly quite possible, I am still not quite up to speed about the nuances), AMD strongly prefers the lfence variant. OTOH, IIUC, in kernel this will be run-time patched but so it does not matter in the most pressing case and we might want to have a mechanism doing something similar for protecting userspace later on. But perhaps it is enough to keep the option? Muthu and/or Venkat, can you please comment? Thank you, Martin
On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> wrote: > Hi, > > On Thu, Jan 11 2018, Jeff Law wrote: >> On 01/07/2018 03:59 PM, H.J. Lu wrote: >>> Add -mindirect-branch-loop= option to control loop filler in call and >>> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >>> as loop filler. The default is 'lfence'. >>> >>> gcc/ >>> >>> * config/i386/i386-opts.h (indirect_branch_loop): New. >>> * config/i386/i386.c (output_indirect_thunk): Support >>> -mindirect-branch-loop=. >>> * config/i386/i386.opt (mindirect-branch-loop=): New option. >>> (indirect_branch_loop): New. >>> (lfence): Likewise. >>> (pause): Likewise. >>> (nop): Likewise. >>> * doc/invoke.texi: Document -mindirect-branch-loop= option. >>> >>> gcc/testsuite/ >>> >>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. >>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. >>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. >>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. >>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. >> I think we should drop the ability to change the filler until such time >> as we really need it. Just pick one and go with it. I think David >> suggested that they wanted "pause". I'm obviously fine with that. >> > > unless I am mistaken (which is frankly quite possible, I am still not > quite up to speed about the nuances), AMD strongly prefers the lfence > variant. OTOH, IIUC, in kernel this will be run-time patched but so it > does not matter in the most pressing case and we might want to have a > mechanism doing something similar for protecting userspace later on. > But perhaps it is enough to keep the option? > > Muthu and/or Venkat, can you please comment? If we do want it, I will submit a separate patch AFTER the current patch set has been approved and checked into GCC 8.
Hi all, > -----Original Message----- > From: H.J. Lu [mailto:hjl.tools@gmail.com] > Sent: Friday, January 12, 2018 7:36 PM > To: Martin Jambor <mjambor@suse.cz> > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; GCC > Patches <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com> > Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> wrote: > > Hi, > > > > On Thu, Jan 11 2018, Jeff Law wrote: > >> On 01/07/2018 03:59 PM, H.J. Lu wrote: > >>> Add -mindirect-branch-loop= option to control loop filler in call > >>> and return thunks generated by -mindirect-branch=. 'lfence' uses > "lfence" > >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > >>> as loop filler. The default is 'lfence'. > >>> > >>> gcc/ > >>> > >>> * config/i386/i386-opts.h (indirect_branch_loop): New. > >>> * config/i386/i386.c (output_indirect_thunk): Support > >>> -mindirect-branch-loop=. > >>> * config/i386/i386.opt (mindirect-branch-loop=): New option. > >>> (indirect_branch_loop): New. > >>> (lfence): Likewise. > >>> (pause): Likewise. > >>> (nop): Likewise. > >>> * doc/invoke.texi: Document -mindirect-branch-loop= option. > >>> > >>> gcc/testsuite/ > >>> > >>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. > >>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. > >>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. > >>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. > >>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. > >> I think we should drop the ability to change the filler until such > >> time as we really need it. Just pick one and go with it. I think > >> David suggested that they wanted "pause". I'm obviously fine with that. > >> > > > > unless I am mistaken (which is frankly quite possible, I am still not > > quite up to speed about the nuances), AMD strongly prefers the lfence > > variant. OTOH, IIUC, in kernel this will be run-time patched but so > > it does not matter in the most pressing case and we might want to have > > a mechanism doing something similar for protecting userspace later on. > > But perhaps it is enough to keep the option? > > > > Muthu and/or Venkat, can you please comment? > > If we do want it, I will submit a separate patch AFTER the current patch set > has been approved and checked into GCC 8. > As per AMD architects, using “lfence” in “retpoline” is better than “pause” for our targets. So please allow filler to use "lfence". Regards. Venkat. > -- > H.J.
Hi all, > -----Original Message----- > From: Kumar, Venkataramanan > Sent: Friday, January 12, 2018 8:16 PM > To: 'H.J. Lu' <hjl.tools@gmail.com>; Martin Jambor <mjambor@suse.cz> > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > GCC Patches <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com>; Uros > Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' > <jh@suse.de> > Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > Hi all, > > > -----Original Message----- > > From: H.J. Lu [mailto:hjl.tools@gmail.com] > > Sent: Friday, January 12, 2018 7:36 PM > > To: Martin Jambor <mjambor@suse.cz> > > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > > Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; GCC > Patches > > <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com> > > Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > > > On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> wrote: > > > Hi, > > > > > > On Thu, Jan 11 2018, Jeff Law wrote: > > >> On 01/07/2018 03:59 PM, H.J. Lu wrote: > > >>> Add -mindirect-branch-loop= option to control loop filler in call > > >>> and return thunks generated by -mindirect-branch=. 'lfence' uses > > "lfence" > > >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > > >>> as loop filler. The default is 'lfence'. > > >>> > > >>> gcc/ > > >>> > > >>> * config/i386/i386-opts.h (indirect_branch_loop): New. > > >>> * config/i386/i386.c (output_indirect_thunk): Support > > >>> -mindirect-branch-loop=. > > >>> * config/i386/i386.opt (mindirect-branch-loop=): New option. > > >>> (indirect_branch_loop): New. > > >>> (lfence): Likewise. > > >>> (pause): Likewise. > > >>> (nop): Likewise. > > >>> * doc/invoke.texi: Document -mindirect-branch-loop= option. > > >>> > > >>> gcc/testsuite/ > > >>> > > >>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. > > >>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. > > >>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. > > >>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. > > >>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. > > >> I think we should drop the ability to change the filler until such > > >> time as we really need it. Just pick one and go with it. I think > > >> David suggested that they wanted "pause". I'm obviously fine with that. > > >> > > > > > > unless I am mistaken (which is frankly quite possible, I am still > > > not quite up to speed about the nuances), AMD strongly prefers the > > > lfence variant. OTOH, IIUC, in kernel this will be run-time patched > > > but so it does not matter in the most pressing case and we might > > > want to have a mechanism doing something similar for protecting > userspace later on. > > > But perhaps it is enough to keep the option? > > > > > > Muthu and/or Venkat, can you please comment? > > > > If we do want it, I will submit a separate patch AFTER the current > > patch set has been approved and checked into GCC 8. > > > > As per AMD architects, using “lfence” in “retpoline” is better than “pause” for > our targets. > So please allow filler to use "lfence". > We also leant that "lfence" is a dispatch serializing instruction. The Pause instruction is not serializing on AMD processors and has high latencies. Regards. Venkat. > > -- > > H.J.
On 01/12/2018 08:09 AM, Kumar, Venkataramanan wrote: > Hi all, > >> -----Original Message----- >> From: Kumar, Venkataramanan >> Sent: Friday, January 12, 2018 8:16 PM >> To: 'H.J. Lu' <hjl.tools@gmail.com>; Martin Jambor <mjambor@suse.cz> >> Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; >> GCC Patches <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com>; Uros >> Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' >> <jh@suse.de> >> Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= >> >> Hi all, >> >>> -----Original Message----- >>> From: H.J. Lu [mailto:hjl.tools@gmail.com] >>> Sent: Friday, January 12, 2018 7:36 PM >>> To: Martin Jambor <mjambor@suse.cz> >>> Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; >>> Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; GCC >> Patches >>> <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com> >>> Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= >>> >>> On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> wrote: >>>> Hi, >>>> >>>> On Thu, Jan 11 2018, Jeff Law wrote: >>>>> On 01/07/2018 03:59 PM, H.J. Lu wrote: >>>>>> Add -mindirect-branch-loop= option to control loop filler in call >>>>>> and return thunks generated by -mindirect-branch=. 'lfence' uses >>> "lfence" >>>>>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >>>>>> as loop filler. The default is 'lfence'. >>>>>> >>>>>> gcc/ >>>>>> >>>>>> * config/i386/i386-opts.h (indirect_branch_loop): New. >>>>>> * config/i386/i386.c (output_indirect_thunk): Support >>>>>> -mindirect-branch-loop=. >>>>>> * config/i386/i386.opt (mindirect-branch-loop=): New option. >>>>>> (indirect_branch_loop): New. >>>>>> (lfence): Likewise. >>>>>> (pause): Likewise. >>>>>> (nop): Likewise. >>>>>> * doc/invoke.texi: Document -mindirect-branch-loop= option. >>>>>> >>>>>> gcc/testsuite/ >>>>>> >>>>>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. >>>>>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. >>>>>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. >>>>>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. >>>>>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. >>>>> I think we should drop the ability to change the filler until such >>>>> time as we really need it. Just pick one and go with it. I think >>>>> David suggested that they wanted "pause". I'm obviously fine with that. >>>>> >>>> >>>> unless I am mistaken (which is frankly quite possible, I am still >>>> not quite up to speed about the nuances), AMD strongly prefers the >>>> lfence variant. OTOH, IIUC, in kernel this will be run-time patched >>>> but so it does not matter in the most pressing case and we might >>>> want to have a mechanism doing something similar for protecting >> userspace later on. >>>> But perhaps it is enough to keep the option? >>>> >>>> Muthu and/or Venkat, can you please comment? >>> >>> If we do want it, I will submit a separate patch AFTER the current >>> patch set has been approved and checked into GCC 8. >>> >> >> As per AMD architects, using “lfence” in “retpoline” is better than “pause” for >> our targets. >> So please allow filler to use "lfence". >> > We also leant that "lfence" is a dispatch serializing instruction. The Pause instruction is not serializing on AMD processors and has high latencies. So the concern I have if we end up wanting different sequences for different micro-architectures is you either up having to build multiple kernels, libraries and executables or you end up doing dynamic patching (ifuncs which are often used for this kind of dynamic selection don't really help in this particular scenario). It's unlikely we'll see distributors building multiple kernels, libraries and executables within a processor family. The kernel does have dynamic patching, but we don't have that for anything in user space (but I think folks will be looking at that in the relatively near future). So it's difficult to see a scenario where having the compiler make different codegen choices for these sequences based on the microarchitecture is all that useful. ISTM we want to pick a sequence that is reasonably OK, then look to dynamic patching to micro-optimize if the benefit is significant. Ideally going forward these issues will be largely addressed at the hardware level with perhaps some cooperation with kernel and repolines become a mitigation technique for legacy hardware. Thoughts Venkat? Jeff
On 01/12/2018 07:05 AM, H.J. Lu wrote: > On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> wrote: >> Hi, >> >> On Thu, Jan 11 2018, Jeff Law wrote: >>> On 01/07/2018 03:59 PM, H.J. Lu wrote: >>>> Add -mindirect-branch-loop= option to control loop filler in call and >>>> return thunks generated by -mindirect-branch=. 'lfence' uses "lfence" >>>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >>>> as loop filler. The default is 'lfence'. >>>> >>>> gcc/ >>>> >>>> * config/i386/i386-opts.h (indirect_branch_loop): New. >>>> * config/i386/i386.c (output_indirect_thunk): Support >>>> -mindirect-branch-loop=. >>>> * config/i386/i386.opt (mindirect-branch-loop=): New option. >>>> (indirect_branch_loop): New. >>>> (lfence): Likewise. >>>> (pause): Likewise. >>>> (nop): Likewise. >>>> * doc/invoke.texi: Document -mindirect-branch-loop= option. >>>> >>>> gcc/testsuite/ >>>> >>>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. >>>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. >>>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. >>>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. >>>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. >>> I think we should drop the ability to change the filler until such time >>> as we really need it. Just pick one and go with it. I think David >>> suggested that they wanted "pause". I'm obviously fine with that. >>> >> >> unless I am mistaken (which is frankly quite possible, I am still not >> quite up to speed about the nuances), AMD strongly prefers the lfence >> variant. OTOH, IIUC, in kernel this will be run-time patched but so it >> does not matter in the most pressing case and we might want to have a >> mechanism doing something similar for protecting userspace later on. >> But perhaps it is enough to keep the option? >> >> Muthu and/or Venkat, can you please comment? > > If we do want it, I will submit a separate patch AFTER the current patch > set has been approved and checked into GCC 8. That would be my preference, if we need it at all. jeff
Hi HJ, > -----Original Message----- > From: Kumar, Venkataramanan > Sent: Friday, January 12, 2018 8:39 PM > To: 'H.J. Lu' <hjl.tools@gmail.com>; 'Martin Jambor' <mjambor@suse.cz> > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > 'GCC Patches' <gcc-patches@gcc.gnu.org>; 'Jeff Law' <law@redhat.com>; > Uros Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' > <jh@suse.de> > Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > Hi all, > > > -----Original Message----- > > From: Kumar, Venkataramanan > > Sent: Friday, January 12, 2018 8:16 PM > > To: 'H.J. Lu' <hjl.tools@gmail.com>; Martin Jambor <mjambor@suse.cz> > > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > > GCC Patches <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com>; > Uros > > Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' > > <jh@suse.de> > > Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > > > Hi all, > > > > > -----Original Message----- > > > From: H.J. Lu [mailto:hjl.tools@gmail.com] > > > Sent: Friday, January 12, 2018 7:36 PM > > > To: Martin Jambor <mjambor@suse.cz> > > > Cc: Nagarajan, Muthu kumar raj > <Muthukumarraj.Nagarajan@amd.com>; > > > Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; GCC > > Patches > > > <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com> > > > Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > > > > > On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> > wrote: > > > > Hi, > > > > > > > > On Thu, Jan 11 2018, Jeff Law wrote: > > > >> On 01/07/2018 03:59 PM, H.J. Lu wrote: > > > >>> Add -mindirect-branch-loop= option to control loop filler in > > > >>> call and return thunks generated by -mindirect-branch=. > > > >>> 'lfence' uses > > > "lfence" > > > >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > > > >>> as loop filler. The default is 'lfence'. > > > >>> > > > >>> gcc/ > > > >>> > > > >>> * config/i386/i386-opts.h (indirect_branch_loop): New. > > > >>> * config/i386/i386.c (output_indirect_thunk): Support > > > >>> -mindirect-branch-loop=. > > > >>> * config/i386/i386.opt (mindirect-branch-loop=): New option. > > > >>> (indirect_branch_loop): New. > > > >>> (lfence): Likewise. > > > >>> (pause): Likewise. > > > >>> (nop): Likewise. > > > >>> * doc/invoke.texi: Document -mindirect-branch-loop= option. > > > >>> > > > >>> gcc/testsuite/ > > > >>> > > > >>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. > > > >>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. > > > >>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. > > > >>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. > > > >>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. > > > >> I think we should drop the ability to change the filler until > > > >> such time as we really need it. Just pick one and go with it. I > > > >> think David suggested that they wanted "pause". I'm obviously fine > with that. > > > >> > > > > > > > > unless I am mistaken (which is frankly quite possible, I am still > > > > not quite up to speed about the nuances), AMD strongly prefers the > > > > lfence variant. OTOH, IIUC, in kernel this will be run-time > > > > patched but so it does not matter in the most pressing case and we > > > > might want to have a mechanism doing something similar for > > > > protecting > > userspace later on. > > > > But perhaps it is enough to keep the option? > > > > > > > > Muthu and/or Venkat, can you please comment? > > > > > > If we do want it, I will submit a separate patch AFTER the current > > > patch set has been approved and checked into GCC 8. > > > > > > > As per AMD architects, using “lfence” in “retpoline” is better than > > “pause” for our targets. > > So please allow filler to use "lfence". > > > We also leant that "lfence" is a dispatch serializing instruction. The Puse > instruction is not serializing on AMD processors and has high latencies. > Any reason why Intel has chosen "pause" over "lfence" as the default loop filler for Retpoline? > Regards. > Venkat. > > > -- > > > H.J.
On Fri, Jan 12, 2018 at 10:27 AM, Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com> wrote: > Hi HJ, > >> -----Original Message----- >> From: Kumar, Venkataramanan >> Sent: Friday, January 12, 2018 8:39 PM >> To: 'H.J. Lu' <hjl.tools@gmail.com>; 'Martin Jambor' <mjambor@suse.cz> >> Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; >> 'GCC Patches' <gcc-patches@gcc.gnu.org>; 'Jeff Law' <law@redhat.com>; >> Uros Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' >> <jh@suse.de> >> Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= >> >> Hi all, >> >> > -----Original Message----- >> > From: Kumar, Venkataramanan >> > Sent: Friday, January 12, 2018 8:16 PM >> > To: 'H.J. Lu' <hjl.tools@gmail.com>; Martin Jambor <mjambor@suse.cz> >> > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; >> > GCC Patches <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com>; >> Uros >> > Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' >> > <jh@suse.de> >> > Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= >> > >> > Hi all, >> > >> > > -----Original Message----- >> > > From: H.J. Lu [mailto:hjl.tools@gmail.com] >> > > Sent: Friday, January 12, 2018 7:36 PM >> > > To: Martin Jambor <mjambor@suse.cz> >> > > Cc: Nagarajan, Muthu kumar raj >> <Muthukumarraj.Nagarajan@amd.com>; >> > > Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; GCC >> > Patches >> > > <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com> >> > > Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= >> > > >> > > On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> >> wrote: >> > > > Hi, >> > > > >> > > > On Thu, Jan 11 2018, Jeff Law wrote: >> > > >> On 01/07/2018 03:59 PM, H.J. Lu wrote: >> > > >>> Add -mindirect-branch-loop= option to control loop filler in >> > > >>> call and return thunks generated by -mindirect-branch=. >> > > >>> 'lfence' uses >> > > "lfence" >> > > >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" >> > > >>> as loop filler. The default is 'lfence'. >> > > >>> >> > > >>> gcc/ >> > > >>> >> > > >>> * config/i386/i386-opts.h (indirect_branch_loop): New. >> > > >>> * config/i386/i386.c (output_indirect_thunk): Support >> > > >>> -mindirect-branch-loop=. >> > > >>> * config/i386/i386.opt (mindirect-branch-loop=): New option. >> > > >>> (indirect_branch_loop): New. >> > > >>> (lfence): Likewise. >> > > >>> (pause): Likewise. >> > > >>> (nop): Likewise. >> > > >>> * doc/invoke.texi: Document -mindirect-branch-loop= option. >> > > >>> >> > > >>> gcc/testsuite/ >> > > >>> >> > > >>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. >> > > >>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. >> > > >>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. >> > > >>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. >> > > >>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. >> > > >> I think we should drop the ability to change the filler until >> > > >> such time as we really need it. Just pick one and go with it. I >> > > >> think David suggested that they wanted "pause". I'm obviously fine >> with that. >> > > >> >> > > > >> > > > unless I am mistaken (which is frankly quite possible, I am still >> > > > not quite up to speed about the nuances), AMD strongly prefers the >> > > > lfence variant. OTOH, IIUC, in kernel this will be run-time >> > > > patched but so it does not matter in the most pressing case and we >> > > > might want to have a mechanism doing something similar for >> > > > protecting >> > userspace later on. >> > > > But perhaps it is enough to keep the option? >> > > > >> > > > Muthu and/or Venkat, can you please comment? >> > > >> > > If we do want it, I will submit a separate patch AFTER the current >> > > patch set has been approved and checked into GCC 8. >> > > >> > >> > As per AMD architects, using “lfence” in “retpoline” is better than >> > “pause” for our targets. >> > So please allow filler to use "lfence". >> > >> We also leant that "lfence" is a dispatch serializing instruction. The Puse >> instruction is not serializing on AMD processors and has high latencies. >> > Any reason why Intel has chosen "pause" over "lfence" as the default loop filler for Retpoline? > My original patch uses "lfence". I was asked to use "pause": https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00969.html
Hi All, > -----Original Message----- > From: H.J. Lu [mailto:hjl.tools@gmail.com] > Sent: Saturday, January 13, 2018 1:11 AM > To: Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com> > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > GCC Patches <gcc-patches@gcc.gnu.org>; Martin Jambor > <mjambor@suse.cz>; Jeff Law <law@redhat.com>; Uros Bizjak > (ubizjak@gmail.com) <ubizjak@gmail.com>; Jan Hubicka <jh@suse.de>; > Dharmakan, Rohit arul raj <Rohitarulraj.Dharmakan@amd.com> > Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > On Fri, Jan 12, 2018 at 10:27 AM, Kumar, Venkataramanan > <Venkataramanan.Kumar@amd.com> wrote: > > Hi HJ, > > > >> -----Original Message----- > >> From: Kumar, Venkataramanan > >> Sent: Friday, January 12, 2018 8:39 PM > >> To: 'H.J. Lu' <hjl.tools@gmail.com>; 'Martin Jambor' > >> <mjambor@suse.cz> > >> Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > >> 'GCC Patches' <gcc-patches@gcc.gnu.org>; 'Jeff Law' <law@redhat.com>; > >> Uros Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' > >> <jh@suse.de> > >> Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= > >> > >> Hi all, > >> > >> > -----Original Message----- > >> > From: Kumar, Venkataramanan > >> > Sent: Friday, January 12, 2018 8:16 PM > >> > To: 'H.J. Lu' <hjl.tools@gmail.com>; Martin Jambor > >> > <mjambor@suse.cz> > >> > Cc: Nagarajan, Muthu kumar raj > <Muthukumarraj.Nagarajan@amd.com>; > >> > GCC Patches <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com>; > >> Uros > >> > Bizjak (ubizjak@gmail.com) <ubizjak@gmail.com>; 'Jan Hubicka' > >> > <jh@suse.de> > >> > Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= > >> > > >> > Hi all, > >> > > >> > > -----Original Message----- > >> > > From: H.J. Lu [mailto:hjl.tools@gmail.com] > >> > > Sent: Friday, January 12, 2018 7:36 PM > >> > > To: Martin Jambor <mjambor@suse.cz> > >> > > Cc: Nagarajan, Muthu kumar raj > >> <Muthukumarraj.Nagarajan@amd.com>; > >> > > Kumar, Venkataramanan <Venkataramanan.Kumar@amd.com>; GCC > >> > Patches > >> > > <gcc-patches@gcc.gnu.org>; Jeff Law <law@redhat.com> > >> > > Subject: Re: [PATCH 2/5] x86: Add -mindirect-branch-loop= > >> > > > >> > > On Fri, Jan 12, 2018 at 4:38 AM, Martin Jambor <mjambor@suse.cz> > >> wrote: > >> > > > Hi, > >> > > > > >> > > > On Thu, Jan 11 2018, Jeff Law wrote: > >> > > >> On 01/07/2018 03:59 PM, H.J. Lu wrote: > >> > > >>> Add -mindirect-branch-loop= option to control loop filler in > >> > > >>> call and return thunks generated by -mindirect-branch=. > >> > > >>> 'lfence' uses > >> > > "lfence" > >> > > >>> as loop filler. 'pause' uses "pause" as loop filler. 'nop' uses "nop" > >> > > >>> as loop filler. The default is 'lfence'. > >> > > >>> > >> > > >>> gcc/ > >> > > >>> > >> > > >>> * config/i386/i386-opts.h (indirect_branch_loop): New. > >> > > >>> * config/i386/i386.c (output_indirect_thunk): Support > >> > > >>> -mindirect-branch-loop=. > >> > > >>> * config/i386/i386.opt (mindirect-branch-loop=): New option. > >> > > >>> (indirect_branch_loop): New. > >> > > >>> (lfence): Likewise. > >> > > >>> (pause): Likewise. > >> > > >>> (nop): Likewise. > >> > > >>> * doc/invoke.texi: Document -mindirect-branch-loop= option. > >> > > >>> > >> > > >>> gcc/testsuite/ > >> > > >>> > >> > > >>> * gcc.target/i386/indirect-thunk-loop-1.c: New test. > >> > > >>> * gcc.target/i386/indirect-thunk-loop-2.c: Likewise. > >> > > >>> * gcc.target/i386/indirect-thunk-loop-3.c: Likewise. > >> > > >>> * gcc.target/i386/indirect-thunk-loop-4.c: Likewise. > >> > > >>> * gcc.target/i386/indirect-thunk-loop-5.c: Likewise. > >> > > >> I think we should drop the ability to change the filler until > >> > > >> such time as we really need it. Just pick one and go with it. > >> > > >> I think David suggested that they wanted "pause". I'm > >> > > >> obviously fine > >> with that. > >> > > >> > >> > > > > >> > > > unless I am mistaken (which is frankly quite possible, I am > >> > > > still not quite up to speed about the nuances), AMD strongly > >> > > > prefers the lfence variant. OTOH, IIUC, in kernel this will be > >> > > > run-time patched but so it does not matter in the most pressing > >> > > > case and we might want to have a mechanism doing something > >> > > > similar for protecting > >> > userspace later on. > >> > > > But perhaps it is enough to keep the option? > >> > > > > >> > > > Muthu and/or Venkat, can you please comment? > >> > > > >> > > If we do want it, I will submit a separate patch AFTER the > >> > > current patch set has been approved and checked into GCC 8. > >> > > > >> > > >> > As per AMD architects, using “lfence” in “retpoline” is better than > >> > “pause” for our targets. > >> > So please allow filler to use "lfence". > >> > > >> We also leant that "lfence" is a dispatch serializing instruction. > >> The Puse instruction is not serializing on AMD processors and has high > latencies. > >> > > Any reason why Intel has chosen "pause" over "lfence" as the default loop > filler for Retpoline? > > > > My original patch uses "lfence". I was asked to use "pause": > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00969.html If everyone is ok, my suggestion is to use "lfence" as the default loop filler for retpoline. Please confirm. > > -- > H.J.
On Sat, 2018-01-13 at 03:11 +0000, Kumar, Venkataramanan wrote: > > > My original patch uses "lfence". I was asked to use "pause": > > > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00969.html > > If everyone is ok, my suggestion is to use "lfence" as the default > loop filler for retpoline. > > Please confirm. I have had the same request for the kernel patches. I'm happy with it but would like confirmation from Intel and from Paul Turner (whose idea this is, and who has overseen most of the coherent analysis). FWIW I haven't actually *changed* the kernel patch yet, awaiting that confirmation. I understand this is a power optimisation only; preventing the CPU from spinning in that loop when it's mispredicted a return to it.
On Sat, 2018-01-13 at 03:11 +0000, Kumar, Venkataramanan wrote: > > > My original patch uses "lfence". I was asked to use "pause": > > > > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00969.html > > If everyone is ok, my suggestion is to use "lfence" as the default > loop filler for retpoline. > > Please confirm. I have the same response to this as I did to precisely the same request in the kernel implementation: I'm OK with it but I'd like an explicit approval from Intel, and from Paul Turner at Google who architected the retpoline concept in the first place. cf. https://marc.info/?l=linux-kernel&m=151585246530355
On Sat, Jan 13, 2018 at 8:34 AM, David Woodhouse <dwmw2@infradead.org> wrote: > On Sat, 2018-01-13 at 03:11 +0000, Kumar, Venkataramanan wrote: >> >> > My original patch uses "lfence". I was asked to use "pause": >> > >> > https://gcc.gnu.org/ml/gcc-patches/2018-01/msg00969.html >> >> If everyone is ok, my suggestion is to use "lfence" as the default >> loop filler for retpoline. >> >> Please confirm. > > I have the same response to this as I did to precisely the same request > in the kernel implementation: I'm OK with it but I'd like an explicit > approval from Intel, and from Paul Turner at Google who architected the > retpoline concept in the first place. > > cf. https://marc.info/?l=linux-kernel&m=151585246530355 > There is no urgency on this. We can take another look AFTER my current patch set is approved and checked in.
> > If everyone is ok, my suggestion is to use "lfence" as the default > > loop filler for retpoline. can we do BOTH a pause and lfence. (that way on cpu's where pause is the power stop, it works, and on cpus where it's a fallthrough (AMD) it goes to the lfence)
Hi Arjan, > -----Original Message----- > From: Van De Ven, Arjan [mailto:arjan.van.de.ven@intel.com] > Sent: Saturday, January 13, 2018 10:16 PM > To: David Woodhouse <dwmw2@infradead.org>; Kumar, Venkataramanan > <Venkataramanan.Kumar@amd.com>; H.J. Lu <hjl.tools@gmail.com>; Jeff > Law <law@redhat.com>; Paul Turner <pjt@google.com>; Mallick, Asit K > <asit.k.mallick@intel.com> > Cc: Nagarajan, Muthu kumar raj <Muthukumarraj.Nagarajan@amd.com>; > GCC Patches <gcc-patches@gcc.gnu.org>; Martin Jambor > <mjambor@suse.cz>; Uros Bizjak (ubizjak@gmail.com) > <ubizjak@gmail.com>; Jan Hubicka <jh@suse.de>; Dharmakan, Rohit arul raj > <Rohitarulraj.Dharmakan@amd.com> > Subject: RE: [PATCH 2/5] x86: Add -mindirect-branch-loop= > > > > If everyone is ok, my suggestion is to use "lfence" as the default > > > loop filler for retpoline. > > can we do BOTH a pause and lfence. > (that way on cpu's where pause is the power stop, it works, and on cpus > where it's a fallthrough (AMD) it goes to the lfence) > I checked with our Architect. Having just "pause" is the concern. It should also be fine for AMD to use "pause" followed by "lfence" in the loop of retpoline. Regards, Venkat.
diff --git a/gcc/config/i386/i386-opts.h b/gcc/config/i386/i386-opts.h index f14cbeee7a1..52f7cb979cc 100644 --- a/gcc/config/i386/i386-opts.h +++ b/gcc/config/i386/i386-opts.h @@ -114,4 +114,10 @@ enum indirect_branch { indirect_branch_thunk_extern }; +enum indirect_branch_loop { + indirect_branch_loop_lfence, + indirect_branch_loop_pause, + indirect_branch_loop_nop +}; + #endif diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index ac4d1f62f50..f327a6b1b62 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -10753,8 +10753,23 @@ output_indirect_thunk (bool need_bnd_p, int regno) ASM_OUTPUT_INTERNAL_LABEL (asm_out_file, indirectlabel1); - /* lfence . */ - fprintf (asm_out_file, "\tlfence\n"); + switch (ix86_indirect_branch_loop) + { + case indirect_branch_loop_lfence: + /* lfence. */ + fprintf (asm_out_file, "\tlfence\n"); + break; + case indirect_branch_loop_pause: + /* pause. */ + fprintf (asm_out_file, "\tpause\n"); + break; + case indirect_branch_loop_nop: + /* nop. */ + fprintf (asm_out_file, "\tnop\n"); + break; + default: + gcc_unreachable (); + } /* Jump. */ fputs ("\tjmp\t", asm_out_file); diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt index 22c806206e4..f3e43da0aed 100644 --- a/gcc/config/i386/i386.opt +++ b/gcc/config/i386/i386.opt @@ -1041,3 +1041,20 @@ Enum(indirect_branch) String(thunk-inline) Value(indirect_branch_thunk_inline) EnumValue Enum(indirect_branch) String(thunk-extern) Value(indirect_branch_thunk_extern) + +mindirect-branch-loop= +Target Report RejectNegative Joined Enum(indirect_branch_loop) Var(ix86_indirect_branch_loop) Init(indirect_branch_loop_lfence) +Control loop filler in call and return thunk for indirect call and jump. + +Enum +Name(indirect_branch_loop) Type(enum indirect_branch_loop) +Known looop choices (for use with the -mindirect-branch-loop= option): + +EnumValue +Enum(indirect_branch_loop) String(lfence) Value(indirect_branch_loop_lfence) + +EnumValue +Enum(indirect_branch_loop) String(pause) Value(indirect_branch_loop_pause) + +EnumValue +Enum(indirect_branch_loop) String(nop) Value(indirect_branch_loop_nop) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 46461d1ada3..ee5248aba59 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1227,7 +1227,7 @@ See RS/6000 and PowerPC Options. -mstack-protector-guard-offset=@var{offset} @gol -mstack-protector-guard-symbol=@var{symbol} -mmitigate-rop @gol -mgeneral-regs-only -mcall-ms2sysv-xlogues @gol --mindirect-branch=@var{choice}} +-mindirect-branch=@var{choice} -mindirect-branch-loop=@var{choice}} @emph{x86 Windows Options} @gccoptlist{-mconsole -mcygwin -mno-cygwin -mdll @gol @@ -26776,6 +26776,13 @@ to external call and return thunk provided in a separate object file. You can control this behavior for a specific function by using the function attribute @code{indirect_branch}. @xref{Function Attributes}. +@item -mindirect-branch-loop=@var{choice} +@opindex -mindirect-branch-boop +Control loop filler in call and return thunk for indirect call and jump. +@samp{lfence} uses @code{lfence} as loop filler. @samp{pause} uses +@code{pause} as loop filler. @samp{nop} uses @code{nop} as loop filler. +The default is @samp{lfence}. + @end table These @samp{-m} switches are supported in addition to the above diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-1.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-1.c new file mode 100644 index 00000000000..1b0e2c58775 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-1.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mindirect-branch=thunk -mindirect-branch-loop=pause -fno-pic" } */ + +typedef void (*dispatch_t)(long offset); + +dispatch_t dispatch; + +void +male_indirect_jump (long offset) +{ + dispatch(offset); +} + +/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler {\tpause} } } */ diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-2.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-2.c new file mode 100644 index 00000000000..feace47a765 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-2.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mindirect-branch=thunk -mindirect-branch-loop=nop -fno-pic" } */ + +typedef void (*dispatch_t)(long offset); + +dispatch_t dispatch[256]; + +void +male_indirect_jump (long offset) +{ + dispatch[offset](offset); +} + +/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler {\tnop} } } */ diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-3.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-3.c new file mode 100644 index 00000000000..ad2165fa7aa --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-3.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mindirect-branch=thunk -mindirect-branch-loop=lfence -fno-pic" } */ + +typedef void (*dispatch_t)(long offset); + +dispatch_t dispatch; + +void +male_indirect_jump (long offset) +{ + dispatch(offset); +} + +/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler {\tlfence} } } */ diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-4.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-4.c new file mode 100644 index 00000000000..4ba997da966 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-4.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mindirect-branch=thunk-inline -mindirect-branch-loop=pause -fno-pic" } */ + +typedef void (*dispatch_t)(long offset); + +dispatch_t dispatch; + +void +male_indirect_jump (long offset) +{ + dispatch(offset); +} + +/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler "call\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler {\tpause} } } */ +/* { dg-final { scan-assembler-not "__x86_indirect_thunk" } } */ +/* { dg-final { scan-assembler-not "pushq\[ \t\]%rax" { target x32 } } } */ diff --git a/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-5.c b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-5.c new file mode 100644 index 00000000000..10fb2193f5e --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/indirect-thunk-loop-5.c @@ -0,0 +1,19 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mindirect-branch=thunk-extern -mindirect-branch-loop=pause -fno-pic" } */ + +typedef void (*dispatch_t)(long offset); + +dispatch_t dispatch; + +void +male_indirect_jump (long offset) +{ + dispatch(offset); +} + +/* { dg-final { scan-assembler "push(?:l|q)\[ \t\]*_?dispatch" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk" { target { ! x32 } } } } */ +/* { dg-final { scan-assembler "jmp\[ \t\]*__x86_indirect_thunk_(r|e)ax" { target x32 } } } */ +/* { dg-final { scan-assembler-not {\t(lfence|pause|nop)} } } */ +/* { dg-final { scan-assembler-not "jmp\[ \t\]*\.LIND" } } */ +/* { dg-final { scan-assembler-not "call\[ \t\]*\.LIND" } } */