Message ID | CAFULd4Y-A6vMaXJ1+EzkFexJ7xeWb3ri=XpHAK1UCXGFB0si4w@mail.gmail.com |
---|---|
State | New |
On Sun, Oct 4, 2015 at 3:29 AM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Sun, Oct 4, 2015 at 7:23 AM, Yulia Koval <vaalfreja@gmail.com> wrote:
>> Hi,
>>
>> Here is the last version of the patch. Regtested/bootstrapped for
>> Linux/i686 and Linux/x86_64.
>>
>> Date: Fri, 4 Sep 2015 08:53:23 -0700
>> Subject: [PATCH] Implement x86 interrupt attribute
>>
>> The interrupt and exception handlers are called by x86 processors. X86
>> hardware pushes information onto the stack and calls the handler. The
>> requirements are
>>
>> 1. Both interrupt and exception handlers must use the 'IRET' instruction,
>> instead of the 'RET' instruction, to return from the handlers.
>> 2. All registers are callee-saved in interrupt and exception handlers.
>> 3. The difference between interrupt and exception handlers is the
>> exception handler must pop 'ERROR_CODE' off the stack before the 'IRET'
>> instruction.
>>
>> The design goals of interrupt and exception handlers for x86 processors
>> are:
>>
>> 1. Support both 32-bit and 64-bit modes.
>> 2. Flexible for compilers to optimize.
>> 3. Easy to use by programmers.
>>
>> To implement interrupt and exception handlers for x86 processors, a
>> compiler should support:
>>
>> 'interrupt' attribute
>>
>> Use this attribute to indicate that the specified function with
>> mandatory arguments is an interrupt or exception handler. The compiler
>> generates function entry and exit sequences suitable for use in an
>> interrupt handler when this attribute is present. The 'IRET' instruction,
>> instead of the 'RET' instruction, is used to return from interrupt or
>> exception handlers. All registers, except for the EFLAGS register which
>> is restored by the 'IRET' instruction, are preserved by the compiler.
>>
>> Any interruptible-without-stack-switch code must be compiled with
>> -mno-red-zone since interrupt handlers can and will, because of the
>> hardware design, touch the red zone.
>>
>> 1. interrupt handler must be declared with a mandatory pointer argument:
>>
>> struct interrupt_frame;
>>
>> __attribute__ ((interrupt))
>> void
>> f (struct interrupt_frame *frame)
>> {
>> ...
>> }
>>
>> and the user must properly define the structure the pointer points to.
>>
>> 2. exception handler:
>>
>> The exception handler is very similar to the interrupt handler with
>> a different mandatory function signature:
>>
>> #ifdef __x86_64__
>> typedef unsigned long long int uword_t;
>> #else
>> typedef unsigned int uword_t;
>> #endif
>>
>> struct interrupt_frame;
>>
>> __attribute__ ((interrupt))
>> void
>> f (struct interrupt_frame *frame, uword_t error_code)
>> {
>> ...
>> }
>>
>> and the compiler pops the error code off the stack before the 'IRET'
>> instruction.
>>
>> The exception handler should only be used for exceptions which push an
>> error code and all other exceptions must use the interrupt handler.
>> The system will crash if the wrong handler is used.
>>
>> To be feature complete, the compiler may implement the optional
>> 'no_caller_saved_registers' attribute:
>>
>> Use this attribute to indicate that the specified function has no
>> caller-saved registers. That is, all registers are callee-saved.
>> The compiler generates proper function entry and exit sequences to
>> save and restore any modified registers.
>>
>> The user can call functions specified with 'no_caller_saved_registers'
>> attribute from an interrupt handler without saving and restoring all
>> call clobbered registers.
>
> Looking a bit deeper into the code, it looks like we want to realign
> the stack in the interrupt handler. Let's assume that the interrupt
> handler is calling some other function that saves SSE vector regs to
> the stack. According to the x86 ABI, the incoming stack of the called
> function is assumed to be aligned to 16 bytes. But the interrupt handler
> violates this assumption, since the stack could be aligned to only 4
> bytes for 32-bit and 8 bytes for 64-bit targets. Entering the called
> function with the stack aligned to less than 16 bytes will certainly
> violate the ABI.
>
> So, it looks to me that we need to realign the stack in the interrupt
> handler unconditionally to 16 bytes. In this case, we also won't need
> the following changes:
>

Current stack alignment implementation requires at least
one, maybe two, scratch registers:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67841

Extending it to the interrupt handler, which doesn't have any scratch
registers, may require significant changes in the backend as well as
the register allocator.
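To make the scratch-register point concrete, here is a minimal sketch of the arithmetic behind realigning a stack pointer to 16 bytes. It is not the GCC implementation: `align_stack_down`, `realign_to_16` and `struct realigned_frame` are hypothetical names used only for illustration. The sketch only shows that the rounded-down value no longer determines the incoming value, so the incoming value has to be kept somewhere until the epilogue.

```c
#include <assert.h>
#include <stdint.h>

/* Round a stack pointer value down to a power-of-two boundary, which is
   what an "and $-16, %rsp" style prologue computes.  Hypothetical helper,
   not a GCC function.  */
static uintptr_t
align_stack_down (uintptr_t sp, uintptr_t boundary)
{
  /* The mask trick only works for non-zero powers of two.  */
  assert (boundary != 0 && (boundary & (boundary - 1)) == 0);
  return sp & ~(boundary - 1);
}

/* What a realigning prologue has to remember.  Rounding down discards
   the low bits, so the incoming value cannot be recomputed from the
   aligned one and must be held somewhere (in practice, a register).  */
struct realigned_frame
{
  uintptr_t incoming_sp;  /* needed again to restore the stack on exit */
  uintptr_t aligned_sp;   /* value used while the function body runs */
};

static struct realigned_frame
realign_to_16 (uintptr_t sp)
{
  struct realigned_frame frame = { sp, align_stack_down (sp, 16) };
  return frame;
}
```

For example, incoming values 0x1004 and 0x1008 both round down to 0x1000, which is why the original value has to be preserved separately.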
On Sun, Oct 4, 2015 at 8:15 PM, H.J. Lu <hjl.tools@gmail.com> wrote:

>> Looking a bit deeper into the code, it looks like we want to realign
>> the stack in the interrupt handler. Let's assume that the interrupt
>> handler is calling some other function that saves SSE vector regs to
>> the stack. According to the x86 ABI, the incoming stack of the called
>> function is assumed to be aligned to 16 bytes. But the interrupt handler
>> violates this assumption, since the stack could be aligned to only 4
>> bytes for 32-bit and 8 bytes for 64-bit targets. Entering the called
>> function with the stack aligned to less than 16 bytes will certainly
>> violate the ABI.
>>
>> So, it looks to me that we need to realign the stack in the interrupt
>> handler unconditionally to 16 bytes. In this case, we also won't need
>> the following changes:
>>
>
> Current stack alignment implementation requires at least
> one, maybe two, scratch registers:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67841
>
> Extending it to the interrupt handler, which doesn't have any scratch
> registers, may require significant changes in the backend as well as
> the register allocator.

But without realignment, the handler is unusable for anything but
simple functions. The handler will crash when the called function tries
to save a vector reg to the stack.

Uros.
On Sun, Oct 4, 2015 at 1:00 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
> On Sun, Oct 4, 2015 at 8:15 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>
>>> Looking a bit deeper into the code, it looks like we want to realign
>>> the stack in the interrupt handler. Let's assume that the interrupt
>>> handler is calling some other function that saves SSE vector regs to
>>> the stack. According to the x86 ABI, the incoming stack of the called
>>> function is assumed to be aligned to 16 bytes. But the interrupt handler
>>> violates this assumption, since the stack could be aligned to only 4
>>> bytes for 32-bit and 8 bytes for 64-bit targets. Entering the called
>>> function with the stack aligned to less than 16 bytes will certainly
>>> violate the ABI.
>>>
>>> So, it looks to me that we need to realign the stack in the interrupt
>>> handler unconditionally to 16 bytes. In this case, we also won't need
>>> the following changes:
>>>
>>
>> Current stack alignment implementation requires at least
>> one, maybe two, scratch registers:
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67841
>>
>> Extending it to the interrupt handler, which doesn't have any scratch
>> registers, may require significant changes in the backend as well as
>> the register allocator.
>
> But without realignment, the handler is unusable for anything but
> simple functions. The handler will crash when the called function tries
> to save a vector reg to the stack.
>

We can use unaligned load and store to avoid the crash.
On Sun, Oct 4, 2015 at 10:51 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> On Sun, Oct 4, 2015 at 1:00 PM, Uros Bizjak <ubizjak@gmail.com> wrote:
>> On Sun, Oct 4, 2015 at 8:15 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
>>
>>>> Looking a bit deeper into the code, it looks like we want to realign
>>>> the stack in the interrupt handler. Let's assume that the interrupt
>>>> handler is calling some other function that saves SSE vector regs to
>>>> the stack. According to the x86 ABI, the incoming stack of the called
>>>> function is assumed to be aligned to 16 bytes. But the interrupt handler
>>>> violates this assumption, since the stack could be aligned to only 4
>>>> bytes for 32-bit and 8 bytes for 64-bit targets. Entering the called
>>>> function with the stack aligned to less than 16 bytes will certainly
>>>> violate the ABI.
>>>>
>>>> So, it looks to me that we need to realign the stack in the interrupt
>>>> handler unconditionally to 16 bytes. In this case, we also won't need
>>>> the following changes:
>>>>
>>>
>>> Current stack alignment implementation requires at least
>>> one, maybe two, scratch registers:
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67841
>>>
>>> Extending it to the interrupt handler, which doesn't have any scratch
>>> registers, may require significant changes in the backend as well as
>>> the register allocator.
>>
>> But without realignment, the handler is unusable for anything but
>> simple functions. The handler will crash when the called function tries
>> to save a vector reg to the stack.
>>
>
> We can use unaligned load and store to avoid the crash.

Oh, sorry, I meant "called function will crash", like:

-> interrupt when %rsp = 0x...8 ->
-> interrupt handler ->
-> calls some function that tries to save xmm reg to stack
-> crash in the called function

Uros.
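The chain above can be sketched as plain address arithmetic. This is only an illustration under stated assumptions: it assumes the x86-64 convention that the stack pointer is 16-byte aligned at the call instruction (so it is 8 bytes off a 16-byte boundary at callee entry), and `callee_spill_slot` models a hypothetical callee that places its XMM save slot 24 bytes below its entry stack pointer. None of these helper names come from GCC.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* An aligned vector store (movaps style) needs a 16-byte aligned address.  */
static bool
is_aligned_16 (uintptr_t addr)
{
  return (addr & 15) == 0;
}

/* A call pushes an 8-byte return address on x86-64, so the callee sees
   the caller's stack pointer minus 8.  */
static uintptr_t
callee_entry_sp (uintptr_t sp_at_call)
{
  assert (sp_at_call >= 8);
  return sp_at_call - 8;
}

/* Hypothetical callee layout: the XMM save slot sits 24 bytes below the
   entry stack pointer.  That address is 16-byte aligned only if the
   caller honoured the alignment convention.  */
static uintptr_t
callee_spill_slot (uintptr_t entry_sp)
{
  assert (entry_sp >= 24);
  return entry_sp - 24;
}

/* True when the callee's aligned store would hit a misaligned slot.  */
static bool
aligned_spill_would_fault (uintptr_t sp_at_call)
{
  return !is_aligned_16 (callee_spill_slot (callee_entry_sp (sp_at_call)));
}
```

With a caller stack pointer ending in 0 the slot is aligned. With one ending in 8, as in the scenario above, the same callee code computes a misaligned slot, even though the handler itself did nothing wrong beyond making the call.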
On Oct 4, 2015, at 11:15 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
> Current stack alignment implementation requires at least
> one, maybe two, scratch registers:

So, I have some cases where I need scratch registers as well. I always
save 2 registers and they go first (and restore last), so I can always
use them.
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -875,10 +875,18 @@
     case MODE_V16SF:
     case MODE_V8SF:
     case MODE_V4SF:
-      if (TARGET_AVX
+      /* We must support misaligned SSE load and store in interrupt
+         handler or there are no caller-saved registers and we are
+         in 32-bit mode since ix86_emit_save_reg_using_mov generates
+         the normal *mov<mode>_internal pattern to save and restore
+         SSE registers with misaligned stack. */
+      if ((TARGET_AVX
+           || cfun->machine->func_type != TYPE_NORMAL
+           || (!TARGET_64BIT
+               && cfun->machine->no_caller_saved_registers))
           && (misaligned_operand (operands[0], <MODE>mode)
               || misaligned_operand (operands[1], <MODE>mode)))
-        return "vmovups\t{%1, %0|%0, %1}";
+        return "%vmovups\t{%1, %0|%0, %1}";
       else
         return "%vmovaps\t{%1, %0|%0, %1}";
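The condition in this hunk can be restated as a small standalone C function. This is a sketch, not GCC code: `struct move_context` and `select_v4sf_move` are hypothetical, the enumerators other than `TYPE_NORMAL` are assumed names, and `operand_misaligned` stands in for the two `misaligned_operand` checks. The `%v` prefix in the output template is modelled as "emit a leading v only when AVX is enabled", which is why the patch has to change the literal "vmovups" once the branch becomes reachable without AVX.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Only TYPE_NORMAL appears in the patch; the other names are assumed.  */
enum func_type { TYPE_NORMAL, TYPE_INTERRUPT, TYPE_EXCEPTION };

/* Hypothetical stand-in for the compiler state the condition reads.  */
struct move_context
{
  bool target_avx;                 /* TARGET_AVX */
  bool target_64bit;               /* TARGET_64BIT */
  enum func_type func_type;        /* cfun->machine->func_type */
  bool no_caller_saved_registers;  /* cfun->machine->no_caller_saved_registers */
  bool operand_misaligned;         /* either operand is misaligned */
};

/* Pick the mnemonic for a V4SF-style move, following the patched logic.  */
static const char *
select_v4sf_move (const struct move_context *ctx)
{
  assert (ctx != NULL);

  bool unaligned_allowed
    = (ctx->target_avx
       || ctx->func_type != TYPE_NORMAL
       || (!ctx->target_64bit && ctx->no_caller_saved_registers));

  if (unaligned_allowed && ctx->operand_misaligned)
    return ctx->target_avx ? "vmovups" : "movups";

  return ctx->target_avx ? "vmovaps" : "movaps";
}
```

Under this reading, an ordinary non-AVX function still gets the aligned form even for a misaligned operand, exactly as before the patch, while interrupt and exception handlers get the unaligned form for their own register saves. It does not change what a called function emits, which is the case raised earlier in the thread.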