diff mbox

Subject: [PATCH 1/2] x86: get back 15 vectors

Message ID 4B424305.7050803@kernel.org
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Yinghai Lu Jan. 4, 2010, 7:35 p.m. UTC
On 01/04/2010 11:09 AM, Eric W. Biederman wrote:
> "H. Peter Anvin" <hpa@zytor.com> writes:
> 
>> On 01/04/2010 08:18 AM, Eric W. Biederman wrote:
>>> Yinghai Lu <yinghai@kernel.org> writes:
>>>
>>> This patch is wrong.
>>>
>>>> between FIRST_EXTERNAL_VECTOR (0x20) and FIRST_DEVICE_VECTOR (0x41)
>>>>
>>>> for 0x20 and 0x2f, we are safe be used_vectors will prevent it to use used one.
>>>
>>> We can not use any of 0x20 - 0x2f for ioapic irqs.  We need the entire
>>> priority level to ensure that the irq move cleanup ipi is of a lower
>>> priority.
>>>
>>
>> Almost makes one want to abuse 0x1f for that.  Although 0x00..0x1f are
>> reserved for exceptions, the APICs range down to 0x10, and well, when
>> 0x1f ends up actually getting used as an exception vector that we
>> support, then we can trivially change that.  In the meantime it would
>> actually make use of an otherwise-unusable APIC priority level.
> 
> An optimization like that (with a big fat comment) seems reasonable
> to me.

so we can use [0x10, 0x1f]

sth like this?

Subject: [PATCH 1/2] x86: get back 16 vectors

-v2: according to hpa that we could start from 0x10
     according to Eric, we should hold 16 vectors for IRQ MOVE

Signed-off-by: Yinghai Lu <yinghai@kernel.org>

---
 arch/x86/include/asm/irq_vectors.h |    3 ++-
 arch/x86/kernel/apic/io_apic.c     |    5 +++--
 arch/x86/kernel/irqinit.c          |   11 +++++++++--
 3 files changed, 14 insertions(+), 5 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Suresh Siddha Jan. 4, 2010, 7:45 p.m. UTC | #1
On Mon, 2010-01-04 at 11:35 -0800, Yinghai Lu wrote:
> sth like this?
> 
> Subject: [PATCH 1/2] x86: get back 16 vectors
> 
> -v2: according to hpa that we could start from 0x10
>      according to Eric, we should hold 16 vectors for IRQ MOVE
> 
> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> 

Yinghai we have to change IRQ_MOVE_CLEANUP_VECTOR to 0x1f or so. From
the cpu perspective this vector is documented as illegal, so we need to
check if this change will work on the cpu's we have today to get some
confidence.

thanks,
suresh

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
H. Peter Anvin Jan. 4, 2010, 7:48 p.m. UTC | #2
[Adding Suresh to the Cc: list]

On 01/04/2010 11:35 AM, Yinghai Lu wrote:
> 
> so we can use [0x10, 0x1f]
> 
> sth like this?
> 

No!!!

[0x10, 0x1f] is reserved for exceptions.  We can probably get away with
stealing *one* vector... presumably at the end (0x1f).  However, we can
absolutely not use the whole block: 0x10-0x13 is occupied by exceptions
we already have OS support for (#MF, #AC, #MC, and #XM), and it's pretty
much guaranteed we'll have more coming.  However, growth is quite slow
and since this is a kernel-internal vector (not accessible to user
space) it is not creating an API.

In other words, we could change FIRST_EXTERNAL_VECTOR to 0x1f, and use
it for IRQ_MOVE_CLEANUP_VECTOR.  Then use 0x20..0x2f for the legacy vectors.

	-hpa
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
H. Peter Anvin Jan. 4, 2010, 7:50 p.m. UTC | #3
On 01/04/2010 11:45 AM, Suresh Siddha wrote:
> On Mon, 2010-01-04 at 11:35 -0800, Yinghai Lu wrote:
>> sth like this?
>>
>> Subject: [PATCH 1/2] x86: get back 16 vectors
>>
>> -v2: according to hpa that we could start from 0x10
>>      according to Eric, we should hold 16 vectors for IRQ MOVE
>>
>> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
>>
> 
> Yinghai we have to change IRQ_MOVE_CLEANUP_VECTOR to 0x1f or so. From
> the cpu perspective this vector is documented as illegal, so we need to
> check if this change will work on the cpu's we have today to get some
> confidence.
> 

It's documented as reserved, not illegal.  The ability for the APIC to
generate vectors starting at 0x10 is documented, as is the ability for
the CPU to receive any vector number as an interrupt -- in fact, the
legacy BIOS relies on being able to receive interrupts starting at
vector 0x08.  It causes problems galore, but only at the software level.

	-hpa
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric W. Biederman Jan. 4, 2010, 8:08 p.m. UTC | #4
Yinghai Lu <yinghai@kernel.org> writes:

> On 01/04/2010 11:09 AM, Eric W. Biederman wrote:
>> "H. Peter Anvin" <hpa@zytor.com> writes:
>> 
>>> On 01/04/2010 08:18 AM, Eric W. Biederman wrote:
>>>> Yinghai Lu <yinghai@kernel.org> writes:
>>>>
>>>> This patch is wrong.
>>>>
>>>>> between FIRST_EXTERNAL_VECTOR (0x20) and FIRST_DEVICE_VECTOR (0x41)
>>>>>
>>>>> for 0x20 and 0x2f, we are safe be used_vectors will prevent it to use used one.
>>>>
>>>> We can not use any of 0x20 - 0x2f for ioapic irqs.  We need the entire
>>>> priority level to ensure that the irq move cleanup ipi is of a lower
>>>> priority.
>>>>
>>>
>>> Almost makes one want to abuse 0x1f for that.  Although 0x00..0x1f are
>>> reserved for exceptions, the APICs range down to 0x10, and well, when
>>> 0x1f ends up actually getting used as an exception vector that we
>>> support, then we can trivially change that.  In the meantime it would
>>> actually make use of an otherwise-unusable APIC priority level.
>> 
>> An optimization like that (with a big fat comment) seems reasonable
>> to me.
>
> so we can use [0x10, 0x1f]
>
> sth like this?

Something.  We can not use all of 0x10 - 0x1f, it is simply
that hardware can address all of that. 0x10 is already defined
as something I forget what.   0x12 is already the MCE_VECTOR.


Since hardware has not yet defined 0x1f (and is not likely to for
a while.  We can use that).  So we wind up using hardware priority
a single ipi, and hardware exceptions.

Eric
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Suresh Siddha Jan. 5, 2010, 12:05 a.m. UTC | #5
On Mon, 2010-01-04 at 11:50 -0800, H. Peter Anvin wrote:
> On 01/04/2010 11:45 AM, Suresh Siddha wrote:
> > On Mon, 2010-01-04 at 11:35 -0800, Yinghai Lu wrote:
> >> sth like this?
> >>
> >> Subject: [PATCH 1/2] x86: get back 16 vectors
> >>
> >> -v2: according to hpa that we could start from 0x10
> >>      according to Eric, we should hold 16 vectors for IRQ MOVE
> >>
> >> Signed-off-by: Yinghai Lu <yinghai@kernel.org>
> >>
> > 
> > Yinghai we have to change IRQ_MOVE_CLEANUP_VECTOR to 0x1f or so. From
> > the cpu perspective this vector is documented as illegal, so we need to
> > check if this change will work on the cpu's we have today to get some
> > confidence.
> > 
> 
> It's documented as reserved, not illegal.  The ability for the APIC to
> generate vectors starting at 0x10 is documented, as is the ability for
> the CPU to receive any vector number as an interrupt -- in fact, the
> legacy BIOS relies on being able to receive interrupts starting at
> vector 0x08.  It causes problems galore, but only at the software level.

I have checked out couple of platforms (including 32-bit atom) and 0x1f
vector logic seems to be working.

Hopefully we won't have other hardware or software issues (vmm
restrictions etc) with this logic.

thanks,
suresh

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

Index: linux-2.6/arch/x86/kernel/apic/io_apic.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/apic/io_apic.c
+++ linux-2.6/arch/x86/kernel/apic/io_apic.c
@@ -1162,7 +1162,8 @@  __assign_irq_vector(int irq, struct irq_
 	 * Also, we've got to be careful not to trash gate
 	 * 0x80, because int 0x80 is hm, kind of importantish. ;)
 	 */
-	static int current_vector = FIRST_DEVICE_VECTOR, current_offset = 0;
+	static int current_vector = 0;
+	static int current_offset = 0;
 	unsigned int old_vector;
 	int cpu, err;
 	cpumask_var_t tmp_mask;
@@ -1198,7 +1199,7 @@  next:
 		if (vector >= first_system_vector) {
 			/* If out of vectors on large boxen, must share them. */
 			offset = (offset + 1) % 8;
-			vector = FIRST_DEVICE_VECTOR + offset;
+			vector = 0 + offset;
 		}
 		if (unlikely(current_vector == vector))
 			continue;
Index: linux-2.6/arch/x86/include/asm/irq_vectors.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/irq_vectors.h
+++ linux-2.6/arch/x86/include/asm/irq_vectors.h
@@ -30,8 +30,9 @@ 
 /*
  * IDT vectors usable for external interrupt sources start
  * at 0x20:
+ * hpa said we can start from 0x10
  */
-#define FIRST_EXTERNAL_VECTOR		0x20
+#define FIRST_EXTERNAL_VECTOR		0x10
 
 #ifdef CONFIG_X86_32
 # define SYSCALL_VECTOR			0x80
Index: linux-2.6/arch/x86/kernel/irqinit.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/irqinit.c
+++ linux-2.6/arch/x86/kernel/irqinit.c
@@ -149,6 +149,8 @@  static void __init smp_intr_init(void)
 {
 #ifdef CONFIG_SMP
 #if defined(CONFIG_X86_64) || defined(CONFIG_X86_LOCAL_APIC)
+	int i;
+
 	/*
 	 * The reschedule interrupt is a CPU-to-CPU reschedule-helper
 	 * IPI, driven by wakeup.
@@ -174,7 +176,9 @@  static void __init smp_intr_init(void)
 
 	/* Low priority IPI to cleanup after moving an irq */
 	set_intr_gate(IRQ_MOVE_CLEANUP_VECTOR, irq_move_cleanup_interrupt);
-	set_bit(IRQ_MOVE_CLEANUP_VECTOR, used_vectors);
+	/* Eric said: Need to hold entire priority */
+	for (i = IRQ_MOVE_CLEANUP_VECTOR; i < IRQ_MOVE_CLEANUP_VECTOR+0x10; i++)
+		set_bit(i, used_vectors);
 
 	/* IPI used for rebooting/stopping */
 	alloc_intr_gate(REBOOT_VECTOR, reboot_interrupt);
@@ -222,6 +226,9 @@  void __init native_init_IRQ(void)
 	/* Execute any quirks before the call gates are initialised: */
 	x86_init.irqs.pre_vector_init();
 
+	for (i = 0; i < 0x10; i++)
+		set_bit(i, used_vectors);
+
 	apic_intr_init();
 
 	/*
@@ -229,7 +236,7 @@  void __init native_init_IRQ(void)
 	 * us. (some of these will be overridden and become
 	 * 'special' SMP interrupts)
 	 */
-	for (i = FIRST_EXTERNAL_VECTOR; i < NR_VECTORS; i++) {
+	for (i = 0; i < NR_VECTORS; i++) {
 		/* IA32_SYSCALL_VECTOR could be used in trap_init already. */
 		if (!test_bit(i, used_vectors))
 			set_intr_gate(i, interrupt[i-FIRST_EXTERNAL_VECTOR]);