[RFC,0/2] irqchip/gic: Allow the use of SGI interrupts

Message ID 20191023000547.7831-1-f.fainelli@gmail.com

Florian Fainelli Oct. 23, 2019, 12:05 a.m. UTC
Hi all,

Sending this as RFC so as to gather comments on the approach chosen
here. The Broadcom STB mailbox driver and its firmware in EL3 use a
combination of "smc" for inbound (Linux to monitor) and SGI for outbound
(monitor to Linux) signaling. This mailbox driver can be seen here:

https://github.com/ffainelli/linux/commit/17cc97919f4cd2583d67e624273da8b54b44a4a7

(we may switch to the standard arm-smc mailbox driver recently proposed
by Peng Fan, but we would need interrupt notification anyway).

In our downstream kernel, we have hacked the arch/*/kernel/smp.c code to
permit the installation of custom "IPI" handlers. This is obviously
wrong and absolutely not suitable for upstream.

Here, we allow the GIC to recognize SGI interrupts specified in Device
Tree with a new specifier in the first cell (2), and then we let the
mapping and translation occur provided that the SGI number is not below
NR_IPI, i.e. it is outside the range the kernel uses for its own IPIs.
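
To make the translation rule concrete, here is a minimal user-space
sketch of it. This is not the patch itself: every SKETCH_* name and the
NR_IPI value of 7 are assumptions made for illustration, and unlike the
patch the sketch rejects unknown specifiers and SGI numbers above 15.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real constants live in kernel headers. */
#define SKETCH_GIC_SPI	0
#define SKETCH_GIC_PPI	1
#define SKETCH_GIC_SGI	2	/* new first-cell value proposed here */
#define SKETCH_NR_IPI	7	/* assumed; varies by arch and version */

/* Translate a 3-cell DT specifier into a GIC hardware interrupt ID. */
static int sketch_translate(const uint32_t spec[3], uint32_t *hwirq)
{
	switch (spec[0]) {
	case SKETCH_GIC_SPI:
		*hwirq = spec[1] + 32;	/* skip 16 SGIs and 16 PPIs */
		return 0;
	case SKETCH_GIC_PPI:
		*hwirq = spec[1] + 16;	/* skip 16 SGIs */
		return 0;
	case SKETCH_GIC_SGI:
		if (spec[1] >= 16)
			return -EINVAL;	/* not an SGI (sketch-only check) */
		if (spec[1] < SKETCH_NR_IPI)
			return -EPERM;	/* reserved for kernel IPIs */
		*hwirq = spec[1];
		return 0;
	default:
		return -EINVAL;		/* sketch-only: reject unknown types */
	}
}
```

With these assumed values, SGI 8 maps to hardware ID 8, while SGI 3 is
refused because the kernel would be using it as an IPI.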

Immediate problems that I am aware of:

- on ARM (32-bit) NR_IPI does not include IPI_CPU_BACKTRACE, so our
  check against NR_IPI could be (and currently is) off by one

Florian Fainelli (3):
  dt-bindings: Define interrupt type for SGI interrupts
  irqchip/gic: Allow the use of SGI interrupts

 .../interrupt-controller/arm,gic.yaml         |   2 +-
 drivers/irqchip/irq-gic.c                     |  41 ++-
 .../interrupt-controller/arm-gic.h            |   1 +
 7 files changed, 313 insertions(+), 16 deletions(-)
 create mode 100644 drivers/mailbox/brcmstb-mailbox.c

Comments

Marc Zyngier Oct. 23, 2019, 1:22 p.m. UTC | #1
Hi Florian,

Needless to say, I mostly have questions...

On 2019-10-23 01:05, Florian Fainelli wrote:
> SGI interrupts are a convenient way for trusted firmware to target a
> specific set of CPUs. Update the ARM GIC code to allow the translation
> and mapping of SGI interrupts.
>
> Since the kernel already uses SGIs for various inter-processor interrupt
> activities, we specifically make sure that we do not let users of the
> IRQ API to even try to map those.
>
> Internal IPIs remain dispatched through handle_IPI() while public SGIs
> get promoted to a normal interrupt flow management.
>
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  drivers/irqchip/irq-gic.c | 41 +++++++++++++++++++++++++++------------
>  1 file changed, 29 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
> index 30ab623343d3..dcfdbaacdd64 100644
> --- a/drivers/irqchip/irq-gic.c
> +++ b/drivers/irqchip/irq-gic.c
> @@ -385,7 +385,10 @@ static void __exception_irq_entry
> gic_handle_irq(struct pt_regs *regs)
>  			 * Pairs with the write barrier in gic_raise_softirq
>  			 */
>  			smp_rmb();
> -			handle_IPI(irqnr, regs);
> +			if (irqnr < NR_IPI)
> +				handle_IPI(irqnr, regs);
> +			else
> +				handle_domain_irq(gic->domain, irqnr, regs);

Double EOI, UNPREDICTABLE territory, your state machine is now dead.

>  #endif
>  			continue;
>  		}
> @@ -1005,20 +1008,34 @@ static int gic_irq_domain_translate(struct
> irq_domain *d,
>  		if (fwspec->param_count < 3)
>  			return -EINVAL;
>
> -		/* Get the interrupt number and add 16 to skip over SGIs */
> -		*hwirq = fwspec->param[1] + 16;
> -
> -		/*
> -		 * For SPIs, we need to add 16 more to get the GIC irq
> -		 * ID number
> -		 */
> -		if (!fwspec->param[0])
> +		*hwirq = fwspec->param[1];
> +		switch (fwspec->param[0]) {
> +		case 0:
> +			/*
> +			 * For SPIs, we need to add 16 more to get the GIC irq
> +			 * ID number
> +			 */
> +			*hwirq += 16;
> +			/* fall through */
> +		case 1:
> +			/* Add 16 to skip over SGIs */
>  			*hwirq += 16;
> +			*type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
>
> -		*type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
> +			/* Make it clear that broken DTs are... broken */
> +			WARN_ON(*type == IRQ_TYPE_NONE);
> +			break;
> +		case 2:
> +			/* Refuse to map internal IPIs */
> +			if (*hwirq < NR_IPI)

So depending on how the kernel uses SGIs, you can or cannot use these SGIs.
That looks like a good way to corner ourselves into not being able to
change much.

Also, do you expect this to work for both Group-0 and Group-1 interrupts
(since you imply that this works as a communication medium with the secure
side)? Given that the kernel running in NS has no way to enable/disable
Group-0 interrupts, this looks terminally flawed. Or is that Group-1 only?

How do we describe which SGIs are guaranteed to be available to Linux?

> +				return -EPERM;
> +
> +			*type = IRQ_TYPE_NONE;

Or not. SGIs are edge-triggered, by definition.

> +			break;
> +		default:
> +			break;
> +		}
>
> -		/* Make it clear that broken DTs are... broken */
> -		WARN_ON(*type == IRQ_TYPE_NONE);

Really?

         M.
Florian Fainelli Oct. 23, 2019, 5:02 p.m. UTC | #2
Hello Marc,

On 10/23/19 6:22 AM, Marc Zyngier wrote:
> Hi Florian,
> 
> Needless to say, I mostly have questions...
> 
> On 2019-10-23 01:05, Florian Fainelli wrote:
>> SGI interrupts are a convenient way for trusted firmware to target a
>> specific set of CPUs. Update the ARM GIC code to allow the translation
>> and mapping of SGI interrupts.
>>
>> Since the kernel already uses SGIs for various inter-processor interrupt
>> activities, we specifically make sure that we do not let users of the
>> IRQ API to even try to map those.
>>
>> Internal IPIs remain dispatched through handle_IPI() while public SGIs
>> get promoted to a normal interrupt flow management.
>>
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>> ---
>>  drivers/irqchip/irq-gic.c | 41 +++++++++++++++++++++++++++------------
>>  1 file changed, 29 insertions(+), 12 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>> index 30ab623343d3..dcfdbaacdd64 100644
>> --- a/drivers/irqchip/irq-gic.c
>> +++ b/drivers/irqchip/irq-gic.c
>> @@ -385,7 +385,10 @@ static void __exception_irq_entry
>> gic_handle_irq(struct pt_regs *regs)
>>               * Pairs with the write barrier in gic_raise_softirq
>>               */
>>              smp_rmb();
>> -            handle_IPI(irqnr, regs);
>> +            if (irqnr < NR_IPI)
>> +                handle_IPI(irqnr, regs);
>> +            else
>> +                handle_domain_irq(gic->domain, irqnr, regs);
> 
> Double EOI, UNPREDICTABLE territory, your state machine is now dead.

Oh yes, the interrupt flow now also goes through ->irq_eoi (that's the
whole point), meh.
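
A toy model of the problem, as I read the exchange above: the SGI branch
of the entry code has already written EOI by the time it dispatches, and
a flow handler that ends with ->irq_eoi then writes it a second time.
All names below are hypothetical and the model is far simpler than the
real GIC CPU interface.

```c
#include <stdbool.h>

/* Toy model of one CPU interface; not the real GIC programming model. */
struct sketch_cpu_if {
	bool active;		/* an acknowledged interrupt awaits EOI */
	int stray_eoi;		/* EOI writes with nothing outstanding */
};

static void sketch_ack(struct sketch_cpu_if *c)
{
	c->active = true;
}

static void sketch_eoi(struct sketch_cpu_if *c)
{
	if (!c->active)
		c->stray_eoi++;	/* UNPREDICTABLE on real hardware */
	c->active = false;
}

/* Stand-in for a flow handler that finishes by calling ->irq_eoi. */
static void sketch_flow_handler(struct sketch_cpu_if *c)
{
	sketch_eoi(c);
}

/* SGI path as patched: EOI in the entry code, then again in the flow. */
static void sketch_patched_sgi_entry(struct sketch_cpu_if *c)
{
	sketch_ack(c);
	sketch_eoi(c);		/* existing EOI in the SGI branch */
	sketch_flow_handler(c);	/* second EOI for the same interrupt */
}

/* One possible shape of a fix: let the flow handler own the EOI. */
static void sketch_single_eoi_entry(struct sketch_cpu_if *c)
{
	sketch_ack(c);
	sketch_flow_handler(c);
}
```

In the model, the patched path leaves stray_eoi at 1 while the
single-EOI path leaves it at 0.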

> 
>>  #endif
>>              continue;
>>          }
>> @@ -1005,20 +1008,34 @@ static int gic_irq_domain_translate(struct
>> irq_domain *d,
>>          if (fwspec->param_count < 3)
>>              return -EINVAL;
>>
>> -        /* Get the interrupt number and add 16 to skip over SGIs */
>> -        *hwirq = fwspec->param[1] + 16;
>> -
>> -        /*
>> -         * For SPIs, we need to add 16 more to get the GIC irq
>> -         * ID number
>> -         */
>> -        if (!fwspec->param[0])
>> +        *hwirq = fwspec->param[1];
>> +        switch (fwspec->param[0]) {
>> +        case 0:
>> +            /*
>> +             * For SPIs, we need to add 16 more to get the GIC irq
>> +             * ID number
>> +             */
>> +            *hwirq += 16;
>> +            /* fall through */
>> +        case 1:
>> +            /* Add 16 to skip over SGIs */
>>              *hwirq += 16;
>> +            *type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
>>
>> -        *type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
>> +            /* Make it clear that broken DTs are... broken */
>> +            WARN_ON(*type == IRQ_TYPE_NONE);
>> +            break;
>> +        case 2:
>> +            /* Refuse to map internal IPIs */
>> +            if (*hwirq < NR_IPI)
> 
> So depending on how the kernel uses SGIs, you can or cannot use these SGIs.
> That looks like a good way to corner ourselves into not being able to
> change much.

arch/arm/kernel/smp.c has a forward looking statement about SGI numbering:

        /*
         * SGI8-15 can be reserved by secure firmware, and thus may
         * not be usable by the kernel. Please keep the above limited
         * to at most 8 entries.
         */

is this something that can be used as a universal and unbreakable rule
for the ARM64 kernel as well, in order to ensure SGIs 8-15 remain usable
through the IRQ API, or is this simply not a guarantee at all?

> 
> Also, do you expect this to work for both Group-0 and Group-1 interrupts
> (since you imply that this works as a communication medium with the secure
> side)? Given that the kernel running in NS has no way to enable/disable
> Group-0 interrupts, this looks terminally flawed. Or is that Group-1 only?

That would be Group-1 interrupts only. Are you suggesting that an
additional check be done to verify that such SGIs are actually part of
Group-1?

> 
> How do we describe which SGIs are guaranteed to be available to Linux?

In our case, the Device Tree mailbox node gets its interrupts property
populated with the SGI number(s), and that same number is also passed as
a configuration parameter to the trusted firmware. Or are you referring
back to your earlier comment that if the kernel changes its own
definition of NR_IPI, then we suddenly start breaking IRQ API uses of
SGIs in a certain range?

> 
>> +                return -EPERM;
>> +
>> +            *type = IRQ_TYPE_NONE;
> 
> Or not. SGI are edge triggered, by definition.
> 
>> +            break;
>> +        default:
>> +            break;
>> +        }
>>
>> -        /* Make it clear that broken DTs are... broken */
>> -        WARN_ON(*type == IRQ_TYPE_NONE);
> 
> Really?

Given the comment in gic_set_type() about SGIs, the WARN_ON() was moved
above to continue checking for GIC_SPI and GIC_PPI, but we should
extract the type from the Device Tree and only permit an edge mask.
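
A rough sketch of what "only permit an edge mask" could look like for
the SGI case. The SKETCH_* names are local to this example; their values
mirror the kernel's IRQ_TYPE_* flags. Accepting only the rising-edge
type is an assumption on my part, not something settled in this thread.

```c
#include <errno.h>

/* Values mirror the kernel's IRQ_TYPE_* flags; names are sketch-local. */
#define SKETCH_TYPE_EDGE_RISING	0x1u
#define SKETCH_TYPE_SENSE_MASK	0xfu

/*
 * Extract the trigger type from the third DT cell of an SGI specifier
 * and refuse anything that is not edge-rising (assumed policy).
 */
static int sketch_sgi_type(unsigned int dt_cell, unsigned int *type)
{
	unsigned int t = dt_cell & SKETCH_TYPE_SENSE_MASK;

	if (t != SKETCH_TYPE_EDGE_RISING)
		return -EINVAL;	/* level-triggered or missing type */

	*type = t;
	return 0;
}
```

A Device Tree that describes an SGI as level-triggered, or leaves the
type as zero, would then fail translation instead of being mapped with
IRQ_TYPE_NONE.
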
Marc Zyngier Oct. 24, 2019, 8:27 a.m. UTC | #3
On 2019-10-23 18:02, Florian Fainelli wrote:
> Hello Marc,
>
> On 10/23/19 6:22 AM, Marc Zyngier wrote:
>> Hi Florian,
>>
>> Needless to say, I mostly have questions...
>>
>> On 2019-10-23 01:05, Florian Fainelli wrote:
>>> SGI interrupts are a convenient way for trusted firmware to target a
>>> specific set of CPUs. Update the ARM GIC code to allow the translation
>>> and mapping of SGI interrupts.
>>>
>>> Since the kernel already uses SGIs for various inter-processor interrupt
>>> activities, we specifically make sure that we do not let users of the
>>> IRQ API to even try to map those.
>>>
>>> Internal IPIs remain dispatched through handle_IPI() while public SGIs
>>> get promoted to a normal interrupt flow management.
>>>
>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>>> ---
>>>  drivers/irqchip/irq-gic.c | 41 +++++++++++++++++++++++++++------------
>>>  1 file changed, 29 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
>>> index 30ab623343d3..dcfdbaacdd64 100644
>>> --- a/drivers/irqchip/irq-gic.c
>>> +++ b/drivers/irqchip/irq-gic.c
>>> @@ -385,7 +385,10 @@ static void __exception_irq_entry
>>> gic_handle_irq(struct pt_regs *regs)
>>>               * Pairs with the write barrier in gic_raise_softirq
>>>               */
>>>              smp_rmb();
>>> -            handle_IPI(irqnr, regs);
>>> +            if (irqnr < NR_IPI)
>>> +                handle_IPI(irqnr, regs);
>>> +            else
>>> +                handle_domain_irq(gic->domain, irqnr, regs);
>>
>> Double EOI, UNPREDICTABLE territory, your state machine is now dead.
>
> Oh yes, the interrupt flow now also goes through ->irq_eoi (that's the
> whole point), meh.

Indeed. But to be honest, we should probably consider moving all the SGI
handling to normal interrupts. There's hardly any reason why we should keep
SGIs out of the normal interrupt model, other than maybe performance (and
that's pretty dubious).

>
>>
>>>  #endif
>>>              continue;
>>>          }
>>> @@ -1005,20 +1008,34 @@ static int gic_irq_domain_translate(struct
>>> irq_domain *d,
>>>          if (fwspec->param_count < 3)
>>>              return -EINVAL;
>>>
>>> -        /* Get the interrupt number and add 16 to skip over SGIs */
>>> -        *hwirq = fwspec->param[1] + 16;
>>> -
>>> -        /*
>>> -         * For SPIs, we need to add 16 more to get the GIC irq
>>> -         * ID number
>>> -         */
>>> -        if (!fwspec->param[0])
>>> +        *hwirq = fwspec->param[1];
>>> +        switch (fwspec->param[0]) {
>>> +        case 0:
>>> +            /*
>>> +             * For SPIs, we need to add 16 more to get the GIC irq
>>> +             * ID number
>>> +             */
>>> +            *hwirq += 16;
>>> +            /* fall through */
>>> +        case 1:
>>> +            /* Add 16 to skip over SGIs */
>>>              *hwirq += 16;
>>> +            *type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
>>>
>>> -        *type = fwspec->param[2] & IRQ_TYPE_SENSE_MASK;
>>> +            /* Make it clear that broken DTs are... broken */
>>> +            WARN_ON(*type == IRQ_TYPE_NONE);
>>> +            break;
>>> +        case 2:
>>> +            /* Refuse to map internal IPIs */
>>> +            if (*hwirq < NR_IPI)
>>
>> So depending on how the kernel uses SGIs, you can or cannot use these SGIs.
>> That looks like a good way to corner ourselves into not being able to
>> change much.
>
> arch/arm/kernel/smp.c has a forward looking statement about SGI numbering:
>
>         /*
>          * SGI8-15 can be reserved by secure firmware, and thus may
>          * not be usable by the kernel. Please keep the above limited
>          * to at most 8 entries.
>          */
>
> is this something that can be used as a universal and unbreakable rule
> for the ARM64 kernel as well, in order to ensure SGIs 8-15 remain usable
> through the IRQ API, or is this simply not a guarantee at all?

There is no guarantee whatsoever. There's an ARM recommendation about the
above split, but that's it. Hardly something that can be enforced.

Now, your firmware is the one that gives you the DT, so if it is inconsistent
in configuring the interrupt and presenting it to the kernel, tough luck.

>> Also, do you expect this to work for both Group-0 and Group-1 interrupts
>> (since you imply that this works as a communication medium with the secure
>> side)? Given that the kernel running in NS has no way to enable/disable
>> Group-0 interrupts, this looks terminally flawed. Or is that Group-1 only?
>
> That would be Group-1 interrupts only. Are you suggesting that an
> additional check be done to verify that such SGIs are actually part of
> Group-1?

You can try and change the configuration of that interrupt (priority, for
example), and see if that sticks. If it doesn't, you're in trouble (and
nothing you can do about it).
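
The "write it and see if it sticks" probe could look roughly like the
following. This is a toy model with a simulated register file, not GIC
driver code: the struct, the helpers, and the choice of flipping the top
priority bit are all assumptions. It models a Group-0 SGI as
read-as-zero, write-ignored from the non-secure side.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simulated per-SGI priority registers as seen from non-secure state. */
struct sketch_gic {
	uint8_t prio[16];
	bool ns_writable[16];	/* false models a Group-0 interrupt */
};

static void sketch_write_prio(struct sketch_gic *g, unsigned int sgi,
			      uint8_t val)
{
	if (g->ns_writable[sgi])
		g->prio[sgi] = val;	/* otherwise the write is ignored */
}

static uint8_t sketch_read_prio(const struct sketch_gic *g, unsigned int sgi)
{
	return g->ns_writable[sgi] ? g->prio[sgi] : 0;
}

/* Write a different priority, read it back, then restore the old one. */
static bool sketch_sgi_is_configurable(struct sketch_gic *g, unsigned int sgi)
{
	uint8_t old = sketch_read_prio(g, sgi);
	uint8_t probe = old ^ 0x80;	/* any value that differs from old */
	bool sticks;

	sketch_write_prio(g, sgi, probe);
	sticks = sketch_read_prio(g, sgi) == probe;
	sketch_write_prio(g, sgi, old);

	return sticks;
}
```

If the probe fails, the SGI is not under the kernel's control and
mapping it through the IRQ API would not be safe.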

>>
>> How do we describe which SGIs are guaranteed to be available to Linux?
>
> In our case, the Device Tree mailbox node gets its interrupts property
> populated with the SGI number(s), and that same number is also passed as
> a configuration parameter to the trusted firmware. Or are you referring
> back to your earlier comment that if the kernel changes its own
> definition of NR_IPI, then we suddenly start breaking IRQ API uses of
> SGIs in a certain range?

That's indeed my worry. There is also the fact that the kernel itself will
never expose such reservation in DT, so we'd have to tread carefully here.
We probably need to specify that *only* SGI8-15 are allowed to be described
as such.
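
A fixed SGI8-15 window, instead of a check against NR_IPI, might be
expressed along these lines. The names and the NR_IPI value are
hypothetical; the compile-time assertion is only one way to keep the
kernel's own IPIs inside SGI0-7.

```c
#include <stdbool.h>

#define SKETCH_NR_IPI		7	/* assumed value, for illustration */
#define SKETCH_SGI_DT_FIRST	8
#define SKETCH_SGI_DT_LAST	15

/* Fail the build if kernel IPIs ever grow into the DT-visible window. */
_Static_assert(SKETCH_NR_IPI <= SKETCH_SGI_DT_FIRST,
	       "kernel IPIs must stay within SGI0-7");

/* Only SGI8-15 may be described in DT, whatever NR_IPI happens to be. */
static bool sketch_sgi_allowed_in_dt(unsigned int sgi)
{
	return sgi >= SKETCH_SGI_DT_FIRST && sgi <= SKETCH_SGI_DT_LAST;
}
```

The point of the fixed window is that a Device Tree using SGI 8 keeps
working even if the kernel later adds or removes an IPI.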

Another thing: why don't you use a PPI? If you use a GICv2, you're also using
ancient cores, and they have PPIs to spare. Or do you rely on being able to
inject interrupts from one core to another?

Thanks,

         M.