diff mbox series

[04/14] powerpc/vas: Setup IRQ mapping and register port for each window

Message ID 1574816731.13250.9.camel@hbabu-laptop (mailing list archive)
State Superseded
Headers show
Series powerpc/vas: Page fault handling for user space NX requests | expand

Commit Message

Haren Myneni Nov. 27, 2019, 1:05 a.m. UTC
Read interrupt and port values from the device tree, setup IRQ
mapping and register IRQ for each VAS instance. Set port value for
each NX window. When NX sees a fault on CRB, kernel gets an interrupt
and handles the fault.

IRQ setup and fault handling is needed only for user space send
windows. So for kernel requests, ignore if interrupts property is
not available.

Signed-off-by: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Signed-off-by: Haren Myneni <haren@us.ibm.com>
---
 arch/powerpc/platforms/powernv/vas-window.c | 14 ++++++
 arch/powerpc/platforms/powernv/vas.c        | 68 ++++++++++++++++++++++++++---
 arch/powerpc/platforms/powernv/vas.h        |  2 +
 3 files changed, 78 insertions(+), 6 deletions(-)

Comments

Christoph Hellwig Nov. 27, 2019, 8:33 a.m. UTC | #1
> +static irqreturn_t vas_irq_handler(int virq, void *data)
> +{
> +	struct vas_instance *vinst = data;
> +
> +	pr_devel("VAS %d: virq %d\n", vinst->vas_id, virq);
> +
> +	return IRQ_HANDLED;
> +}

An empty interrupt handler is rather pointless.  It later grows code,
but adding it without that is a bad idea.  Please squash the patches
into sesible chunks.
Oliver O'Halloran Dec. 18, 2019, 7:18 a.m. UTC | #2
On Wed, Nov 27, 2019 at 12:07 PM Haren Myneni <haren@linux.vnet.ibm.com> wrote:
>
> *snip*
>
> @@ -36,7 +62,18 @@ static int init_vas_instance(struct platform_device *pdev)
>                 return -ENODEV;
>         }
>
> -       if (pdev->num_resources != 4) {
> +       rc = of_property_read_u64(dn, "ibm,vas-port", &port);
> +       if (rc) {
> +               pr_err("No ibm,vas-port property for %s?\n", pdev->name);
> +               /* No interrupts property */
> +               nresources = 4;
> +       }
> +
> +       /*
> +        * interrupts property is available with 'ibm,vas-port' property.
> +        * 4 Resources and 1 IRQ if interrupts property is available.
> +        */
> +       if (pdev->num_resources != nresources) {
>                 pr_err("Unexpected DT configuration for [%s, %d]\n",
>                                 pdev->name, vasid);
>                 return -ENODEV;

Right, so adding the IRQ in firmware will break the VAS driver in
existing kernels since it changes the resource count. This is IMO a
bug in the VAS driver that you should fix, but it does mean we need to
think twice about having firmware assign an interrupt at boot.

I had a closer look at this series and I'm not convinced that any
firmware changes are actually required either. We already have OPAL
calls for allocating an hwirq for the kernel to use and for getting
the IRQ's XIVE trigger port (see pnv_ocxl_alloc_xive_irq() for an
example). Why not use those here too? Doing so would allow us to
assign interrupts to individual windows too which might be useful for
the windows used by the kernel.
Haren Myneni Dec. 18, 2019, 11:13 p.m. UTC | #3
On Wed, 2019-12-18 at 18:18 +1100, Oliver O'Halloran wrote:
> On Wed, Nov 27, 2019 at 12:07 PM Haren Myneni <haren@linux.vnet.ibm.com> wrote:
> >
> > *snip*
> >
> > @@ -36,7 +62,18 @@ static int init_vas_instance(struct platform_device *pdev)
> >                 return -ENODEV;
> >         }
> >
> > -       if (pdev->num_resources != 4) {
> > +       rc = of_property_read_u64(dn, "ibm,vas-port", &port);
> > +       if (rc) {
> > +               pr_err("No ibm,vas-port property for %s?\n", pdev->name);
> > +               /* No interrupts property */
> > +               nresources = 4;
> > +       }
> > +
> > +       /*
> > +        * interrupts property is available with 'ibm,vas-port' property.
> > +        * 4 Resources and 1 IRQ if interrupts property is available.
> > +        */
> > +       if (pdev->num_resources != nresources) {
> >                 pr_err("Unexpected DT configuration for [%s, %d]\n",
> >                                 pdev->name, vasid);
> >                 return -ENODEV;
> 
> Right, so adding the IRQ in firmware will break the VAS driver in
> existing kernels since it changes the resource count. This is IMO a
> bug in the VAS driver that you should fix, but it does mean we need to
> think twice about having firmware assign an interrupt at boot.

Correct, Hence added vas-user-space nvram switch in skiboot.  

> 
> I had a closer look at this series and I'm not convinced that any
> firmware changes are actually required either. We already have OPAL
> calls for allocating an hwirq for the kernel to use and for getting
> the IRQ's XIVE trigger port (see pnv_ocxl_alloc_xive_irq() for an
> example). Why not use those here too? Doing so would allow us to
> assign interrupts to individual windows too which might be useful for
> the windows used by the kernel.

Thanks for the pointer. like using pnv_ocxl_alloc_xive_irq(), we can
disregard FW change. BTW, VAS fault handling is needed only for user
space VAS windows. 

 int vas_alloc_xive_irq(u32 chipid, u32 *irq, u64 *trigger_addr)
{
        __be64 flags, trigger_page;
        u32 hwirq;
        s64 rc;

        hwirq = opal_xive_allocate_irq_raw(chipid);
        if (hwirq < 0)
                return -ENOENT;

        rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page,
NULL,
                                NULL);
        if (rc || !trigger_page) {
                xive_native_free_irq(hwirq);
                return -ENOENT;
        }

        *irq = hwirq;
        *trigger_addr = be64_to_cpu(trigger_page);
        return 0;
}

We can have common function for VAS and cxl except per chip IRQ
allocation is needed for each VAS instance. I will post patch-set with
this change.

Thanks
Haren
Haren Myneni Dec. 19, 2019, 10:31 p.m. UTC | #4
On Wed, 2019-12-18 at 15:13 -0800, Haren Myneni wrote:
> On Wed, 2019-12-18 at 18:18 +1100, Oliver O'Halloran wrote:
> > On Wed, Nov 27, 2019 at 12:07 PM Haren Myneni <haren@linux.vnet.ibm.com> wrote:
> > >
> > > *snip*
> > >
> > > @@ -36,7 +62,18 @@ static int init_vas_instance(struct platform_device *pdev)
> > >                 return -ENODEV;
> > >         }
> > >
> > > -       if (pdev->num_resources != 4) {
> > > +       rc = of_property_read_u64(dn, "ibm,vas-port", &port);
> > > +       if (rc) {
> > > +               pr_err("No ibm,vas-port property for %s?\n", pdev->name);
> > > +               /* No interrupts property */
> > > +               nresources = 4;
> > > +       }
> > > +
> > > +       /*
> > > +        * interrupts property is available with 'ibm,vas-port' property.
> > > +        * 4 Resources and 1 IRQ if interrupts property is available.
> > > +        */
> > > +       if (pdev->num_resources != nresources) {
> > >                 pr_err("Unexpected DT configuration for [%s, %d]\n",
> > >                                 pdev->name, vasid);
> > >                 return -ENODEV;
> > 
> > Right, so adding the IRQ in firmware will break the VAS driver in
> > existing kernels since it changes the resource count. This is IMO a
> > bug in the VAS driver that you should fix, but it does mean we need to
> > think twice about having firmware assign an interrupt at boot.
> 
> Correct, Hence added vas-user-space nvram switch in skiboot.  
> 
> > 
> > I had a closer look at this series and I'm not convinced that any
> > firmware changes are actually required either. We already have OPAL
> > calls for allocating an hwirq for the kernel to use and for getting
> > the IRQ's XIVE trigger port (see pnv_ocxl_alloc_xive_irq() for an
> > example). Why not use those here too? Doing so would allow us to
> > assign interrupts to individual windows too which might be useful for
> > the windows used by the kernel.
> 
> Thanks for the pointer. like using pnv_ocxl_alloc_xive_irq(), we can
> disregard FW change. BTW, VAS fault handling is needed only for user
> space VAS windows. 
> 
>  int vas_alloc_xive_irq(u32 chipid, u32 *irq, u64 *trigger_addr)
> {
>         __be64 flags, trigger_page;
>         u32 hwirq;
>         s64 rc;
> 
>         hwirq = opal_xive_allocate_irq_raw(chipid);
>         if (hwirq < 0)
>                 return -ENOENT;
> 
>         rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page,
> NULL,
>                                 NULL);
>         if (rc || !trigger_page) {
>                 xive_native_free_irq(hwirq);
>                 return -ENOENT;
>         }
> 
>         *irq = hwirq;
>         *trigger_addr = be64_to_cpu(trigger_page);
>         return 0;
> }
> 
> We can have common function for VAS and cxl except per chip IRQ
> allocation is needed for each VAS instance. I will post patch-set with
> this change.
> 

power9 will have only XIVE interrupt controller including on open-power
systems. Correct?

VAS need per chip IRQ allocation. The current interfaces (ex:
xive_native_alloc_irq(void)) allocates IRQ on any chip
(OPAL_XIVE_ANY_CHIP)
So to use these interfaces for VAS, any concerns with the following
patch:
Changes: passing chip_id to xive_native_alloc_irq() and define
xive_native_alloc_get_irq_info() in xive/native.c which can be used in
ocxl and VAS.

diff --git a/arch/powerpc/include/asm/xive.h b/arch/powerpc/include/asm/xive.h
index 24cdf97..b310062 100644
--- a/arch/powerpc/include/asm/xive.h
+++ b/arch/powerpc/include/asm/xive.h
@@ -108,7 +108,7 @@ struct xive_q {
 extern int xive_native_populate_irq_data(u32 hw_irq,
 					 struct xive_irq_data *data);
 extern void xive_cleanup_irq_data(struct xive_irq_data *xd);
-extern u32 xive_native_alloc_irq(void);
+extern u32 xive_native_alloc_irq(u32 chip_id);
 extern void xive_native_free_irq(u32 irq);
 extern int xive_native_configure_irq(u32 hw_irq, u32 target, u8 prio, u32 sw_irq);
 
@@ -137,7 +137,8 @@ extern int xive_native_set_queue_state(u32 vp_id, uint32_t prio, u32 qtoggle,
 				       u32 qindex);
 extern int xive_native_get_vp_state(u32 vp_id, u64 *out_state);
 extern bool xive_native_has_queue_state_support(void);
-
+extern int xive_native_alloc_get_irq_info(u32 chip_id, u32 *irq,
+					u64 *trigger_addr);
 #else
 
 static inline bool xive_enabled(void) { return false; }
diff --git a/arch/powerpc/kvm/book3s_xive.c b/arch/powerpc/kvm/book3s_xive.c
index 66858b7..59009e1 100644
--- a/arch/powerpc/kvm/book3s_xive.c
+++ b/arch/powerpc/kvm/book3s_xive.c
@@ -1299,7 +1299,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
 	vcpu->arch.xive_cam_word = cpu_to_be32(xc->vp_cam | TM_QW1W2_VO);
 
 	/* Allocate IPI */
-	xc->vp_ipi = xive_native_alloc_irq();
+	xc->vp_ipi = xive_native_alloc_irq(OPAL_XIVE_ANY_CHIP);
 	if (!xc->vp_ipi) {
 		pr_err("Failed to allocate xive irq for VCPU IPI\n");
 		r = -EIO;
@@ -1711,7 +1711,7 @@ static int xive_set_source(struct kvmppc_xive *xive, long irq, u64 addr)
 	 * one and get the corresponding data
 	 */
 	if (!state->ipi_number) {
-		state->ipi_number = xive_native_alloc_irq();
+		state->ipi_number = xive_native_alloc_irq(OPAL_XIVE_ANY_CHIP);
 		if (state->ipi_number == 0) {
 			pr_devel("Failed to allocate IPI !\n");
 			return -ENOMEM;
diff --git a/arch/powerpc/kvm/book3s_xive_native.c b/arch/powerpc/kvm/book3s_xive_native.c
index d83adb1..0adb228 100644
--- a/arch/powerpc/kvm/book3s_xive_native.c
+++ b/arch/powerpc/kvm/book3s_xive_native.c
@@ -359,7 +359,7 @@ static int kvmppc_xive_native_set_source(struct kvmppc_xive *xive, long irq,
 	 * one and get the corresponding data
 	 */
 	if (!state->ipi_number) {
-		state->ipi_number = xive_native_alloc_irq();
+		state->ipi_number = xive_native_alloc_irq(OPAL_XIVE_ANY_CHIP);
 		if (state->ipi_number == 0) {
 			pr_err("Failed to allocate IRQ !\n");
 			rc = -ENXIO;
diff --git a/arch/powerpc/platforms/powernv/ocxl.c b/arch/powerpc/platforms/powernv/ocxl.c
index 8c65aac..fb8f99a 100644
--- a/arch/powerpc/platforms/powernv/ocxl.c
+++ b/arch/powerpc/platforms/powernv/ocxl.c
@@ -487,24 +487,8 @@ int pnv_ocxl_spa_remove_pe_from_cache(void *platform_data, int pe_handle)
 
 int pnv_ocxl_alloc_xive_irq(u32 *irq, u64 *trigger_addr)
 {
-	__be64 flags, trigger_page;
-	s64 rc;
-	u32 hwirq;
-
-	hwirq = xive_native_alloc_irq();
-	if (!hwirq)
-		return -ENOENT;
-
-	rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page, NULL,
-				NULL);
-	if (rc || !trigger_page) {
-		xive_native_free_irq(hwirq);
-		return -ENOENT;
-	}
-	*irq = hwirq;
-	*trigger_addr = be64_to_cpu(trigger_page);
-	return 0;
-
+	return xive_native_alloc_get_irq_info(OPAL_XIVE_ANY_CHIP, irq,
+						trigger_addr);
 }
 EXPORT_SYMBOL_GPL(pnv_ocxl_alloc_xive_irq);
 
diff --git a/arch/powerpc/sysdev/xive/native.c b/arch/powerpc/sysdev/xive/native.c
index 0ff6b73..c450838 100644
--- a/arch/powerpc/sysdev/xive/native.c
+++ b/arch/powerpc/sysdev/xive/native.c
@@ -279,12 +279,12 @@ static int xive_native_get_ipi(unsigned int cpu, struct xive_cpu *xc)
 }
 #endif /* CONFIG_SMP */
 
-u32 xive_native_alloc_irq(void)
+u32 xive_native_alloc_irq(u32 chip_id)
 {
 	s64 rc;
 
 	for (;;) {
-		rc = opal_xive_allocate_irq(OPAL_XIVE_ANY_CHIP);
+		rc = opal_xive_allocate_irq(chip_id);
 		if (rc != OPAL_BUSY)
 			break;
 		msleep(OPAL_BUSY_DELAY_MS);
@@ -295,6 +295,29 @@ u32 xive_native_alloc_irq(void)
 }
 EXPORT_SYMBOL_GPL(xive_native_alloc_irq);
 
+int xive_native_alloc_get_irq_info(u32 chip_id, u32 *irq, u64 *trigger_addr)
+{
+	__be64 flags, trigger_page;
+	u32 hwirq;
+	s64 rc;
+
+	hwirq = xive_native_alloc_irq(chip_id);
+	if (!hwirq)
+		return -ENOENT;
+
+	rc = opal_xive_get_irq_info(hwirq, &flags, NULL, &trigger_page, NULL,
+				NULL);
+	if (rc || !trigger_page) {
+		xive_native_free_irq(hwirq);
+		return -ENOENT;
+	}
+	*irq = hwirq;
+	*trigger_addr = be64_to_cpu(trigger_page);
+
+	return 0;
+}
+EXPORT_SYMBOL(xive_native_alloc_get_irq_info);
+
 void xive_native_free_irq(u32 irq)
 {
 	for (;;) {
diff mbox series

Patch

diff --git a/arch/powerpc/platforms/powernv/vas-window.c b/arch/powerpc/platforms/powernv/vas-window.c
index ea5ca02..ad6be91 100644
--- a/arch/powerpc/platforms/powernv/vas-window.c
+++ b/arch/powerpc/platforms/powernv/vas-window.c
@@ -758,6 +758,8 @@  static void init_winctx_for_rxwin(struct vas_window *rxwin,
 
 	winctx->min_scope = VAS_SCOPE_LOCAL;
 	winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+	if (rxwin->vinst->virq)
+		winctx->irq_port = rxwin->vinst->irq_port;
 }
 
 static bool rx_win_args_valid(enum vas_cop_type cop,
@@ -959,6 +961,8 @@  static void init_winctx_for_txwin(struct vas_window *txwin,
 	winctx->tc_mode = txattr->tc_mode;
 	winctx->min_scope = VAS_SCOPE_LOCAL;
 	winctx->max_scope = VAS_SCOPE_VECTORED_GROUP;
+	if (txwin->vinst->virq)
+		winctx->irq_port = txwin->vinst->irq_port;
 
 	winctx->pswid = 0;
 }
@@ -1050,6 +1054,16 @@  struct vas_window *vas_tx_win_open(int vasid, enum vas_cop_type cop,
 		}
 	} else {
 		/*
+		 * Interrupt hanlder setup failed. Means NX can not generate
+		 * fault for page fault. So not opening for user space tx
+		 * window.
+		 */
+		if (!vinst->virq) {
+			rc = -ENODEV;
+			goto free_window;
+		}
+
+		/*
 		 * A user mapping must ensure that context switch issues
 		 * CP_ABORT for this thread.
 		 */
diff --git a/arch/powerpc/platforms/powernv/vas.c b/arch/powerpc/platforms/powernv/vas.c
index ed9cc6d..71bddaa 100644
--- a/arch/powerpc/platforms/powernv/vas.c
+++ b/arch/powerpc/platforms/powernv/vas.c
@@ -14,6 +14,8 @@ 
 #include <linux/of_platform.h>
 #include <linux/of_address.h>
 #include <linux/of.h>
+#include <linux/irqdomain.h>
+#include <linux/interrupt.h>
 #include <asm/prom.h>
 
 #include "vas.h"
@@ -23,9 +25,33 @@ 
 
 static DEFINE_PER_CPU(int, cpu_vas_id);
 
+static irqreturn_t vas_irq_handler(int virq, void *data)
+{
+	struct vas_instance *vinst = data;
+
+	pr_devel("VAS %d: virq %d\n", vinst->vas_id, virq);
+
+	return IRQ_HANDLED;
+}
+
+static void vas_irq_fault_handle_setup(struct vas_instance *vinst)
+{
+	int rc;
+	char devname[64];
+
+	snprintf(devname, sizeof(devname), "vas-inst-%d", vinst->vas_id);
+	rc = request_irq(vinst->virq, vas_irq_handler, 0, devname, vinst);
+	if (rc) {
+		pr_err("VAS[%d]: Request IRQ(%d) failed with %d\n",
+				vinst->vas_id, vinst->virq, rc);
+		vinst->virq = 0;
+	}
+}
+
 static int init_vas_instance(struct platform_device *pdev)
 {
-	int rc, cpu, vasid;
+	int rc, cpu, vasid, nresources = 5;
+	uint64_t port;
 	struct resource *res;
 	struct vas_instance *vinst;
 	struct device_node *dn = pdev->dev.of_node;
@@ -36,7 +62,18 @@  static int init_vas_instance(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
-	if (pdev->num_resources != 4) {
+	rc = of_property_read_u64(dn, "ibm,vas-port", &port);
+	if (rc) {
+		pr_err("No ibm,vas-port property for %s?\n", pdev->name);
+		/* No interrupts property */
+		nresources = 4;
+	}
+
+	/*
+	 * interrupts property is available with 'ibm,vas-port' property.
+	 * 4 Resources and 1 IRQ if interrupts property is available.
+	 */
+	if (pdev->num_resources != nresources) {
 		pr_err("Unexpected DT configuration for [%s, %d]\n",
 				pdev->name, vasid);
 		return -ENODEV;
@@ -51,6 +88,7 @@  static int init_vas_instance(struct platform_device *pdev)
 	mutex_init(&vinst->mutex);
 	vinst->vas_id = vasid;
 	vinst->pdev = pdev;
+	vinst->irq_port = port;
 
 	res = &pdev->resource[0];
 	vinst->hvwc_bar_start = res->start;
@@ -66,12 +104,23 @@  static int init_vas_instance(struct platform_device *pdev)
 		pr_err("Bad 'paste_win_id_shift' in DT, %llx\n", res->end);
 		goto free_vinst;
 	}
-
 	vinst->paste_win_id_shift = 63 - res->end;
 
-	pr_devel("Initialized instance [%s, %d], paste_base 0x%llx, "
-			"paste_win_id_shift 0x%llx\n", pdev->name, vasid,
-			vinst->paste_base_addr, vinst->paste_win_id_shift);
+	/* interrupts property */
+	if (pdev->num_resources == 5) {
+		res = &pdev->resource[4];
+		vinst->virq = res->start;
+		if (vinst->virq <= 0) {
+			pr_err("IRQ resource is not available for [%s, %d]\n",
+				pdev->name, vasid);
+			vinst->virq = 0;
+		}
+	}
+
+	pr_devel("Initialized instance [%s, %d] paste_base 0x%llx paste_win_id_shift 0x%llx IRQ %d Port 0x%llx\n",
+			pdev->name, vasid, vinst->paste_base_addr,
+			vinst->paste_win_id_shift, vinst->virq,
+			vinst->irq_port);
 
 	for_each_possible_cpu(cpu) {
 		if (cpu_to_chip_id(cpu) == of_get_ibm_chip_id(dn))
@@ -82,6 +131,13 @@  static int init_vas_instance(struct platform_device *pdev)
 	list_add(&vinst->node, &vas_instances);
 	mutex_unlock(&vas_mutex);
 
+	/*
+	 * IRQ and fault handling setup is needed only for user space
+	 * send windows.
+	 */
+	if (vinst->virq)
+		vas_irq_fault_handle_setup(vinst);
+
 	vas_instance_init_dbgdir(vinst);
 
 	dev_set_drvdata(&pdev->dev, vinst);
diff --git a/arch/powerpc/platforms/powernv/vas.h b/arch/powerpc/platforms/powernv/vas.h
index 9cc5251..bf7d3db 100644
--- a/arch/powerpc/platforms/powernv/vas.h
+++ b/arch/powerpc/platforms/powernv/vas.h
@@ -313,6 +313,8 @@  struct vas_instance {
 	u64 paste_base_addr;
 	u64 paste_win_id_shift;
 
+	u64 irq_port;
+	int virq;
 	struct mutex mutex;
 	struct vas_window *rxwin[VAS_COP_TYPE_MAX];
 	struct vas_window *windows[VAS_WINDOWS_PER_CHIP];