diff mbox series

[kvm-unit-tests,v7,10/13] arm/arm64: ITS: INT functional tests

Message ID 20200320092428.20880-11-eric.auger@redhat.com
State New
Headers show
Series arm/arm64: Add ITS tests | expand

Commit Message

Eric Auger March 20, 2020, 9:24 a.m. UTC
Triggers LPIs through the INT command.

the test checks the LPI hits the right CPU and triggers
the right LPI intid, ie. the translation is correct.

Updates to the config table also are tested, along with inv
and invall commands.

Signed-off-by: Eric Auger <eric.auger@redhat.com>

---
v5 -> v6:
- removed collection-unmap test
- moved the collection invalidation after the MAPCs

v4 -> v5:
- move the test stub from the header to arm/gic.c

v3 -> v4:
- assert in lpi_handler if the interrupt is not an LPI
- remove check_lpi_stats from its_prerequisites()

v2 -> v3:
- add comments
- keep the report_skip in case there aren't 4 vcpus to be able to
  run other tests in the its category.
- fix the prefix pop
- move its_event and its_stats to arm/gic.c
---
 arm/gic.c         | 215 +++++++++++++++++++++++++++++++++++++++++++---
 arm/unittests.cfg |   7 ++
 2 files changed, 211 insertions(+), 11 deletions(-)

Comments

Zenghui Yu March 30, 2020, 10:43 a.m. UTC | #1
Hi Eric,

On 2020/3/20 17:24, Eric Auger wrote:
> Triggers LPIs through the INT command.
> 
> the test checks the LPI hits the right CPU and triggers
> the right LPI intid, ie. the translation is correct.
> 
> Updates to the config table also are tested, along with inv
> and invall commands.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>

[...]

So I've tested this series and found that the "INT" test will sometimes
fail.

"not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en 
PE #3 after migration
not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on 
PE #2 after migration"

 From logs:
"INFO: gicv3: its-migration: Migration complete
INT dev_id=2 event_id=20
INFO: gicv3: its-migration: No LPI received whereas (cpuid=3, 
intid=8195) was expected
FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3 
after migration
INT dev_id=7 event_id=255
INFO: gicv3: its-migration: No LPI received whereas (cpuid=2, 
intid=8196) was expected
FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2 
after migration"

> +static void check_lpi_stats(const char *msg)
> +{
> +	bool pass = false;
> +
> +	mdelay(100);

After changing this to 'mdelay(1000)', the above error doesn't show up
anymore. But it sounds strange that 100ms is not enough to deliver a
single LPI. I haven't dig it further but will get back here later.

> +	smp_rmb(); /* pairs with wmb in lpi_handler */
> +	if (lpi_stats.observed.cpu_id != lpi_stats.expected.cpu_id ||
> +	    lpi_stats.observed.lpi_id != lpi_stats.expected.lpi_id) {
> +		if (lpi_stats.observed.cpu_id == -1 &&
> +		    lpi_stats.observed.lpi_id == -1) {
> +			report_info("No LPI received whereas (cpuid=%d, intid=%d) "
> +				    "was expected", lpi_stats.expected.cpu_id,
> +				    lpi_stats.expected.lpi_id);
> +		} else {
> +			report_info("Unexpected LPI (cpuid=%d, intid=%d)",
> +				    lpi_stats.observed.cpu_id,
> +				    lpi_stats.observed.lpi_id);
> +		}
> +	} else {
> +		pass = true;
> +	}
> +	report(pass, "%s", msg);
> +}

This patch itself looks good to me,
Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>


Thanks
Eric Auger April 2, 2020, 8:50 a.m. UTC | #2
Hi Zenghui,

On 3/30/20 12:43 PM, Zenghui Yu wrote:
> Hi Eric,
> 
> On 2020/3/20 17:24, Eric Auger wrote:
>> Triggers LPIs through the INT command.
>>
>> the test checks the LPI hits the right CPU and triggers
>> the right LPI intid, ie. the translation is correct.
>>
>> Updates to the config table also are tested, along with inv
>> and invall commands.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> 
> [...]
> 
> So I've tested this series and found that the "INT" test will sometimes
> fail.
> 
> "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en
> PE #3 after migration
> not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on
> PE #2 after migration"
> 
> From logs:
> "INFO: gicv3: its-migration: Migration complete
> INT dev_id=2 event_id=20
> INFO: gicv3: its-migration: No LPI received whereas (cpuid=3,
> intid=8195) was expected
> FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3
> after migration
> INT dev_id=7 event_id=255
> INFO: gicv3: its-migration: No LPI received whereas (cpuid=2,
> intid=8196) was expected
> FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2
> after migration"
> 
>> +static void check_lpi_stats(const char *msg)
>> +{
>> +    bool pass = false;
>> +
>> +    mdelay(100);
> 
> After changing this to 'mdelay(1000)', the above error doesn't show up
> anymore. But it sounds strange that 100ms is not enough to deliver a
> single LPI. I haven't dig it further but will get back here later.

Did you find some time to investigate this issue. Changing 100 to 1000
has a huge impact on the overall test duration and I don't think it is
sensible. Could you see what is your minimal value that pass the tests?

Thanks

Eric
> 
>> +    smp_rmb(); /* pairs with wmb in lpi_handler */
>> +    if (lpi_stats.observed.cpu_id != lpi_stats.expected.cpu_id ||
>> +        lpi_stats.observed.lpi_id != lpi_stats.expected.lpi_id) {
>> +        if (lpi_stats.observed.cpu_id == -1 &&
>> +            lpi_stats.observed.lpi_id == -1) {
>> +            report_info("No LPI received whereas (cpuid=%d, intid=%d) "
>> +                    "was expected", lpi_stats.expected.cpu_id,
>> +                    lpi_stats.expected.lpi_id);
>> +        } else {
>> +            report_info("Unexpected LPI (cpuid=%d, intid=%d)",
>> +                    lpi_stats.observed.cpu_id,
>> +                    lpi_stats.observed.lpi_id);
>> +        }
>> +    } else {
>> +        pass = true;
>> +    }
>> +    report(pass, "%s", msg);
>> +}
> 
> This patch itself looks good to me,
> Reviewed-by: Zenghui Yu <yuzenghui@huawei.com>
> 
> 
> Thanks
>
Zenghui Yu April 2, 2020, 12:40 p.m. UTC | #3
Hi Eric,

On 2020/4/2 16:50, Auger Eric wrote:
> Hi Zenghui,
> 
> On 3/30/20 12:43 PM, Zenghui Yu wrote:
>> Hi Eric,
>>
>> On 2020/3/20 17:24, Eric Auger wrote:
>>> Triggers LPIs through the INT command.
>>>
>>> the test checks the LPI hits the right CPU and triggers
>>> the right LPI intid, ie. the translation is correct.
>>>
>>> Updates to the config table also are tested, along with inv
>>> and invall commands.
>>>
>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>
>> [...]
>>
>> So I've tested this series and found that the "INT" test will sometimes
>> fail.
>>
>> "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en
>> PE #3 after migration
>> not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on
>> PE #2 after migration"
>>
>>  From logs:
>> "INFO: gicv3: its-migration: Migration complete
>> INT dev_id=2 event_id=20
>> INFO: gicv3: its-migration: No LPI received whereas (cpuid=3,
>> intid=8195) was expected
>> FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3
>> after migration
>> INT dev_id=7 event_id=255
>> INFO: gicv3: its-migration: No LPI received whereas (cpuid=2,
>> intid=8196) was expected
>> FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2
>> after migration"
>>
>>> +static void check_lpi_stats(const char *msg)
>>> +{
>>> +    bool pass = false;
>>> +
>>> +    mdelay(100);
>>
>> After changing this to 'mdelay(1000)', the above error doesn't show up
>> anymore. But it sounds strange that 100ms is not enough to deliver a
>> single LPI. I haven't dig it further but will get back here later.
> 
> Did you find some time to investigate this issue. Changing 100 to 1000
> has a huge impact on the overall test duration and I don't think it is
> sensible. Could you see what is your minimal value that pass the tests?

I can reproduce this issue with a very *low* probability so I failed
to investigate it :-(.  (It might because the LPI was delivered to a
busy vcpu...)

You can leave it as it is until someone else complain about it again.
Or take the similar approach as check_acked() - wait up to 5s for the
interrupt to be delivered, and bail out as soon as we see it.


Thanks,
Zenghui
Andrew Jones April 2, 2020, 2:41 p.m. UTC | #4
On Thu, Apr 02, 2020 at 08:40:42PM +0800, Zenghui Yu wrote:
> Hi Eric,
> 
> On 2020/4/2 16:50, Auger Eric wrote:
> > Hi Zenghui,
> > 
> > On 3/30/20 12:43 PM, Zenghui Yu wrote:
> > > Hi Eric,
> > > 
> > > On 2020/3/20 17:24, Eric Auger wrote:
> > > > Triggers LPIs through the INT command.
> > > > 
> > > > the test checks the LPI hits the right CPU and triggers
> > > > the right LPI intid, ie. the translation is correct.
> > > > 
> > > > Updates to the config table also are tested, along with inv
> > > > and invall commands.
> > > > 
> > > > Signed-off-by: Eric Auger <eric.auger@redhat.com>
> > > 
> > > [...]
> > > 
> > > So I've tested this series and found that the "INT" test will sometimes
> > > fail.
> > > 
> > > "not ok 12 - gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en
> > > PE #3 after migration
> > > not ok 13 - gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on
> > > PE #2 after migration"
> > > 
> > >  From logs:
> > > "INFO: gicv3: its-migration: Migration complete
> > > INT dev_id=2 event_id=20
> > > INFO: gicv3: its-migration: No LPI received whereas (cpuid=3,
> > > intid=8195) was expected
> > > FAIL: gicv3: its-migration: dev2/eventid=20 triggers LPI 8195 en PE #3
> > > after migration
> > > INT dev_id=7 event_id=255
> > > INFO: gicv3: its-migration: No LPI received whereas (cpuid=2,
> > > intid=8196) was expected
> > > FAIL: gicv3: its-migration: dev7/eventid=255 triggers LPI 8196 on PE #2
> > > after migration"
> > > 
> > > > +static void check_lpi_stats(const char *msg)
> > > > +{
> > > > +    bool pass = false;
> > > > +
> > > > +    mdelay(100);
> > > 
> > > After changing this to 'mdelay(1000)', the above error doesn't show up
> > > anymore. But it sounds strange that 100ms is not enough to deliver a
> > > single LPI. I haven't dig it further but will get back here later.
> > 
> > Did you find some time to investigate this issue. Changing 100 to 1000
> > has a huge impact on the overall test duration and I don't think it is
> > sensible. Could you see what is your minimal value that pass the tests?
> 
> I can reproduce this issue with a very *low* probability so I failed
> to investigate it :-(.  (It might because the LPI was delivered to a
> busy vcpu...)
> 
> You can leave it as it is until someone else complain about it again.
> Or take the similar approach as check_acked() - wait up to 5s for the
> interrupt to be delivered, and bail out as soon as we see it.

I think the check_acked approach would be the best approach.

Thanks,
drew

> 
> 
> Thanks,
> Zenghui
> 
>
diff mbox series

Patch

diff --git a/arm/gic.c b/arm/gic.c
index 649ed81..5f1e595 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -159,6 +159,85 @@  static void ipi_handler(struct pt_regs *regs __unused)
 	}
 }
 
+static void setup_irq(irq_handler_fn handler)
+{
+	gic_enable_defaults();
+#ifdef __arm__
+	install_exception_handler(EXCPTN_IRQ, handler);
+#else
+	install_irq_handler(EL1H_IRQ, handler);
+#endif
+	local_irq_enable();
+}
+
+#if defined(__aarch64__)
+struct its_event {
+	int cpu_id;
+	int lpi_id;
+};
+
+struct its_stats {
+	struct its_event expected;
+	struct its_event observed;
+};
+
+static struct its_stats lpi_stats;
+
+static void lpi_handler(struct pt_regs *regs __unused)
+{
+	u32 irqstat = gic_read_iar();
+	int irqnr = gic_iar_irqnr(irqstat);
+
+	gic_write_eoir(irqstat);
+	assert(irqnr >= 8192);
+	smp_rmb(); /* pairs with wmb in lpi_stats_expect */
+	lpi_stats.observed.cpu_id = smp_processor_id();
+	lpi_stats.observed.lpi_id = irqnr;
+	smp_wmb(); /* pairs with rmb in check_lpi_stats */
+}
+
+static void lpi_stats_expect(int exp_cpu_id, int exp_lpi_id)
+{
+	lpi_stats.expected.cpu_id = exp_cpu_id;
+	lpi_stats.expected.lpi_id = exp_lpi_id;
+	lpi_stats.observed.cpu_id = -1;
+	lpi_stats.observed.lpi_id = -1;
+	smp_wmb(); /* pairs with rmb in handler */
+}
+
+static void check_lpi_stats(const char *msg)
+{
+	bool pass = false;
+
+	mdelay(100);
+	smp_rmb(); /* pairs with wmb in lpi_handler */
+	if (lpi_stats.observed.cpu_id != lpi_stats.expected.cpu_id ||
+	    lpi_stats.observed.lpi_id != lpi_stats.expected.lpi_id) {
+		if (lpi_stats.observed.cpu_id == -1 &&
+		    lpi_stats.observed.lpi_id == -1) {
+			report_info("No LPI received whereas (cpuid=%d, intid=%d) "
+				    "was expected", lpi_stats.expected.cpu_id,
+				    lpi_stats.expected.lpi_id);
+		} else {
+			report_info("Unexpected LPI (cpuid=%d, intid=%d)",
+				    lpi_stats.observed.cpu_id,
+				    lpi_stats.observed.lpi_id);
+		}
+	} else {
+		pass = true;
+	}
+	report(pass, "%s", msg);
+}
+
+static void secondary_lpi_test(void)
+{
+	setup_irq(lpi_handler);
+	cpumask_set_cpu(smp_processor_id(), &ready);
+	while (1)
+		wfi();
+}
+#endif
+
 static void gicv2_ipi_send_self(void)
 {
 	writel(2 << 24 | IPI_IRQ, gicv2_dist_base() + GICD_SGIR);
@@ -216,17 +295,6 @@  static void ipi_test_smp(void)
 	report_prefix_pop();
 }
 
-static void setup_irq(irq_handler_fn handler)
-{
-	gic_enable_defaults();
-#ifdef __arm__
-	install_exception_handler(EXCPTN_IRQ, handler);
-#else
-	install_irq_handler(EL1H_IRQ, handler);
-#endif
-	local_irq_enable();
-}
-
 static void ipi_send(void)
 {
 	setup_irq(ipi_handler);
@@ -521,6 +589,7 @@  static void gic_test_mmio(void)
 #if defined(__arm__)
 
 static void test_its_introspection(void) {}
+static void test_its_trigger(void) {}
 
 #else /* __aarch64__ */
 
@@ -559,6 +628,126 @@  static void test_its_introspection(void)
 	report_info("collection table entry_size = 0x%x", coll_baser->esz);
 }
 
+static int its_prerequisites(int nb_cpus)
+{
+	int cpu;
+
+	if (!gicv3_its_base()) {
+		report_skip("No ITS, skip ...");
+		return -1;
+	}
+
+	if (nr_cpus < nb_cpus) {
+		report_skip("Test requires at least %d vcpus", nb_cpus);
+		return -1;
+	}
+
+	stats_reset();
+
+	setup_irq(lpi_handler);
+
+	for_each_present_cpu(cpu) {
+		if (cpu == 0)
+			continue;
+		smp_boot_secondary(cpu, secondary_lpi_test);
+	}
+	wait_on_ready();
+
+	its_enable_defaults();
+
+	return 0;
+}
+
+static void test_its_trigger(void)
+{
+	struct its_collection *col3, *col2;
+	struct its_device *dev2, *dev7;
+
+	if (its_prerequisites(4))
+		return;
+
+	dev2 = its_create_device(2 /* dev id */, 8 /* nb_ites */);
+	dev7 = its_create_device(7 /* dev id */, 8 /* nb_ites */);
+
+	col3 = its_create_collection(3 /* col id */, 3/* target PE */);
+	col2 = its_create_collection(2 /* col id */, 2/* target PE */);
+
+	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
+	gicv3_lpi_set_config(8196, LPI_PROP_DEFAULT);
+
+	report_prefix_push("int");
+	/*
+	 * dev=2, eventid=20  -> lpi= 8195, col=3
+	 * dev=7, eventid=255 -> lpi= 8196, col=2
+	 * Trigger dev2, eventid=20 and dev7, eventid=255
+	 * Check both LPIs hit
+	 */
+
+	its_send_mapd(dev2, true);
+	its_send_mapd(dev7, true);
+
+	its_send_mapc(col3, true);
+	its_send_mapc(col2, true);
+
+	its_send_invall(col2);
+	its_send_invall(col3);
+
+	its_send_mapti(dev2, 8195 /* lpi id */, 20 /* event id */, col3);
+	its_send_mapti(dev7, 8196 /* lpi id */, 255 /* event id */, col2);
+
+	lpi_stats_expect(3, 8195);
+	its_send_int(dev2, 20);
+	check_lpi_stats("dev=2, eventid=20  -> lpi= 8195, col=3");
+
+	lpi_stats_expect(2, 8196);
+	its_send_int(dev7, 255);
+	check_lpi_stats("dev=7, eventid=255 -> lpi= 8196, col=2");
+
+	report_prefix_pop();
+
+	report_prefix_push("inv/invall");
+
+	/*
+	 * disable 8195, check dev2/eventid=20 does not trigger the
+	 * corresponding LPI
+	 */
+	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT & ~LPI_PROP_ENABLED);
+	its_send_inv(dev2, 20);
+
+	lpi_stats_expect(-1, -1);
+	its_send_int(dev2, 20);
+	check_lpi_stats("dev2/eventid=20 does not trigger any LPI");
+
+	/*
+	 * re-enable the LPI but willingly do not call invall
+	 * so the change in config is not taken into account.
+	 * The LPI should not hit
+	 */
+	gicv3_lpi_set_config(8195, LPI_PROP_DEFAULT);
+	lpi_stats_expect(-1, -1);
+	its_send_int(dev2, 20);
+	check_lpi_stats("dev2/eventid=20 still does not trigger any LPI");
+
+	/* Now call the invall and check the LPI hits */
+	its_send_invall(col3);
+	lpi_stats_expect(3, 8195);
+	its_send_int(dev2, 20);
+	check_lpi_stats("dev2/eventid=20 now triggers an LPI");
+
+	report_prefix_pop();
+
+	report_prefix_push("mapd valid=false");
+	/*
+	 * Unmap device 2 and check the eventid 20 formerly
+	 * attached to it does not hit anymore
+	 */
+
+	its_send_mapd(dev2, false);
+	lpi_stats_expect(-1, -1);
+	its_send_int(dev2, 20);
+	check_lpi_stats("no LPI after device unmap");
+	report_prefix_pop();
+}
 #endif
 
 int main(int argc, char **argv)
@@ -592,6 +781,10 @@  int main(int argc, char **argv)
 		report_prefix_push(argv[1]);
 		gic_test_mmio();
 		report_prefix_pop();
+	} else if (!strcmp(argv[1], "its-trigger")) {
+		report_prefix_push(argv[1]);
+		test_its_trigger();
+		report_prefix_pop();
 	} else if (strcmp(argv[1], "its-introspection") == 0) {
 		report_prefix_push(argv[1]);
 		test_its_introspection();
diff --git a/arm/unittests.cfg b/arm/unittests.cfg
index 23d378e..b9a7a2c 100644
--- a/arm/unittests.cfg
+++ b/arm/unittests.cfg
@@ -129,6 +129,13 @@  extra_params = -machine gic-version=3 -append 'its-introspection'
 groups = its
 arch = arm64
 
+[its-trigger]
+file = gic.flat
+smp = $MAX_SMP
+extra_params = -machine gic-version=3 -append 'its-trigger'
+groups = its
+arch = arm64
+
 # Test PSCI emulation
 [psci]
 file = psci.flat