[4/6] xive: Use io_complete() when changing the ESB mask bits

Message ID 20171206173928.25628-4-benh@kernel.crashing.org
State Superseded
Headers show
Series
  • [1/6] xive: Mark a freed IRQ's IVE as valid and masked
Related show

Commit Message

Benjamin Herrenschmidt Dec. 6, 2017, 5:39 p.m.
Thus ensuring that the load has completed before anything else
is done (ie, the interrupt is actually masked).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 hw/xive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Oliver Dec. 7, 2017, 1:47 a.m. | #1
On Thu, Dec 7, 2017 at 4:39 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> Thus ensuring that the load has completed before anything else
> is done (ie, the interrupt is actually masked).
>
> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
>  hw/xive.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/xive.c b/hw/xive.c
> index 76939b79..104e1e85 100644
> --- a/hw/xive.c
> +++ b/hw/xive.c
> @@ -2451,7 +2451,7 @@ static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
>         if (s->flags & XIVE_SRC_SHIFT_BUG)
>                 offset <<= 4;
>
> -       in_be64(mmio_base + offset);
> +       in_complete(in_be64(mmio_base + offset));

Can you elaborate a bit of what this is doing an why it needs to be done?

>  }
>
>  static int64_t xive_sync(struct xive *x)
> --
> 2.14.3
>
> _______________________________________________
> Skiboot mailing list
> Skiboot@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/skiboot

Tested-by: Oliver O'Halloran <oohall@gmail.com>
Benjamin Herrenschmidt Dec. 7, 2017, 5:48 p.m. | #2
On Thu, 2017-12-07 at 12:47 +1100, Oliver wrote:
> On Thu, Dec 7, 2017 at 4:39 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > Thus ensuring that the load has completed before anything else
> > is done (ie, the interrupt is actually masked).
> > 
> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > ---
> >  hw/xive.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/hw/xive.c b/hw/xive.c
> > index 76939b79..104e1e85 100644
> > --- a/hw/xive.c
> > +++ b/hw/xive.c
> > @@ -2451,7 +2451,7 @@ static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
> >         if (s->flags & XIVE_SRC_SHIFT_BUG)
> >                 offset <<= 4;
> > 
> > -       in_be64(mmio_base + offset);
> > +       in_complete(in_be64(mmio_base + offset));
> 
> Can you elaborate a bit of what this is doing an why it needs to be done?

Well, the cset says it :-) Basically the twi/isync construct is the
same we use in the kernel in every in_* and it's supposed to make
the processor "consume" the load, and not execute past the isync until
the load value has returned.

Ben.

> 
> >  }
> > 
> >  static int64_t xive_sync(struct xive *x)
> > --
> > 2.14.3
> > 
> > _______________________________________________
> > Skiboot mailing list
> > Skiboot@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/skiboot
> 
> Tested-by: Oliver O'Halloran <oohall@gmail.com>
Oliver Dec. 8, 2017, 12:01 a.m. | #3
On Fri, Dec 8, 2017 at 4:48 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Thu, 2017-12-07 at 12:47 +1100, Oliver wrote:
>> On Thu, Dec 7, 2017 at 4:39 AM, Benjamin Herrenschmidt
>> <benh@kernel.crashing.org> wrote:
>> > Thus ensuring that the load has completed before anything else
>> > is done (ie, the interrupt is actually masked).
>> >
>> > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> > ---
>> >  hw/xive.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> >
>> > diff --git a/hw/xive.c b/hw/xive.c
>> > index 76939b79..104e1e85 100644
>> > --- a/hw/xive.c
>> > +++ b/hw/xive.c
>> > @@ -2451,7 +2451,7 @@ static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
>> >         if (s->flags & XIVE_SRC_SHIFT_BUG)
>> >                 offset <<= 4;
>> >
>> > -       in_be64(mmio_base + offset);
>> > +       in_complete(in_be64(mmio_base + offset));
>>
>> Can you elaborate a bit of what this is doing an why it needs to be done?
>
> Well, the cset says it :-) Basically the twi/isync construct is the
> same we use in the kernel in every in_* and it's supposed to make
> the processor "consume" the load, and not execute past the isync until
> the load value has returned.

Sure, but why is this necessary/preferable to, say, an eieio barrier?

>
> Ben.
>
>>
>> >  }
>> >
>> >  static int64_t xive_sync(struct xive *x)
>> > --
>> > 2.14.3
>> >
>> > _______________________________________________
>> > Skiboot mailing list
>> > Skiboot@lists.ozlabs.org
>> > https://lists.ozlabs.org/listinfo/skiboot
>>
>> Tested-by: Oliver O'Halloran <oohall@gmail.com>
Benjamin Herrenschmidt Dec. 8, 2017, 3:01 a.m. | #4
On Fri, 2017-12-08 at 11:01 +1100, Oliver wrote:
> On Fri, Dec 8, 2017 at 4:48 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Thu, 2017-12-07 at 12:47 +1100, Oliver wrote:
> > > On Thu, Dec 7, 2017 at 4:39 AM, Benjamin Herrenschmidt
> > > <benh@kernel.crashing.org> wrote:
> > > > Thus ensuring that the load has completed before anything else
> > > > is done (ie, the interrupt is actually masked).
> > > > 
> > > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > > ---
> > > >  hw/xive.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/xive.c b/hw/xive.c
> > > > index 76939b79..104e1e85 100644
> > > > --- a/hw/xive.c
> > > > +++ b/hw/xive.c
> > > > @@ -2451,7 +2451,7 @@ static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
> > > >         if (s->flags & XIVE_SRC_SHIFT_BUG)
> > > >                 offset <<= 4;
> > > > 
> > > > -       in_be64(mmio_base + offset);
> > > > +       in_complete(in_be64(mmio_base + offset));
> > > 
> > > Can you elaborate a bit of what this is doing an why it needs to be done?
> > 
> > Well, the cset says it :-) Basically the twi/isync construct is the
> > same we use in the kernel in every in_* and it's supposed to make
> > the processor "consume" the load, and not execute past the isync until
> > the load value has returned.
> 
> Sure, but why is this necessary/preferable to, say, an eieio barrier?

An eieio is just a barrier. It will order previous loads or stores with
later ones. It provides no guarantee about the load having actually
completed and returned data to the CPU. The loads (or stores) will be
*issued* in order by the barrier, but the load can be stuck in the
device for a while and the CPU can continue execute past it until it
needs the resulting data (which can be especially long when we don't
care about said result at all as above).

This can be a problem for example in timed IO constructs. For example
imagine a resent sequence for a device needs the bit set for 1us.
something like:

	- store 1 to reset register
	- load from reset register (to "push" the store)
	- delay 1us
	- store 0 to reset register

Now the problem is that nothing in the delay loop will consume the
loaded value so the load might actually only complete late into the
loop and thus the delay isn't respected.

The twi "trap never" followed by isync forces the core to consume the
load data and prevent execution beyond the sequence. The twi itself
consumes the data and isync guarantees that any previous potentially
trapping instruction has completed. (Hopefully the core doesn't
optimize that twi form, but that sequence is old AIX lore so I assume
it's still ok).

In the case of the mask bits, it's just a quick optimisation, ie, it
forces the core to "push out" the ESB load and wait for its completion
before continuing.

It's also handy when we shut down the xive as it guarantees the
completion of the side effect load before we start tearing things
down. (Mind you we do a xive sync so it's probably not strictly
necessary).

Ben.

> > 
> > Ben.
> > 
> > > 
> > > >  }
> > > > 
> > > >  static int64_t xive_sync(struct xive *x)
> > > > --
> > > > 2.14.3
> > > > 
> > > > _______________________________________________
> > > > Skiboot mailing list
> > > > Skiboot@lists.ozlabs.org
> > > > https://lists.ozlabs.org/listinfo/skiboot
> > > 
> > > Tested-by: Oliver O'Halloran <oohall@gmail.com>
Benjamin Herrenschmidt Dec. 8, 2017, 3:02 a.m. | #5
On Fri, 2017-12-08 at 11:01 +1100, Oliver wrote:
> On Fri, Dec 8, 2017 at 4:48 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Thu, 2017-12-07 at 12:47 +1100, Oliver wrote:
> > > On Thu, Dec 7, 2017 at 4:39 AM, Benjamin Herrenschmidt
> > > <benh@kernel.crashing.org> wrote:
> > > > Thus ensuring that the load has completed before anything else
> > > > is done (ie, the interrupt is actually masked).
> > > > 
> > > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> > > > ---
> > > >  hw/xive.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/hw/xive.c b/hw/xive.c
> > > > index 76939b79..104e1e85 100644
> > > > --- a/hw/xive.c
> > > > +++ b/hw/xive.c
> > > > @@ -2451,7 +2451,7 @@ static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
> > > >         if (s->flags & XIVE_SRC_SHIFT_BUG)
> > > >                 offset <<= 4;
> > > > 
> > > > -       in_be64(mmio_base + offset);
> > > > +       in_complete(in_be64(mmio_base + offset));
> > > 
> > > Can you elaborate a bit of what this is doing an why it needs to be done?
> > 
> > Well, the cset says it :-) Basically the twi/isync construct is the
> > same we use in the kernel in every in_* and it's supposed to make
> > the processor "consume" the load, and not execute past the isync until
> > the load value has returned.
> 
> Sure, but why is this necessary/preferable to, say, an eieio barrier?

An eieio is just a barrier. It will order previous loads or stores with
later ones. It provides no guarantee about the load having actually
completed and returned data to the CPU. The loads (or stores) will be
*issued* in order by the barrier, but the load can be stuck in the
device for a while and the CPU can continue execute past it until it
needs the resulting data (which can be especially long when we don't
care about said result at all as above).

This can be a problem for example in timed IO constructs. For example
imagine a resent sequence for a device needs the bit set for 1us.
something like:

	- store 1 to reset register
	- load from reset register (to "push" the store)
	- delay 1us
	- store 0 to reset register

Now the problem is that nothing in the delay loop will consume the
loaded value so the load might actually only complete late into the
loop and thus the delay isn't respected.

The twi "trap never" followed by isync forces the core to consume the
load data and prevent execution beyond the sequence. The twi itself
consumes the data and isync guarantees that any previous potentially
trapping instruction has completed. (Hopefully the core doesn't
optimize that twi form, but that sequence is old AIX lore so I assume
it's still ok).

In the case of the mask bits, it's just a quick optimisation, ie, it
forces the core to "push out" the ESB load and wait for its completion
before continuing.

It's also handy when we shut down the xive as it guarantees the
completion of the side effect load before we start tearing things
down. (Mind you we do a xive sync so it's probably not strictly
necessary).

But yeah as long as whatever we do next involves stores or load a
barrier is enough for correctness. I suspect I was just being tired
and paranoid on this one.

Ben.

> > 
> > Ben.
> > 
> > > 
> > > >  }
> > > > 
> > > >  static int64_t xive_sync(struct xive *x)
> > > > --
> > > > 2.14.3
> > > > 
> > > > _______________________________________________
> > > > Skiboot mailing list
> > > > Skiboot@lists.ozlabs.org
> > > > https://lists.ozlabs.org/listinfo/skiboot
> > > 
> > > Tested-by: Oliver O'Halloran <oohall@gmail.com>

Patch

diff --git a/hw/xive.c b/hw/xive.c
index 76939b79..104e1e85 100644
--- a/hw/xive.c
+++ b/hw/xive.c
@@ -2451,7 +2451,7 @@  static void xive_update_irq_mask(struct xive_src *s, uint32_t idx, bool masked)
 	if (s->flags & XIVE_SRC_SHIFT_BUG)
 		offset <<= 4;
 
-	in_be64(mmio_base + offset);
+	in_complete(in_be64(mmio_base + offset));
 }
 
 static int64_t xive_sync(struct xive *x)