
[bpf-next] xsk: update rings for load-acquire/store-release semantics

Message ID: 20200120092149.13775-1-bjorn.topel@gmail.com
State: Changes Requested
Delegated to: BPF Maintainers

Commit Message

Björn Töpel Jan. 20, 2020, 9:21 a.m. UTC
From: Björn Töpel <bjorn.topel@intel.com>

Currently, the AF_XDP rings use fences for the kernel-side
produce/consume functions. By updating the rings to
load-acquire/store-release semantics, the full barrier (smp_mb()) on
the consumer side can be replaced with a store-release.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 net/xdp/xsk_queue.h | 31 ++++++++++++++-----------------
 1 file changed, 14 insertions(+), 17 deletions(-)
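
The pattern the patch moves to is the classic single-producer/
single-consumer publish/consume pairing. Below is a minimal,
self-contained sketch of that pairing, using C11 atomics rather than
the kernel's smp_load_acquire()/smp_store_release(); the ring layout
is illustrative, not the actual xsk_queue layout:

#include <stdint.h>
#include <stdatomic.h>

#define RING_SIZE 64 /* power of two, illustrative */

struct spsc_ring {
	_Atomic uint32_t producer;
	_Atomic uint32_t consumer;
	uint64_t data[RING_SIZE];
};

/* Producer: write the entry first, then publish it with a release
 * store of ->producer (B), so a consumer that acquire-loads
 * ->producer (C) is guaranteed to also see the data.
 */
static int ring_produce(struct spsc_ring *r, uint64_t val)
{
	uint32_t prod = atomic_load_explicit(&r->producer, memory_order_relaxed);
	uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_acquire); /* A */

	if (prod - cons == RING_SIZE)
		return -1; /* full */
	r->data[prod & (RING_SIZE - 1)] = val;
	atomic_store_explicit(&r->producer, prod + 1, memory_order_release); /* B */
	return 0;
}

/* Consumer: the acquire load of ->producer (C) orders the data read
 * after it; the release store of ->consumer (D) tells the producer
 * the slot may be reused.
 */
static int ring_consume(struct spsc_ring *r, uint64_t *val)
{
	uint32_t cons = atomic_load_explicit(&r->consumer, memory_order_relaxed);
	uint32_t prod = atomic_load_explicit(&r->producer, memory_order_acquire); /* C */

	if (cons == prod)
		return -1; /* empty */
	*val = r->data[cons & (RING_SIZE - 1)];
	atomic_store_explicit(&r->consumer, cons + 1, memory_order_release); /* D */
	return 0;
}

As in the patch below, (A) pairs with (D) and (B) pairs with (C).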

Comments

Daniel Borkmann Jan. 20, 2020, 11:27 p.m. UTC | #1
On 1/20/20 10:21 AM, Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> Currently, the AF_XDP rings use fences for the kernel-side
> produce/consume functions. By updating the rings to
> load-acquire/store-release semantics, the full barrier (smp_mb()) on
> the consumer side can be replaced with a store-release.
> 
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>

If I'm not missing something from the ring update scheme, don't you also need
to adapt to a STORE.rel of ->producer with a matching barrier in tools/lib/bpf/xsk.h?

Btw, an alternative model could also be commit 09d62154f613 ("tools, perf: add and use
optimized ring_buffer_{read_head, write_tail} helpers") for the kernel side,
in order to get rid of the smp_mb() on x86.

Thanks,
Daniel
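
What Daniel is asking about on the userspace side might look roughly
like the sketch below. The helper names are hypothetical (tools/lib/bpf
had no acquire/release helpers at this point); they are built on the
GCC/Clang __atomic builtins:

#include <stdint.h>

/* Hypothetical acquire/release helpers for tools/lib/bpf/xsk.h. */
static inline uint32_t libbpf_smp_load_acquire(const uint32_t *p)
{
	return __atomic_load_n(p, __ATOMIC_ACQUIRE);
}

static inline void libbpf_smp_store_release(uint32_t *p, uint32_t v)
{
	__atomic_store_n(p, v, __ATOMIC_RELEASE);
}

/* The userspace producer publish would then become something like:
 *
 *	libbpf_smp_store_release(prod->producer, *prod->producer + nb);
 *
 * pairing with the kernel's smp_load_acquire() of ->producer in
 * __xskq_cons_peek().
 */

On x86 both builtins compile down to plain loads and stores plus a
compiler barrier, which is also the effect of the ring_buffer helpers
Daniel mentions.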
John Fastabend Jan. 21, 2020, 1:24 a.m. UTC | #2
Björn Töpel wrote:
> From: Björn Töpel <bjorn.topel@intel.com>
> 
> Currently, the AF_XDP rings use fences for the kernel-side
> produce/consume functions. By updating the rings to
> load-acquire/store-release semantics, the full barrier (smp_mb()) on
> the consumer side can be replaced with a store-release.
> 
> Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> ---
>  net/xdp/xsk_queue.h | 31 ++++++++++++++-----------------
>  1 file changed, 14 insertions(+), 17 deletions(-)
> 
> diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
> index bec2af11853a..2fff80576ee1 100644
> --- a/net/xdp/xsk_queue.h
> +++ b/net/xdp/xsk_queue.h
> @@ -39,19 +39,18 @@ struct xsk_queue {
>  	u64 invalid_descs;
>  };

Should the xsk_cons_* libbpf routines also be updated, then?
My general understanding is that the userspace and kernel sides
should be kept in sync.
Björn Töpel Jan. 21, 2020, 11:50 a.m. UTC | #3
On Tue, 21 Jan 2020 at 00:51, Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 1/20/20 10:21 AM, Björn Töpel wrote:
> > From: Björn Töpel <bjorn.topel@intel.com>
> >
> > Currently, the AF_XDP rings use fences for the kernel-side
> > produce/consume functions. By updating the rings to
> > load-acquire/store-release semantics, the full barrier (smp_mb()) on
> > the consumer side can be replaced with a store-release.
> >
> > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
>
> If I'm not missing something from the ring update scheme, don't you also need
> to adapt to a STORE.rel of ->producer with a matching barrier in tools/lib/bpf/xsk.h?
>

Daniel/John,

Hmm, I was under the impression that *wasn't* the case. Quoting
memory-barriers.txt:

--8<--
When dealing with CPU-CPU interactions, certain types of memory
barrier should always be paired.  A lack of appropriate pairing is
almost certainly an error.

General barriers pair with each other, though they also pair with most
other types of barriers, albeit without multicopy atomicity.  An
acquire barrier pairs with a release barrier, but both may also pair
with other barriers, including of course general barriers.  A write
barrier pairs with a data dependency barrier, a control dependency, an
acquire barrier, a release barrier, a read barrier, or a general
barrier.  Similarly a read barrier, control dependency, or a data
dependency barrier pairs with a write barrier, an acquire barrier, a
release barrier, or a general barrier:
-->8--

(The usual "I'm not an expert, just quoting memory-barriers.txt"
disclaimer applies...)

In libbpf, we already have a "weaker" barrier (libbpf_smp_rwmb()). See
commit 2c5935f1b2b6 ("libbpf: optimize barrier for XDP socket rings")
for more information.

I agree that ld.acq/st.rel barriers should be adopted in libbpf at
some point, but not that it's broken now. Then again, reading Will's
(Cc'd) perf-ring comment here [1] makes me a bit uneasy, even though
he says "theoretical". ;-)

> Btw, an alternative model could also be commit 09d62154f613 ("tools, perf: add and use
> optimized ring_buffer_{read_head, write_tail} helpers") for the kernel side,
> in order to get rid of the smp_mb() on x86.
>

Interesting! I'll have a look!

[1] https://lore.kernel.org/lkml/20180523164234.GJ2983@arm.com/
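
Condensed, the pairing Björn is describing looks like this (a
paraphrase of the two sides, not verbatim from either tree):

/* Kernel consumer (this patch)          Userspace producer (libbpf today)
 *
 * prod = smp_load_acquire(->producer)   STORE $data
 * LOAD $data                            libbpf_smp_rwmb()
 * smp_store_release(->consumer, ...)    STORE ->producer
 *
 * The acquire load of ->producer pairs with the write barrier ahead
 * of the ->producer store, which the memory-barriers.txt passage
 * above explicitly permits ("A write barrier pairs with ... an
 * acquire barrier").
 */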
Will Deacon March 16, 2020, 6:44 p.m. UTC | #4
On Tue, Jan 21, 2020 at 12:50:23PM +0100, Björn Töpel wrote:
> On Tue, 21 Jan 2020 at 00:51, Daniel Borkmann <daniel@iogearbox.net> wrote:
> >
> > On 1/20/20 10:21 AM, Björn Töpel wrote:
> > > From: Björn Töpel <bjorn.topel@intel.com>
> > >
> > > Currently, the AF_XDP rings use fences for the kernel-side
> > > produce/consume functions. By updating the rings to
> > > load-acquire/store-release semantics, the full barrier (smp_mb()) on
> > > the consumer side can be replaced with a store-release.
> > >
> > > Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
> >
> > If I'm not missing something from the ring update scheme, don't you also need
> > to adapt to a STORE.rel of ->producer with a matching barrier in tools/lib/bpf/xsk.h?
> >
> 
> Daniel/John,
> 
> Hmm, I was under the impression that *wasn't* the case. Quoting
> memory-barriers.txt:
> 
> --8<--
> When dealing with CPU-CPU interactions, certain types of memory
> barrier should always be paired.  A lack of appropriate pairing is
> almost certainly an error.
> 
> General barriers pair with each other, though they also pair with most
> other types of barriers, albeit without multicopy atomicity.  An
> acquire barrier pairs with a release barrier, but both may also pair
> with other barriers, including of course general barriers.  A write
> barrier pairs with a data dependency barrier, a control dependency, an
> acquire barrier, a release barrier, a read barrier, or a general
> barrier.  Similarly a read barrier, control dependency, or a data
> dependency barrier pairs with a write barrier, an acquire barrier, a
> release barrier, or a general barrier:
> -->8--

The key part here is "albeit without multicopy atomicity". I don't think
you care about that at all for these rings as you're very clearly passing a
message from the producer side to the consumer side in a point-to-point like
manner, so I think you're ok to change the kernel independently from
userspace (but I would still recommend updating both eventually).

The only thing you might run into is if anybody is relying on the smp_mb()
in the consumer to order other unrelated stuff either side of the consume
operation (or even another consume operation to a different ring!), but it
looks like you can't rely on that in the xsk queue implementation anyway
because you cache the global state and so the barriers are conditional.

Will
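
Will's last point in code form: the consumer-side barrier is
conditional because it only runs when the cached view of the producer
index is exhausted. A simplified sketch of that flow (the function
name is illustrative; the logic follows __xskq_cons_peek() in the
patch below):

static inline bool xskq_cons_has_entries_sketch(struct xsk_queue *q)
{
	if (q->cached_cons != q->cached_prod)
		return true; /* fast path: no barrier executed at all */

	/* Slow path only: refresh the cached producer index.
	 * C, matches B -- but anything outside the ring protocol
	 * cannot count on this barrier being executed.
	 */
	q->cached_prod = smp_load_acquire(&q->ring->producer);
	return q->cached_cons != q->cached_prod;
}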
Björn Töpel March 17, 2020, 7:14 p.m. UTC | #5
On Mon, 16 Mar 2020 at 19:44, Will Deacon <will@kernel.org> wrote:
>
> On Tue, Jan 21, 2020 at 12:50:23PM +0100, Björn Töpel wrote:
> > [...]
>
> The key part here is "albeit without multicopy atomicity". I don't think
> you care about that at all for these rings as you're very clearly passing a
> message from the producer side to the consumer side in a point-to-point like
> manner, so I think you're ok to change the kernel independently from
> userspace (but I would still recommend updating both eventually).
>
> The only thing you might run into is if anybody is relying on the smp_mb()
> in the consumer to order other unrelated stuff either side of the consume
> operation (or even another consume operation to a different ring!), but it
> looks like you can't rely on that in the xsk queue implementation anyway
> because you cache the global state and so the barriers are conditional.
>

Thanks for getting back, and for the clarification! I'll do a respin
(as part of another series) that includes the userland changes.

Cheers,
Björn

> Will
Will Deacon March 17, 2020, 9:03 p.m. UTC | #6
On Tue, Mar 17, 2020 at 08:14:09PM +0100, Björn Töpel wrote:
> On Mon, 16 Mar 2020 at 19:44, Will Deacon <will@kernel.org> wrote:
> > [...]
> 
> Thanks for getting back, and for the clarification! I'll do a respin
> (as part of another series) that includes the userland changes.

I'm just sorry it took me so long. I got tied up with personal stuff,
then conferences and finally, when it got to the top of my list, I took
some holiday.

Will

Patch

diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index bec2af11853a..2fff80576ee1 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -39,19 +39,18 @@  struct xsk_queue {
 	u64 invalid_descs;
 };
 
-/* The structure of the shared state of the rings are the same as the
- * ring buffer in kernel/events/ring_buffer.c. For the Rx and completion
- * ring, the kernel is the producer and user space is the consumer. For
- * the Tx and fill rings, the kernel is the consumer and user space is
- * the producer.
+/* The structure of the shared state of the rings are a simple
+ * circular buffer, as outlined in
+ * Documentation/core-api/circular-buffers.rst. For the Rx and
+ * completion ring, the kernel is the producer and user space is the
+ * consumer. For the Tx and fill rings, the kernel is the consumer and
+ * user space is the producer.
  *
  * producer                         consumer
  *
- * if (LOAD ->consumer) {           LOAD ->producer
- *                    (A)           smp_rmb()       (C)
+ * if (LOAD ->consumer) {  (A)      LOAD.acq ->producer  (C)
  *    STORE $data                   LOAD $data
- *    smp_wmb()       (B)           smp_mb()        (D)
- *    STORE ->producer              STORE ->consumer
+ *    STORE.rel ->producer (B)      STORE.rel ->consumer (D)
  * }
  *
  * (A) pairs with (D), and (B) pairs with (C).
@@ -220,15 +219,14 @@  static inline bool xskq_cons_read_desc(struct xsk_queue *q,
 
 static inline void __xskq_cons_release(struct xsk_queue *q)
 {
-	smp_mb(); /* D, matches A */
-	WRITE_ONCE(q->ring->consumer, q->cached_cons);
+	/* D, matches A */
+	smp_store_release(&q->ring->consumer, q->cached_cons);
 }
 
 static inline void __xskq_cons_peek(struct xsk_queue *q)
 {
-	/* Refresh the local pointer */
-	q->cached_prod = READ_ONCE(q->ring->producer);
-	smp_rmb(); /* C, matches B */
+	/* C, matches B */
+	q->cached_prod = smp_load_acquire(&q->ring->producer);
 }
 
 static inline void xskq_cons_get_entries(struct xsk_queue *q)
@@ -340,9 +338,8 @@  static inline int xskq_prod_reserve_desc(struct xsk_queue *q,
 
 static inline void __xskq_prod_submit(struct xsk_queue *q, u32 idx)
 {
-	smp_wmb(); /* B, matches C */
-
-	WRITE_ONCE(q->ring->producer, idx);
+	/* B, matches C */
+	smp_store_release(&q->ring->producer, idx);
 }
 
 static inline void xskq_prod_submit(struct xsk_queue *q)