[v5,00/12] Add support for io_uring

Message ID 20190610134905.22294-1-mehta.aaru20@gmail.com

Message

Aarushi Mehta June 10, 2019, 1:48 p.m. UTC
This patch series adds support for the newly developed io_uring Linux AIO
interface. Linux io_uring is faster than Linux's AIO asynchronous I/O code,
offers efficient buffered asynchronous I/O support, the ability to do I/O
without performing a system call via polled I/O, and other efficiency enhancements.

Testing it requires a host kernel (5.1+) and the liburing library.
Use the option -drive aio=io_uring to enable it.

v5:
- Adds completion polling
- Extends qemu-io
- Adds qemu-iotest

v4:
- Add error handling
- Add trace events
- Remove aio submission based code

v3:
- Fix major errors in io_uring (sorry)
- Option now enumerates for CONFIG_LINUX_IO_URING
- pkg config support added

Aarushi Mehta (12):
  configure: permit use of io_uring
  qapi/block-core: add option for io_uring Only enumerates option for
    devices that support it
  block/block: add BDRV flag for io_uring
  block/io_uring: implements interfaces for io_uring Aborts when sqe
    fails to be set as sqes cannot be returned to the ring.
  stubs: add stubs for io_uring interface
  util/async: add aio interfaces for io_uring
  blockdev: accept io_uring as option
  block/file-posix.c: extend to use io_uring
  block: add trace events for io_uring
  block/io_uring: adds userspace completion polling
  qemu-io: adds support for io_uring
  qemu-iotests/087: checks for io_uring

 MAINTAINERS                |   8 +
 block/Makefile.objs        |   3 +
 block/file-posix.c         |  85 ++++++++--
 block/io_uring.c           | 339 +++++++++++++++++++++++++++++++++++++
 block/trace-events         |   8 +
 blockdev.c                 |   4 +-
 configure                  |  27 +++
 include/block/aio.h        |  16 +-
 include/block/block.h      |   1 +
 include/block/raw-aio.h    |  12 ++
 qapi/block-core.json       |   4 +-
 qemu-io.c                  |  13 ++
 stubs/Makefile.objs        |   1 +
 stubs/io_uring.c           |  32 ++++
 tests/qemu-iotests/087     |  26 +++
 tests/qemu-iotests/087.out |  10 ++
 util/async.c               |  36 ++++
 17 files changed, 606 insertions(+), 19 deletions(-)
 create mode 100644 block/io_uring.c
 create mode 100644 stubs/io_uring.c

Comments

Stefan Hajnoczi June 11, 2019, 9:56 a.m. UTC | #1
On Mon, Jun 10, 2019 at 07:18:53PM +0530, Aarushi Mehta wrote:
> This patch series adds support for the newly developed io_uring Linux AIO
> interface. Linux io_uring is faster than Linux's AIO asynchronous I/O code,
> offers efficient buffered asynchronous I/O support, the ability to do I/O
> without performing a system call via polled I/O, and other efficiency enhancements.
> 
> Testing it requires a host kernel (5.1+) and the liburing library.
> Use the option -drive aio=io_uring to enable it.
> 
> v5:
> - Adds completion polling
> - Extends qemu-io
> - Adds qemu-iotest

Flush is not hooked up.  Please use the io_uring IORING_OP_FSYNC that
you've already written and connect it to file-posix.c.

When doing this, watch out for the qiov->size check during completion
processing.  Flush doesn't have a qiov, so the pointer may be NULL.
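
A minimal sketch of the guard described above, in C. The types and names
(io_vector, uring_request, complete_request) are hypothetical stand-ins and
not the series' actual structures, and returning -EIO for a short transfer
is only a placeholder policy chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the request state; not QEMU's real types. */
struct io_vector {
    size_t size;            /* total bytes the request should transfer */
};

struct uring_request {
    struct io_vector *qiov; /* NULL for flush: there is no data buffer */
    int ret;                /* final status reported to the caller */
};

/* Translate a completion result (cqe->res) into the request's status. */
int complete_request(struct uring_request *req, int res)
{
    if (res < 0) {
        req->ret = res;     /* the kernel reported -errno */
    } else if (req->qiov == NULL) {
        req->ret = 0;       /* flush: no length to compare against */
    } else if ((size_t)res == req->qiov->size) {
        req->ret = 0;       /* full transfer */
    } else {
        req->ret = -EIO;    /* short transfer; placeholder policy */
    }
    return req->ret;
}
```

The NULL test comes before any dereference of qiov, so a flush completion
never reaches the size comparison.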

Stefan

Stefan Hajnoczi June 22, 2019, 3:13 p.m. UTC | #2
On Tue, Jun 11, 2019 at 10:57 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> On Mon, Jun 10, 2019 at 07:18:53PM +0530, Aarushi Mehta wrote:
> > This patch series adds support for the newly developed io_uring Linux AIO
> > interface. Linux io_uring is faster than Linux's AIO asynchronous I/O code,
> > offers efficient buffered asynchronous I/O support, the ability to do I/O
> > without performing a system call via polled I/O, and other efficiency enhancements.
> >
> > Testing it requires a host kernel (5.1+) and the liburing library.
> > Use the option -drive aio=io_uring to enable it.
> >
> > v5:
> > - Adds completion polling
> > - Extends qemu-io
> > - Adds qemu-iotest
>
> Flush is not hooked up.  Please use the io_uring IORING_OP_FSYNC that
> you've already written and connect it to file-posix.c.

IORING_OP_FSYNC is in fact synchronous.  This means io_uring_enter()
blocks until this operation completes.  This is not desirable since
the AIO engine should not block the QEMU thread it's running from for
a long time (e.g. network file system that is not responding).

I think it's best *not* to use io_uring for fsync.  Instead we can
continue to use the thread pool, just like Linux AIO.

Stefan

Stefan Hajnoczi June 23, 2019, 2:03 p.m. UTC | #3
On Sat, Jun 22, 2019 at 4:13 PM Stefan Hajnoczi <stefanha@gmail.com> wrote:
> On Tue, Jun 11, 2019 at 10:57 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Mon, Jun 10, 2019 at 07:18:53PM +0530, Aarushi Mehta wrote:
> > > This patch series adds support for the newly developed io_uring Linux AIO
> > > interface. Linux io_uring is faster than Linux's AIO asynchronous I/O code,
> > > offers efficient buffered asynchronous I/O support, the ability to do I/O
> > > without performing a system call via polled I/O, and other efficiency enhancements.
> > >
> > > Testing it requires a host kernel (5.1+) and the liburing library.
> > > Use the option -drive aio=io_uring to enable it.
> > >
> > > v5:
> > > - Adds completion polling
> > > - Extends qemu-io
> > > - Adds qemu-iotest
> >
> > Flush is not hooked up.  Please use the io_uring IORING_OP_FSYNC that
> > you've already written and connect it to file-posix.c.
>
> IORING_OP_FSYNC is in fact synchronous.  This means io_uring_enter()
> blocks until this operation completes.  This is not desirable since
> the AIO engine should not block the QEMU thread it's running from for
> a long time (e.g. network file system that is not responding).
>
> I think it's best *not* to use io_uring for fsync.  Instead we can
> continue to use the thread pool, just like Linux AIO.

Looking more closely, what I wrote above is wrong.  Although fsync is synchronous,
io_uring takes care to bounce it to the workqueue when submitted via
io_uring_enter().  Therefore it appears asynchronous to userspace and
we can and should use io_uring for fsync.
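
A rough sketch of what submitting an fsync through io_uring looks like from
userspace. It uses raw system calls instead of liburing only to stay
self-contained; the names fsync_via_io_uring and map_ring are hypothetical,
and it assumes kernel headers that provide linux/io_uring.h (5.1+). For
brevity it waits for the completion in the same io_uring_enter() call; an
event loop would presumably submit without waiting and reap the completion
later.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/io_uring.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: map one ring region, returning NULL on failure. */
static void *map_ring(int ring_fd, size_t len, off_t offset)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_POPULATE, ring_fd, offset);
    return p == MAP_FAILED ? NULL : p;
}

/* Queue one IORING_OP_FSYNC for fd and return its result (0 or -errno). */
int fsync_via_io_uring(int fd)
{
    struct io_uring_params p;
    memset(&p, 0, sizeof(p));
    int ring_fd = (int)syscall(__NR_io_uring_setup, 4, &p);
    if (ring_fd < 0) {
        return -errno;      /* io_uring unavailable or not permitted */
    }

    size_t sq_len = p.sq_off.array + p.sq_entries * sizeof(unsigned);
    size_t cq_len = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe);
    size_t sqes_len = p.sq_entries * sizeof(struct io_uring_sqe);
    char *sq = map_ring(ring_fd, sq_len, IORING_OFF_SQ_RING);
    char *cq = map_ring(ring_fd, cq_len, IORING_OFF_CQ_RING);
    struct io_uring_sqe *sqes = map_ring(ring_fd, sqes_len, IORING_OFF_SQES);
    int ret = -ENOMEM;

    if (sq && cq && sqes) {
        /* Fill the next submission entry and publish it via the SQ tail. */
        unsigned *sq_tail = (unsigned *)(sq + p.sq_off.tail);
        unsigned sq_mask = *(unsigned *)(sq + p.sq_off.ring_mask);
        unsigned *sq_array = (unsigned *)(sq + p.sq_off.array);
        unsigned tail = *sq_tail, idx = tail & sq_mask;

        memset(&sqes[idx], 0, sizeof(sqes[idx]));
        sqes[idx].opcode = IORING_OP_FSYNC;
        sqes[idx].fd = fd;
        sq_array[idx] = idx;
        __atomic_store_n(sq_tail, tail + 1, __ATOMIC_RELEASE);

        /* Submit one entry and wait for at least one completion. */
        if (syscall(__NR_io_uring_enter, ring_fd, 1, 1,
                    IORING_ENTER_GETEVENTS, NULL, (size_t)0) < 0) {
            ret = -errno;
        } else {
            unsigned *cq_head = (unsigned *)(cq + p.cq_off.head);
            unsigned *cq_tail = (unsigned *)(cq + p.cq_off.tail);
            unsigned cq_mask = *(unsigned *)(cq + p.cq_off.ring_mask);
            struct io_uring_cqe *cqes =
                (struct io_uring_cqe *)(cq + p.cq_off.cqes);
            unsigned head = *cq_head;

            if (head == __atomic_load_n(cq_tail, __ATOMIC_ACQUIRE)) {
                ret = -EIO; /* no completion was posted */
            } else {
                ret = cqes[head & cq_mask].res;
                __atomic_store_n(cq_head, head + 1, __ATOMIC_RELEASE);
            }
        }
    }

    if (sq) munmap(sq, sq_len);
    if (cq) munmap(cq, cq_len);
    if (sqes) munmap(sqes, sqes_len);
    close(ring_fd);
    return ret;
}
```

The fsync result arrives as a completion queue entry like any other request,
which is what lets the caller treat flush as one more asynchronous operation.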

Stefan