
[0/3] drm/tegra: Add support for fence FDs

Message ID 20180111222249.29105-1-thierry.reding@gmail.com

Message

Thierry Reding Jan. 11, 2018, 10:22 p.m. UTC
From: Thierry Reding <treding@nvidia.com>

This set of patches adds support for fences to Tegra DRM and complements
the fence FD support for Nouveau. Technically this isn't necessary for a
fence-based synchronization loop with Nouveau because the KMS core takes
care of all that, but engines behind host1x can use the IOCTL extensions
provided here to emit fence FDs that in turn can be used to synchronize
their jobs with either the scanout engine or the GPU.

See the following link for the Nouveau fence FD support series:

	https://patchwork.freedesktop.org/series/36361/
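
To make the usage model concrete, here is a minimal userspace sketch of
emitting a fence FD from a submit and handing it to the scanout engine as
a KMS in-fence. This is illustrative only: the flags and fence members of
struct drm_tegra_submit, the flag value and the helper below are
assumptions based on this series, not code taken from it.

	#include <errno.h>
	#include <stdint.h>
	#include <xf86drm.h>
	#include <xf86drmMode.h>
	#include <drm/tegra_drm.h>

	/* Assumed flag value; the series presumably defines this in tegra_drm.h. */
	#ifndef DRM_TEGRA_SUBMIT_EMIT_FENCE_FD
	#define DRM_TEGRA_SUBMIT_EMIT_FENCE_FD (1 << 1)
	#endif

	static int submit_and_fence_scanout(int fd, struct drm_tegra_submit *args,
					    drmModeAtomicReq *req, uint32_t plane_id,
					    uint32_t in_fence_fd_prop)
	{
		int err;

		/*
		 * Ask the submission to hand back a sync_file FD for job
		 * completion (flags/fence members are assumed additions).
		 */
		args->flags |= DRM_TEGRA_SUBMIT_EMIT_FENCE_FD;

		err = drmIoctl(fd, DRM_IOCTL_TEGRA_SUBMIT, args);
		if (err < 0)
			return -errno;

		/*
		 * Hand the FD (assumed to be returned in args->fence) to KMS
		 * via the plane's IN_FENCE_FD property so scanout waits for
		 * the job to complete.
		 */
		return drmModeAtomicAddProperty(req, plane_id, in_fence_fd_prop,
						args->fence);
	}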

Thierry

Mikko Perttunen (3):
  gpu: host1x: Add support for DMA fences
  drm/tegra: Add sync file support to submit interface
  drm/tegra: Support for sync file-based fences in submit

 drivers/gpu/drm/tegra/drm.c        |  82 ++++++++++++---
 drivers/gpu/host1x/Kconfig         |   1 +
 drivers/gpu/host1x/Makefile        |   1 +
 drivers/gpu/host1x/dev.h           |  12 ++-
 drivers/gpu/host1x/fence.c         | 202 +++++++++++++++++++++++++++++++++++++
 drivers/gpu/host1x/fence.h         |  28 +++++
 drivers/gpu/host1x/hw/channel_hw.c |  36 +++++--
 drivers/gpu/host1x/intr.c          |  11 +-
 drivers/gpu/host1x/intr.h          |   8 +-
 drivers/gpu/host1x/syncpt.c        |   2 +
 include/linux/host1x.h             |  12 ++-
 include/uapi/drm/tegra_drm.h       |  10 +-
 12 files changed, 376 insertions(+), 29 deletions(-)
 create mode 100644 drivers/gpu/host1x/fence.c
 create mode 100644 drivers/gpu/host1x/fence.h

Comments

Chris Wilson Jan. 12, 2018, 10:40 a.m. UTC | #1
Quoting Thierry Reding (2018-01-11 22:22:46)
> From: Thierry Reding <treding@nvidia.com>
> 
> This set of patches adds support for fences to Tegra DRM and complements
> the fence FD support for Nouveau. Technically this isn't necessary for a
> fence-based synchronization loop with Nouveau because the KMS core takes
> care of all that, but engines behind host1x can use the IOCTL extensions
> provided here to emit fence FDs that in turn can be used to synchronize
> their jobs with either the scanout engine or the GPU.

Whilst hooking up fences, I advise you to also hook up drm_syncobj.
Internally they each resolve to another fence, so the mechanics are
identical, you just need another array in the uABI for in/out syncobj.
The advantage of drm_syncobj is that userspace can track internal fences
using inexhaustible handles, reserving the precious fd for IPC or KMS.
-Chris
Thierry Reding Jan. 12, 2018, 3:14 p.m. UTC | #2
On Fri, Jan 12, 2018 at 10:40:16AM +0000, Chris Wilson wrote:
> Quoting Thierry Reding (2018-01-11 22:22:46)
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > This set of patches adds support for fences to Tegra DRM and complements
> > the fence FD support for Nouveau. Technically this isn't necessary for a
> > fence-based synchronization loop with Nouveau because the KMS core takes
> > care of all that, but engines behind host1x can use the IOCTL extensions
> > provided here to emit fence FDs that in turn can be used to synchronize
> > their jobs with either the scanout engine or the GPU.
> 
> Whilst hooking up fences, I advise you to also hook up drm_syncobj.
> Internally they each resolve to another fence, so the mechanics are
> identical, you just need another array in the uABI for in/out syncobj.
> The advantage of drm_syncobj is that userspace can track internal fences
> using inexhaustible handles, reserving the precious fd for IPC or KMS.

I'm not sure that I properly understand how to use these. It looks as if
they are better fence FDs: in the case where you submit internal work you
would go with a drm_syncobj, and when you need access to the fence from a
different process or driver, you would use an FD.

Doesn't this mean we can cover this by just adding a flag that marks the
fence as being a handle or an FD? Do we have situations where we want an
FD *and* a handle returned as a result of the job submission?

For the above it would suffice to add two additional flags:

	#define DRM_TEGRA_SUBMIT_WAIT_SYNCOBJ (1 << 2)
	#define DRM_TEGRA_SUBMIT_EMIT_SYNCOBJ (1 << 3)

which would even allow both to be combined:

	DRM_TEGRA_SUBMIT_WAIT_SYNCOBJ | DRM_TEGRA_SUBMIT_EMIT_FENCE_FD

would allow the job to wait for an internal syncobj (passed as a handle
in the fence member) and return a fence (as an FD in the fence member) to
pass on to another process or driver as a prefence.
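
Purely as an illustration of what that dispatch could look like on the
kernel side, assuming the fence member carries a syncobj handle on the
wait side and receives a sync_file FD on the emit side. The helper names
are made up and the drm_syncobj/sync_file calls follow the current (v4.15
era) API, not anything from the series:

	#include <linux/dma-fence.h>
	#include <linux/err.h>
	#include <linux/fcntl.h>
	#include <linux/file.h>
	#include <linux/sync_file.h>
	#include <linux/types.h>
	#include <drm/drm_file.h>
	#include <drm/drm_syncobj.h>

	/*
	 * WAIT_SYNCOBJ: interpret the fence member as a syncobj handle and
	 * resolve it to the dma_fence the job has to wait for.
	 */
	static struct dma_fence *
	tegra_submit_get_prefence(struct drm_file *file, u32 handle)
	{
		struct dma_fence *fence;
		int err;

		/* v4.15-era signature; newer kernels add point/flags arguments */
		err = drm_syncobj_find_fence(file, handle, &fence);
		if (err < 0)
			return ERR_PTR(err);

		return fence;
	}

	/*
	 * EMIT_FENCE_FD: wrap the job's completion fence in a sync_file and
	 * return the FD to be written back into the fence member.
	 */
	static int tegra_submit_emit_fence_fd(struct dma_fence *fence)
	{
		struct sync_file *sync;
		int fd;

		fd = get_unused_fd_flags(O_CLOEXEC);
		if (fd < 0)
			return fd;

		sync = sync_file_create(fence);
		if (!sync) {
			put_unused_fd(fd);
			return -ENOMEM;
		}

		fd_install(fd, sync->file);
		return fd;
	}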

Thierry
Thierry Reding Jan. 12, 2018, 4:04 p.m. UTC | #3
On Fri, Jan 12, 2018 at 03:38:56PM +0000, Chris Wilson wrote:
> Quoting Thierry Reding (2018-01-12 15:14:38)
> > On Fri, Jan 12, 2018 at 10:40:16AM +0000, Chris Wilson wrote:
> > > Quoting Thierry Reding (2018-01-11 22:22:46)
> > > > From: Thierry Reding <treding@nvidia.com>
> > > > 
> > > > This set of patches adds support for fences to Tegra DRM and complements
> > > > the fence FD support for Nouveau. Technically this isn't necessary for a
> > > > fence-based synchronization loop with Nouveau because the KMS core takes
> > > > care of all that, but engines behind host1x can use the IOCTL extensions
> > > > provided here to emit fence FDs that in turn can be used to synchronize
> > > > their jobs with either the scanout engine or the GPU.
> > > 
> > > Whilst hooking up fences, I advise you to also hook up drm_syncobj.
> > > Internally they each resolve to another fence, so the mechanics are
> > > identical, you just need another array in the uABI for in/out syncobj.
> > > The advantage of drm_syncobj is that userspace can track internal fences
> > > using inexhaustible handles, reserving the precious fd for IPC or KMS.
> > 
> > I'm not sure that I properly understand how to use these. It looks as if
> > they are better fence FDs: in the case where you submit internal work you
> > would go with a drm_syncobj, and when you need access to the fence from a
> > different process or driver, you would use an FD.
> 
> Yes, simply put they are better fence fds.
> 
> > Doesn't this mean we can cover this by just adding a flag that marks the
> > fence as being a handle or an FD? Do we have situations where we want an
> > FD *and* a handle returned as a result of the job submission?
> 
> Probably not, but if you don't need to force userspace to choose, they
> will come up with a situation where it is useful. Though one thing to
> consider with the drm_syncobj is that you will want to handle an array
> of in/out fences, as userspace will pass in an array of VkSemaphore (or
> whatever) rather than compute a singular dma_fence_array by merging.

It's fairly unlikely that Tegra DRM will ever need to be able to deal
with Vulkan (I'm not sure if the pre-Tegra124 GPU is capable of it). But
you're right, it might be a good idea to plan for this anyway since it
isn't really complicated and we still have a few reserved bits left.

> > For the above it would suffice to add two additional flags:
> > 
> >         #define DRM_TEGRA_SUBMIT_WAIT_SYNCOBJ (1 << 2)
> >         #define DRM_TEGRA_SUBMIT_EMIT_SYNCOBJ (1 << 3)
> > 
> > which would even allow both to be combined:
> > 
> >         DRM_TEGRA_SUBMIT_WAIT_SYNCOBJ | DRM_TEGRA_SUBMIT_EMIT_FENCE_FD
> > 
> > would allow the job to wait for an internal syncobj (passed as a handle
> > in the fence member) and return a fence (as an FD in the fence member) to
> > pass on to another process or driver as a prefence.
> 
> Would be easy, if you are happy with the limitation of just a single
> wait-fence.

Yeah, the obvious advantage here is that this is totally trivial to
implement both in the kernel and in userspace. Extending it to an array
requires introducing an extra structure (for a <fence, flags> pair),
plus a u64 for the address of the array and a u32 for the number of
elements in it.
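
Roughly something like this, with purely illustrative names (not a
proposal for the final uAPI):

	struct drm_tegra_submit_fence {
		__u32 handle;	/* syncobj handle or sync_file FD, per flags */
		__u32 flags;	/* e.g. wait vs. signal, handle vs. FD */
	};

	/* and in struct drm_tegra_submit: */
		__u64 fences;		/* pointer to an array of drm_tegra_submit_fence */
		__u32 num_fences;	/* number of elements in that array */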

Another thing I'm wondering is whether we can't simplify this somewhat by
only supporting syncobjs and then using the SYNCOBJ IOCTLs to convert to
an FD if it turns out that we need one.

Let's say we have a use-case where we decode a video frame using a
hardware decoder (which we don't support upstream yet). The decoder
could signal completion using a syncobj. Userspace could then grab the
handle for that syncobj and pass it back in as a prefence for a job that
does post-processing using VIC. Both are behind the same DRM device, so
the handle can be resolved properly. The VIC job could then return a
fence via a syncobj handle. In order to pass it to KMS we'd need to
convert it to an FD for synchronization, but we can already do that
using the SYNCOBJ IOCTLs, so there's not really a need for the SUBMIT
IOCTL to support emitting FDs, right?
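
For illustration, the userspace side of that conversion might look
roughly like this (the plane and IN_FENCE_FD property IDs are assumed to
have been looked up elsewhere; drmSyncobjExportSyncFile() wraps the
syncobj handle-to-FD IOCTL with its sync_file export flag):

	#include <stdint.h>
	#include <xf86drm.h>
	#include <xf86drmMode.h>

	static int syncobj_to_kms_in_fence(int drm_fd, uint32_t syncobj,
					   drmModeAtomicReq *req, uint32_t plane_id,
					   uint32_t in_fence_fd_prop)
	{
		int fence_fd = -1;
		int err;

		/* Convert the syncobj's current fence into a sync_file FD. */
		err = drmSyncobjExportSyncFile(drm_fd, syncobj, &fence_fd);
		if (err < 0)
			return err;

		/* Let the atomic commit wait on it via the plane's IN_FENCE_FD. */
		return drmModeAtomicAddProperty(req, plane_id, in_fence_fd_prop,
						fence_fd);
	}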

Given that KMS and VIC are behind the same DRM device, would it be
reasonable to allow KMS to take a syncobj handle in place of an
in_fence_fd (perhaps via an in_fence_handle property)?

Thierry