
[02/12] drm/etnaviv: add devicetree bindings

Message ID: 1449237604-19064-3-git-send-email-l.stach@pengutronix.de
State: Changes Requested, archived

Commit Message

Lucas Stach Dec. 4, 2015, 1:59 p.m. UTC
Etnaviv follows the same principle as imx-drm to have a virtual
master device node to bind all the individual GPU cores together
into one DRM device.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
---
 .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt

Comments

Rob Herring Dec. 4, 2015, 4:29 p.m. UTC | #1
On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> Etnaviv follows the same priciple as imx-drm to have a virtual
> master device node to bind all the individual GPU cores together
> into one DRM device.
> 
> Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> ---
>  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
>  1 file changed, 46 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> 
> diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> new file mode 100644
> index 000000000000..19fde29dc1d7
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> @@ -0,0 +1,46 @@
> +Etnaviv DRM master device
> +================================
> +
> +The Etnaviv DRM master device is a virtual device needed to list all
> +Vivante GPU cores that comprise the GPU subsystem.
> +
> +Required properties:
> +- compatible: Should be one of
> +    "fsl,imx-gpu-subsystem"
> +    "marvell,dove-gpu-subsystem"
> +- cores: Should contain a list of phandles pointing to Vivante GPU devices
> +
> +example:
> +
> +gpu-subsystem {
> +	compatible = "fsl,imx-gpu-subsystem";
> +	cores = <&gpu_2d>, <&gpu_3d>;
> +};

Yeah, I'm not really a fan of doing this simply because DRM wants 1 
driver.

> +
> +
> +Vivante GPU core devices
> +====================
> +
> +Required properties:
> +- compatible: Should be "vivante,gc"

This should at least have the specific cores listed like gc-5000 or 
whatever the numbering is. As is, I can't even tell if this is a 2d or 3d 
core.

Also, it probably should have an SOC specific property to deal with SOC 
specific configuration or integration.

> +- reg: should be register base and length as documented in the
> +  datasheet
> +- interrupts: Should contain the cores interrupt line
> +- clocks: should contain one clock for entry in clock-names
> +  see Documentation/devicetree/bindings/clock/clock-bindings.txt
> +- clock-names:
> +   - "bus":    AXI/register clock
> +   - "core":   GPU core clock
> +   - "shader": Shader clock (only required if GPU has feature PIPE_3D)
> +
> +example:
> +
> +gpu_3d: gpu@00130000 {
> +	compatible = "vivante,gc";
> +	reg = <0x00130000 0x4000>;
> +	interrupts = <0 9 IRQ_TYPE_LEVEL_HIGH>;
> +	clocks = <&clks IMX6QDL_CLK_GPU3D_AXI>,
> +	         <&clks IMX6QDL_CLK_GPU3D_CORE>,
> +	         <&clks IMX6QDL_CLK_GPU3D_SHADER>;
> +	clock-names = "bus", "core", "shader";
> +};
> -- 
> 2.6.2
> 
Lucas Stach Dec. 4, 2015, 4:41 p.m. UTC | #2
Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> > Etnaviv follows the same priciple as imx-drm to have a virtual
> > master device node to bind all the individual GPU cores together
> > into one DRM device.
> > 
> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> > ---
> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
> >  1 file changed, 46 insertions(+)
> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > 
> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > new file mode 100644
> > index 000000000000..19fde29dc1d7
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > @@ -0,0 +1,46 @@
> > +Etnaviv DRM master device
> > +================================
> > +
> > +The Etnaviv DRM master device is a virtual device needed to list all
> > +Vivante GPU cores that comprise the GPU subsystem.
> > +
> > +Required properties:
> > +- compatible: Should be one of
> > +    "fsl,imx-gpu-subsystem"
> > +    "marvell,dove-gpu-subsystem"
> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
> > +
> > +example:
> > +
> > +gpu-subsystem {
> > +	compatible = "fsl,imx-gpu-subsystem";
> > +	cores = <&gpu_2d>, <&gpu_3d>;
> > +};
> 
> Yeah, I'm not really a fan of doing this simply because DRM wants 1 
> driver.
> 
I'm aware of that, but I don't see much value in kicking this discussion
around for every DRM driver submission. This is the binding that has
emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
allows us to standardize on _something_. Also ALSA does a similar thing
to bind codecs and CPU interfaces together.

> > +
> > +
> > +Vivante GPU core devices
> > +====================
> > +
> > +Required properties:
> > +- compatible: Should be "vivante,gc"
> 
> This should at least have the specific cores listed like gc-5000 or 
> whatever the numbering is. As is, I can't even tell if this a 2d or 3d 
> core.
> 
There is really no need for this. The cores have identification
registers that are much more accurate than what we could describe with a
compatible value. So using a more specific compatible just provides
redundant (and possibly wrong) information.

> Also, it probably should have an SOC specific property to deal with SOC 
> specific configuration or integration.
> 
Same as above really. Parts of the identification registers are
different for each SoC integration, even if it's the same IP core, so we
can just derive any needed driver behavior differences from that.

> > +- reg: should be register base and length as documented in the
> > +  datasheet
> > +- interrupts: Should contain the cores interrupt line
> > +- clocks: should contain one clock for entry in clock-names
> > +  see Documentation/devicetree/bindings/clock/clock-bindings.txt
> > +- clock-names:
> > +   - "bus":    AXI/register clock
> > +   - "core":   GPU core clock
> > +   - "shader": Shader clock (only required if GPU has feature PIPE_3D)
> > +
> > +example:
> > +
> > +gpu_3d: gpu@00130000 {
> > +	compatible = "vivante,gc";
> > +	reg = <0x00130000 0x4000>;
> > +	interrupts = <0 9 IRQ_TYPE_LEVEL_HIGH>;
> > +	clocks = <&clks IMX6QDL_CLK_GPU3D_AXI>,
> > +	         <&clks IMX6QDL_CLK_GPU3D_CORE>,
> > +	         <&clks IMX6QDL_CLK_GPU3D_SHADER>;
> > +	clock-names = "bus", "core", "shader";
> > +};
> > -- 
> > 2.6.2
> >
Russell King - ARM Linux Dec. 4, 2015, 4:56 p.m. UTC | #3
On Fri, Dec 04, 2015 at 10:29:56AM -0600, Rob Herring wrote:
> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> > +Vivante GPU core devices
> > +====================
> > +
> > +Required properties:
> > +- compatible: Should be "vivante,gc"
> 
> This should at least have the specific cores listed like gc-5000 or 
> whatever the numbering is. As is, I can't even tell if this a 2d or 3d 
> core.

Who cares whether it's a 2D or 3D core?  There are four options here:

2D
2D and 3D
3D
VG

All of these are determined by feature flags in the core - why should
DT have to specify this information when the hardware tells us, just
because we want something humanly visible?
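
For illustration, runtime detection along those lines amounts to a
register read plus a flag test; a minimal sketch, where the register
offset and bit positions are placeholders rather than values from any
datasheet:

/*
 * Sketch only: derive pipe capabilities from the core's feature
 * register instead of DT.  The offset and bit positions below are
 * placeholders, not taken from the Vivante datasheet.
 */
#include <linux/bits.h>
#include <linux/io.h>
#include <linux/types.h>

#define EXAMPLE_CHIP_FEATURES   0x1c    /* placeholder offset */
#define EXAMPLE_FEATURE_PIPE_2D BIT(9)  /* placeholder bit */
#define EXAMPLE_FEATURE_PIPE_3D BIT(2)  /* placeholder bit */
#define EXAMPLE_FEATURE_PIPE_VG BIT(13) /* placeholder bit */

static bool example_gpu_has_pipe(void __iomem *mmio, u32 pipe_bit)
{
        u32 features = readl(mmio + EXAMPLE_CHIP_FEATURES);

        return features & pipe_bit;
}

Which of 2D/3D/VG a given core is then falls out of which bits are set,
which is exactly the information a compatible string would only
duplicate.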

> Also, it probably should have an SOC specific property to deal with SOC 
> specific configuration or integration.

Thus far, we're not aware of any SoC specific configuration of the
core.  Even if there was, I can't see it making much difference to
the DRM driver itself: the DRM driver is mostly about command stream
queuing, GEM buffer management (both of which are independent of SoC
issues) and PM management, which is performed via runtime PM.

Where SoC specific integration would come in is the runtime PM, and
that goes as far as what clocks are present, which we already deal
with, power domains, which again we already deal with, and
interrupts - again, which are dealt with.
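
As a concrete illustration of the clock part, the binding's
"bus"/"core"/"shader" names map onto a lookup roughly like this (a
sketch; the struct and function names are made up, not the actual
driver code):

#include <linux/clk.h>
#include <linux/err.h>
#include <linux/platform_device.h>

struct example_gpu_clks {
        struct clk *bus;
        struct clk *core;
        struct clk *shader;     /* only present on 3D-capable cores */
};

static int example_gpu_get_clks(struct platform_device *pdev,
                                struct example_gpu_clks *clks)
{
        clks->bus = devm_clk_get(&pdev->dev, "bus");
        if (IS_ERR(clks->bus))
                return PTR_ERR(clks->bus);

        clks->core = devm_clk_get(&pdev->dev, "core");
        if (IS_ERR(clks->core))
                return PTR_ERR(clks->core);

        /* "shader" is optional: only required when the core has PIPE_3D */
        clks->shader = devm_clk_get(&pdev->dev, "shader");
        if (IS_ERR(clks->shader))
                clks->shader = NULL;

        return 0;
}

How many of those clocks exist is purely a property of the SoC
integration, and that is already fully described by the clocks and
clock-names properties in the binding.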

Given that this driver has been developed on two SoCs (Dove and iMX6)
which are radically different, I can't see any need for other SoC
specifics.

Did you have anything in mind?

Dove:
GC600 combined 2D and 3D GPU core, one clock, one interrupt, one PM
domain.

iMX6 S/DL/D/Q:
GC320 2D GPU core, GC880 or GC2000 3D GPU core, GC335 VG core, multiple
clocks, multiple interrupts, multiple PM domains.
Russell King - ARM Linux Dec. 4, 2015, 5:05 p.m. UTC | #4
On Fri, Dec 04, 2015 at 05:41:45PM +0100, Lucas Stach wrote:
> Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
> > On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> > > +gpu-subsystem {
> > > +	compatible = "fsl,imx-gpu-subsystem";
> > > +	cores = <&gpu_2d>, <&gpu_3d>;
> > > +};
> > 
> > Yeah, I'm not really a fan of doing this simply because DRM wants 1 
> > driver.
> > 
> I'm aware of that, but I don't see much value in kicking this discussion
> around for every DRM driver submission. This is the binding that has
> emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> allows us to standardize on _something_. Also ALSA does a similar thing
> to bind codecs and CPU interfaces together.

Absolutely, this is the interface and method that was discussed and
settled upon, and for DT folk to now start saying that they're not
fans of it is _far_ too late.  If they had concerns, they should have
been discussed during the submission of the first users of it, not
after the 4th or 5th user.

Sure, they may be having reservations about it, but then, I think,
it's up to them to come up with a better solution to this, and discuss
it over with the DRM people, remembering that the DRM people are very
adamant that they're not budging on the "not hotplugging bits" issue,
or if they do, it means _radically_ changing the DRM user API for
everything.

> > Also, it probably should have an SOC specific property to deal with SOC 
> > specific configuration or integration.
>
> Same as above really. Parts of the identification registers are
> different for each SoC integration, even if it's the same IP core,
> so we can just derive any needed driver behavior differences from
> that.

I agree.  There are some bugs in various cores (like the GC320) but
it's not clear whether that's a SoC specific issue or whether it's a
GPU core specific issue: all we know is that GC320 revision X suffers
from a certain bug, which we need to work around in userspace - and as
we pass all the GPU identifying information to userspace, adding such
stuff into DT, and then having to find some way to pass it through to
userspace would just add a whole new level of complexity that isn't
required.

It'd be a case of "okay, we know iMX6 has a GC320 with revision X,
so we better set DT flag Y to indicate that it has this bug" when we
already know in userspace that it's a GC320 revision X and userspace
needs to generate the command stream differently based on that.

So, it seems completely pointless to encode this information in DT.
Rob Herring Dec. 4, 2015, 5:33 p.m. UTC | #5
On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
>> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
>> > Etnaviv follows the same priciple as imx-drm to have a virtual
>> > master device node to bind all the individual GPU cores together
>> > into one DRM device.
>> >
>> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>> > ---
>> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
>> >  1 file changed, 46 insertions(+)
>> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >
>> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> > new file mode 100644
>> > index 000000000000..19fde29dc1d7
>> > --- /dev/null
>> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> > @@ -0,0 +1,46 @@
>> > +Etnaviv DRM master device
>> > +================================
>> > +
>> > +The Etnaviv DRM master device is a virtual device needed to list all
>> > +Vivante GPU cores that comprise the GPU subsystem.
>> > +
>> > +Required properties:
>> > +- compatible: Should be one of
>> > +    "fsl,imx-gpu-subsystem"
>> > +    "marvell,dove-gpu-subsystem"
>> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
>> > +
>> > +example:
>> > +
>> > +gpu-subsystem {
>> > +   compatible = "fsl,imx-gpu-subsystem";
>> > +   cores = <&gpu_2d>, <&gpu_3d>;
>> > +};
>>
>> Yeah, I'm not really a fan of doing this simply because DRM wants 1
>> driver.
>>
> I'm aware of that, but I don't see much value in kicking this discussion
> around for every DRM driver submission. This is the binding that has
> emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> allows us to standardize on _something_. Also ALSA does a similar thing
> to bind codecs and CPU interfaces together.

This case is quite different though I think. The ALSA case and other
DRM cases are ones that have inter-dependencies between the blocks
(e.g. some sort of h/w connection). What is the inter-dependency here?

Doing it this way has also been found to be completely unnecessary and
removed in recent DRM driver reviews. Admittedly, those are cases
where one device can be the master of the others. For 2 parallel
devices, I don't have an alternative other than to question why they
need to be a single driver.


>> > +Vivante GPU core devices
>> > +====================
>> > +
>> > +Required properties:
>> > +- compatible: Should be "vivante,gc"
>>
>> This should at least have the specific cores listed like gc-5000 or
>> whatever the numbering is. As is, I can't even tell if this a 2d or 3d
>> core.
>>
> There is really no need for this. The cores have identification
> registers that are much more accurate than what we could describe with a
> compatible value. So using a more specific compatible just provides
> redundant (and possibly wrong) information.
>
>> Also, it probably should have an SOC specific property to deal with SOC
>> specific configuration or integration.
>>
> Same as above really. Parts of the identification registers are
> different for each SoC integration, even if it's the same IP core, so we
> can just derive any needed driver behavior differences from that.

The h/w designers must have had a clue about s/w. First time for
everything... ;)

Ok, please just add a note to the binding why a more specific
compatible is not needed in this case.

Rob
Russell King - ARM Linux Dec. 4, 2015, 5:54 p.m. UTC | #6
On Fri, Dec 04, 2015 at 11:33:22AM -0600, Rob Herring wrote:
> On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > I'm aware of that, but I don't see much value in kicking this discussion
> > around for every DRM driver submission. This is the binding that has
> > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> > allows us to standardize on _something_. Also ALSA does a similar thing
> > to bind codecs and CPU interfaces together.
> 
> This case is quite different though I think. The ALSA case and other
> DRM cases are ones that have inter-dependencies between the blocks
> (e.g. some sort of h/w connection). What is the inter-dependency here?

Different GPU cores, where each GPU core is a separate device.  On iMX6,
we have three different GPU cores, two of which need to do cross-GPU
synchronisation for accesses to the buffers.

Sure, we could have decided to publish three entirely different DRM
devices, but that has serious implications: we would be highly non-
standard on the DRI and XF86 cases, to the point where it'd probably
be beyond any hope of getting it to work sanely.  DRI2 requires the
DRM device to be used by MESA to be passed across the X protocol to
authorise the MESA side.

It's already icky enough with the separate DRM device for KMS - which,
having poked about with DRI3, I can't see really working at the moment.

> > Same as above really. Parts of the identification registers are
> > different for each SoC integration, even if it's the same IP core, so we
> > can just derive any needed driver behavior differences from that.
> 
> The h/w designers must have had a clue about s/w. First time for
> everything... ;)
> 
> Ok, please just add a note to the binding why a more specific
> compatible is not needed in this case.

If we did want to be safe, we _could_ require something like:

	compatible = "vivante,gc320", "vivante,gc";

but I don't see much value to that, because, from what I can see of the
way the GPU numbering/versioning scheme works, the revision number is
the more important bit of information: I believe this started at 1 which
has been incremented ever since.

This can be seen as the GC600 is a much older core than the GC320.

The GC numbering tells you nothing (as far as can be worked out yet)
about what features, what kind of GPU, or anything like that.
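
For what it's worth, such a fallback scheme would cost nothing on the
driver side; a hedged sketch of the match table (only "vivante,gc" is
in the proposed binding, the gc320 entry is hypothetical):

#include <linux/mod_devicetable.h>
#include <linux/module.h>
#include <linux/of.h>

static const struct of_device_id example_gpu_match[] = {
        { .compatible = "vivante,gc320" },      /* hypothetical, more specific */
        { .compatible = "vivante,gc" },         /* generic fallback */
        { /* sentinel */ }
};
MODULE_DEVICE_TABLE(of, example_gpu_match);

A node carrying compatible = "vivante,gc320", "vivante,gc" would still
bind through the generic entry, which is the usual value of a fallback
compatible.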

Another example: the iMX6 S/DL has a GC880 instead of a GC2000.  While
both are a 3D GPU, they're not really "compatible" with each other -
GC880 has one pixel pipe which requires userspace buffers to be tiled
one way, vs GC2000 which has a different form of tiling to allow two
pixel pipes to operate efficiently.

The number of pixel pipes (and other features) comes from the feature/
specification registers embedded in the device - there are currently six
32-bit registers, each containing 32 feature flags, describing what
features are present, and including what bugs are present and/or have
been fixed.

There has been one "slip-up" along the way, where the Vivante software
folk have comments in the code along the lines of "Hardware people should
talk to us!!!" which tends to suggest that they've hit the communication
problem, and as it's for very early stuff, and nothing later, it suggests
that they've learnt that lesson themselves already. :)
Lucas Stach Dec. 4, 2015, 5:56 p.m. UTC | #7
Am Freitag, den 04.12.2015, 11:33 -0600 schrieb Rob Herring:
> On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
> >> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> >> > Etnaviv follows the same priciple as imx-drm to have a virtual
> >> > master device node to bind all the individual GPU cores together
> >> > into one DRM device.
> >> >
> >> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> >> > ---
> >> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
> >> >  1 file changed, 46 insertions(+)
> >> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> >> >
> >> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> >> > new file mode 100644
> >> > index 000000000000..19fde29dc1d7
> >> > --- /dev/null
> >> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> >> > @@ -0,0 +1,46 @@
> >> > +Etnaviv DRM master device
> >> > +================================
> >> > +
> >> > +The Etnaviv DRM master device is a virtual device needed to list all
> >> > +Vivante GPU cores that comprise the GPU subsystem.
> >> > +
> >> > +Required properties:
> >> > +- compatible: Should be one of
> >> > +    "fsl,imx-gpu-subsystem"
> >> > +    "marvell,dove-gpu-subsystem"
> >> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
> >> > +
> >> > +example:
> >> > +
> >> > +gpu-subsystem {
> >> > +   compatible = "fsl,imx-gpu-subsystem";
> >> > +   cores = <&gpu_2d>, <&gpu_3d>;
> >> > +};
> >>
> >> Yeah, I'm not really a fan of doing this simply because DRM wants 1
> >> driver.
> >>
> > I'm aware of that, but I don't see much value in kicking this discussion
> > around for every DRM driver submission. This is the binding that has
> > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> > allows us to standardize on _something_. Also ALSA does a similar thing
> > to bind codecs and CPU interfaces together.
> 
> This case is quite different though I think. The ALSA case and other
> DRM cases are ones that have inter-dependencies between the blocks
> (e.g. some sort of h/w connection). What is the inter-dependency here?
> 
> Doing this way has also been found to be completely unnecessary and
> removed in recent DRM driver reviews. Admittedly, those are cases
> where one device can be the master of the others. For 2 parallel
> devices, I don't have an alternative other than question why they need
> to be a single driver.
> 
If you insist on doing things differently for this driver, we could add
a pass at driver registration that scans through the DT, looking for
nodes matching the GPU core compatible.
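
Roughly, such a pass would look like the following (a sketch using the
component framework; the function names here are illustrative, not the
actual etnaviv code):

#include <linux/component.h>
#include <linux/device.h>
#include <linux/of.h>

static int example_compare_of(struct device *dev, void *data)
{
        return dev->of_node == data;
}

static void example_add_gpu_components(struct device *master,
                                       struct component_match **match)
{
        struct device_node *np;

        /* Scan the whole DT for GPU core nodes instead of following a
         * "cores" phandle list on a virtual master node. */
        for_each_compatible_node(np, NULL, "vivante,gc") {
                if (!of_device_is_available(np))
                        continue;
                component_match_add(master, match, example_compare_of, np);
        }
}

That removes the virtual master node at the cost of encoding the
grouping policy in the driver instead of the DT.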

I'm not sure that makes things cleaner though, and it might bite us
later on. Also I'm not sure if moving away from the binding scheme
already established for other DRM drivers makes things better from a DT
perspective. Personally I would prefer DT binding consistency over
perfection for single drivers that ends up segmenting the DT binding
space.

Regards,
Lucas
Jon Nettleton Dec. 4, 2015, 6:02 p.m. UTC | #8
On Fri, Dec 4, 2015 at 6:56 PM, Lucas Stach <l.stach@pengutronix.de> wrote:
> Am Freitag, den 04.12.2015, 11:33 -0600 schrieb Rob Herring:
>> On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
>> > Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
>> >> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
>> >> > Etnaviv follows the same priciple as imx-drm to have a virtual
>> >> > master device node to bind all the individual GPU cores together
>> >> > into one DRM device.
>> >> >
>> >> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>> >> > ---
>> >> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
>> >> >  1 file changed, 46 insertions(+)
>> >> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >> >
>> >> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >> > new file mode 100644
>> >> > index 000000000000..19fde29dc1d7
>> >> > --- /dev/null
>> >> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >> > @@ -0,0 +1,46 @@
>> >> > +Etnaviv DRM master device
>> >> > +================================
>> >> > +
>> >> > +The Etnaviv DRM master device is a virtual device needed to list all
>> >> > +Vivante GPU cores that comprise the GPU subsystem.
>> >> > +
>> >> > +Required properties:
>> >> > +- compatible: Should be one of
>> >> > +    "fsl,imx-gpu-subsystem"
>> >> > +    "marvell,dove-gpu-subsystem"
>> >> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
>> >> > +
>> >> > +example:
>> >> > +
>> >> > +gpu-subsystem {
>> >> > +   compatible = "fsl,imx-gpu-subsystem";
>> >> > +   cores = <&gpu_2d>, <&gpu_3d>;
>> >> > +};
>> >>
>> >> Yeah, I'm not really a fan of doing this simply because DRM wants 1
>> >> driver.
>> >>
>> > I'm aware of that, but I don't see much value in kicking this discussion
>> > around for every DRM driver submission. This is the binding that has
>> > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
>> > allows us to standardize on _something_. Also ALSA does a similar thing
>> > to bind codecs and CPU interfaces together.
>>
>> This case is quite different though I think. The ALSA case and other
>> DRM cases are ones that have inter-dependencies between the blocks
>> (e.g. some sort of h/w connection). What is the inter-dependency here?
>>
>> Doing this way has also been found to be completely unnecessary and
>> removed in recent DRM driver reviews. Admittedly, those are cases
>> where one device can be the master of the others. For 2 parallel
>> devices, I don't have an alternative other than question why they need
>> to be a single driver.
>>
> If you insist on doing things differently for this driver, we could add
> a pass at driver registration that scans through the DT, looking for
> nodes matching the GPU core compatible.
>
> I'm not sure if that makes things cleaner though and might bite us later
> on. Also I'm not sure if moving away from the binding scheme already
> established for other DRM drivers makes things better from a DT
> perspective. Personally I would prefer DT binding consistency over
> perfection for single drivers, segmenting the DT binding space.
>

We should also keep in mind that Vivante is working on newer chipsets
that also include multiple independent 3d cores.  I am not even sure
how userspace would deal with this using the suggested changes.
Rob Herring Dec. 4, 2015, 8:19 p.m. UTC | #9
On Fri, Dec 4, 2015 at 11:56 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> Am Freitag, den 04.12.2015, 11:33 -0600 schrieb Rob Herring:
>> On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
>> > Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
>> >> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
>> >> > Etnaviv follows the same priciple as imx-drm to have a virtual
>> >> > master device node to bind all the individual GPU cores together
>> >> > into one DRM device.
>> >> >
>> >> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
>> >> > ---
>> >> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
>> >> >  1 file changed, 46 insertions(+)
>> >> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >> >
>> >> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >> > new file mode 100644
>> >> > index 000000000000..19fde29dc1d7
>> >> > --- /dev/null
>> >> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
>> >> > @@ -0,0 +1,46 @@
>> >> > +Etnaviv DRM master device
>> >> > +================================
>> >> > +
>> >> > +The Etnaviv DRM master device is a virtual device needed to list all
>> >> > +Vivante GPU cores that comprise the GPU subsystem.
>> >> > +
>> >> > +Required properties:
>> >> > +- compatible: Should be one of
>> >> > +    "fsl,imx-gpu-subsystem"
>> >> > +    "marvell,dove-gpu-subsystem"
>> >> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
>> >> > +
>> >> > +example:
>> >> > +
>> >> > +gpu-subsystem {
>> >> > +   compatible = "fsl,imx-gpu-subsystem";
>> >> > +   cores = <&gpu_2d>, <&gpu_3d>;
>> >> > +};
>> >>
>> >> Yeah, I'm not really a fan of doing this simply because DRM wants 1
>> >> driver.
>> >>
>> > I'm aware of that, but I don't see much value in kicking this discussion
>> > around for every DRM driver submission. This is the binding that has
>> > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
>> > allows us to standardize on _something_. Also ALSA does a similar thing
>> > to bind codecs and CPU interfaces together.
>>
>> This case is quite different though I think. The ALSA case and other
>> DRM cases are ones that have inter-dependencies between the blocks
>> (e.g. some sort of h/w connection). What is the inter-dependency here?
>>
>> Doing this way has also been found to be completely unnecessary and
>> removed in recent DRM driver reviews. Admittedly, those are cases
>> where one device can be the master of the others. For 2 parallel
>> devices, I don't have an alternative other than question why they need
>> to be a single driver.
>>
> If you insist on doing things differently for this driver, we could add
> a pass at driver registration that scans through the DT, looking for
> nodes matching the GPU core compatible.

I've not insisted on anything. I'm only asking a question which didn't
get answered. I'll ask another way. Why can't you have 2 instances of
the same driver given they are only rendering nodes?

> I'm not sure if that makes things cleaner though and might bite us later
> on. Also I'm not sure if moving away from the binding scheme already
> established for other DRM drivers makes things better from a DT
> perspective. Personally I would prefer DT binding consistency over
> perfection for single drivers, segmenting the DT binding space.

This is the least of our issues in terms of consistency among drivers,
but that is in fact what I'm pushing for. This is probably the first
case of a render only driver (at least for DT). So this isn't a case
of just following what others are doing.

The h/w in this area can be quite different, so the DT bindings are
going to reflect that to some extent. A virtual node makes sense in
some cases, but for others it may not.

Rob
Russell King - ARM Linux Dec. 4, 2015, 8:31 p.m. UTC | #10
On Fri, Dec 04, 2015 at 02:19:42PM -0600, Rob Herring wrote:
> On Fri, Dec 4, 2015 at 11:56 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > Am Freitag, den 04.12.2015, 11:33 -0600 schrieb Rob Herring:
> >> On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> >> > Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
> >> >> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> >> >> > Etnaviv follows the same priciple as imx-drm to have a virtual
> >> >> > master device node to bind all the individual GPU cores together
> >> >> > into one DRM device.
> >> >> >
> >> >> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> >> >> > ---
> >> >> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
> >> >> >  1 file changed, 46 insertions(+)
> >> >> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> >> >> >
> >> >> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> >> >> > new file mode 100644
> >> >> > index 000000000000..19fde29dc1d7
> >> >> > --- /dev/null
> >> >> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> >> >> > @@ -0,0 +1,46 @@
> >> >> > +Etnaviv DRM master device
> >> >> > +================================
> >> >> > +
> >> >> > +The Etnaviv DRM master device is a virtual device needed to list all
> >> >> > +Vivante GPU cores that comprise the GPU subsystem.
> >> >> > +
> >> >> > +Required properties:
> >> >> > +- compatible: Should be one of
> >> >> > +    "fsl,imx-gpu-subsystem"
> >> >> > +    "marvell,dove-gpu-subsystem"
> >> >> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
> >> >> > +
> >> >> > +example:
> >> >> > +
> >> >> > +gpu-subsystem {
> >> >> > +   compatible = "fsl,imx-gpu-subsystem";
> >> >> > +   cores = <&gpu_2d>, <&gpu_3d>;
> >> >> > +};
> >> >>
> >> >> Yeah, I'm not really a fan of doing this simply because DRM wants 1
> >> >> driver.
> >> >>
> >> > I'm aware of that, but I don't see much value in kicking this discussion
> >> > around for every DRM driver submission. This is the binding that has
> >> > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> >> > allows us to standardize on _something_. Also ALSA does a similar thing
> >> > to bind codecs and CPU interfaces together.
> >>
> >> This case is quite different though I think. The ALSA case and other
> >> DRM cases are ones that have inter-dependencies between the blocks
> >> (e.g. some sort of h/w connection). What is the inter-dependency here?
> >>
> >> Doing this way has also been found to be completely unnecessary and
> >> removed in recent DRM driver reviews. Admittedly, those are cases
> >> where one device can be the master of the others. For 2 parallel
> >> devices, I don't have an alternative other than question why they need
> >> to be a single driver.
> >>
> > If you insist on doing things differently for this driver, we could add
> > a pass at driver registration that scans through the DT, looking for
> > nodes matching the GPU core compatible.
> 
> I've not insisted on anything. I'm only asking a question which didn't
> get answered. I'll ask another way. Why can't you have 2 instances of
> the same driver given they are only rendering nodes?

Sorry, but it _did_ get answered - I answered that in my reply to you.
I'll repeat it again, but more briefly, and then expand on it: it's what
userspace like Xorg DRI2 and MESA want.

Yes, there's DRI3, which is more modern and in theory allows multiple
renderers to be opened by the client, but so far I fail to see how that
can work with a separate KMS DRM driver.  It _may_ be intended to, but
the problem I see here is that when you have the KMS hardware only
capable of scanning out linear buffers, but the GPU hardware is only
capable of rendering to tiled buffers, there needs to be some way to
allocate KMS buffers in the client, and right now I don't see any way
to know what the KMS DRM device being used is in the DRI3/Present Xorg
extensions.

Moreover, DRI3 is not yet available for Gallium, so if we're talking
about Xorg, then functional DRI2 is a requirement, and that _needs_
to have a single device for the rendering instances.  Xorg has no way
to pass multiple render nodes to client over DRI2.
Ilia Mirkin Dec. 4, 2015, 8:42 p.m. UTC | #11
On Fri, Dec 4, 2015 at 3:31 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> Moreover, DRI3 is not yet available for Gallium, so if we're talking
> about Xorg, then functional DRI2 is a requirement, and that _needs_
> to have a single device for the rendering instances.  Xorg has no way
> to pass multiple render nodes to client over DRI2.

Just to correct... DRI3 has been available on gallium [at least in the
context of st/mesa] basically since DRI3 was introduced. Not sure what
issue you're fighting with, but it's definitely not a gallium
limitation... could be something related to platform devices.

It is also possible to use DRI_PRIME=<udev idpath> to select which
device to use. Not sure whether that has been available since the
beginning or not.

That said e.g. st/vdpau still doesn't deal with DRI3 which is a bit
annoying. And your other points all still stand (buffer sharing =
pain, tiling = not figured out, etc).

  -ilia
Daniel Vetter Dec. 4, 2015, 8:43 p.m. UTC | #12
On Fri, Dec 04, 2015 at 08:31:01PM +0000, Russell King - ARM Linux wrote:
> On Fri, Dec 04, 2015 at 02:19:42PM -0600, Rob Herring wrote:
> > On Fri, Dec 4, 2015 at 11:56 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > > Am Freitag, den 04.12.2015, 11:33 -0600 schrieb Rob Herring:
> > >> On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > >> > Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
> > >> >> On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> > >> >> > Etnaviv follows the same priciple as imx-drm to have a virtual
> > >> >> > master device node to bind all the individual GPU cores together
> > >> >> > into one DRM device.
> > >> >> >
> > >> >> > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> > >> >> > ---
> > >> >> >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
> > >> >> >  1 file changed, 46 insertions(+)
> > >> >> >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > >> >> >
> > >> >> > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > >> >> > new file mode 100644
> > >> >> > index 000000000000..19fde29dc1d7
> > >> >> > --- /dev/null
> > >> >> > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > >> >> > @@ -0,0 +1,46 @@
> > >> >> > +Etnaviv DRM master device
> > >> >> > +================================
> > >> >> > +
> > >> >> > +The Etnaviv DRM master device is a virtual device needed to list all
> > >> >> > +Vivante GPU cores that comprise the GPU subsystem.
> > >> >> > +
> > >> >> > +Required properties:
> > >> >> > +- compatible: Should be one of
> > >> >> > +    "fsl,imx-gpu-subsystem"
> > >> >> > +    "marvell,dove-gpu-subsystem"
> > >> >> > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
> > >> >> > +
> > >> >> > +example:
> > >> >> > +
> > >> >> > +gpu-subsystem {
> > >> >> > +   compatible = "fsl,imx-gpu-subsystem";
> > >> >> > +   cores = <&gpu_2d>, <&gpu_3d>;
> > >> >> > +};
> > >> >>
> > >> >> Yeah, I'm not really a fan of doing this simply because DRM wants 1
> > >> >> driver.
> > >> >>
> > >> > I'm aware of that, but I don't see much value in kicking this discussion
> > >> > around for every DRM driver submission. This is the binding that has
> > >> > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> > >> > allows us to standardize on _something_. Also ALSA does a similar thing
> > >> > to bind codecs and CPU interfaces together.
> > >>
> > >> This case is quite different though I think. The ALSA case and other
> > >> DRM cases are ones that have inter-dependencies between the blocks
> > >> (e.g. some sort of h/w connection). What is the inter-dependency here?
> > >>
> > >> Doing this way has also been found to be completely unnecessary and
> > >> removed in recent DRM driver reviews. Admittedly, those are cases
> > >> where one device can be the master of the others. For 2 parallel
> > >> devices, I don't have an alternative other than question why they need
> > >> to be a single driver.
> > >>
> > > If you insist on doing things differently for this driver, we could add
> > > a pass at driver registration that scans through the DT, looking for
> > > nodes matching the GPU core compatible.
> > 
> > I've not insisted on anything. I'm only asking a question which didn't
> > get answered. I'll ask another way. Why can't you have 2 instances of
> > the same driver given they are only rendering nodes?
> 
> Sorry, but it _did_ get answered - I answered that in my reply to you.
> I'll repeat it again, but more briefly, and then expand on it: it's what
> userspace like Xorg DRI2 and MESA want.
> 
> Yes, there's DRI3, which is more modern and in theory allows multiple
> renderers to be opened by the client, but so far I fail to see how that
> can work with a separate KMS DRM driver.  It _may_ be intended to, but
> the problem I see here is that when you have the KMS hardware only
> capable of scanning out linear buffers, but the GPU hardware is only
> capable of rendering to tiled buffers, there needs to be some way to
> allocate KMS buffers in the client, and right now I don't see any way
> to know what the KMS DRM device being used is in the DRI3/Present Xorg
> extensions.
> 
> Moreover, DRI3 is not yet available for Gallium, so if we're talking
> about Xorg, then functional DRI2 is a requirement, and that _needs_
> to have a single device for the rendering instances.  Xorg has no way
> to pass multiple render nodes to client over DRI2.

The only thing that DRI2 needs is that both client and X use the same
device. But you can just dma-buf share stuff in the client (e.g. we had
some code at intel to decode videos in libva with some codec chip, then
dma-buf share to i915.ko for post-proc, then dri2 flink to the X server).
Same on the X server, your X driver could dma-buf share stuff between
the 3d core, 2d core and whatever kms driver is used for display and throw
a good party with all of them ;-) Of course for that sharing you need to
open everything as rendernodes, except for the one driver you use for
dri2, which needs to run in master mode in X to be able to open flink
names.
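
The client-side sharing described above boils down to PRIME
export/import between the two fds; a minimal sketch with libdrm (the
fds and the source handle are placeholders obtained elsewhere, error
paths trimmed):

#include <stdint.h>
#include <unistd.h>
#include <xf86drm.h>

/* Sketch: move a GEM buffer from one DRM device to another via a
 * dma-buf fd (PRIME). */
static int example_share_buffer(int fd_src, uint32_t handle_src,
                                int fd_dst, uint32_t *handle_dst)
{
        int dmabuf_fd, ret;

        ret = drmPrimeHandleToFD(fd_src, handle_src, DRM_CLOEXEC, &dmabuf_fd);
        if (ret)
                return ret;

        ret = drmPrimeFDToHandle(fd_dst, dmabuf_fd, handle_dst);
        close(dmabuf_fd);       /* the imported handle keeps its own reference */
        return ret;
}

The flink name for DRI2 then only ever comes from the one device opened
in master mode, as described above.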

So drm doesn't want a single driver at all; it's purely a question of what
makes sense wrt code organisation and reusing of drivers (for same or
similar IP blocks) across different products. Personally I'm leaning
towards smashing related things together since then you can have some good
tailor-made infrastructure (like e.g. all the big drivers have with their
sometimes massive internal abstraction).
-Daniel
Russell King - ARM Linux Dec. 4, 2015, 10:05 p.m. UTC | #13
On Fri, Dec 04, 2015 at 03:42:47PM -0500, Ilia Mirkin wrote:
> On Fri, Dec 4, 2015 at 3:31 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > Moreover, DRI3 is not yet available for Gallium, so if we're talking
> > about Xorg, then functional DRI2 is a requirement, and that _needs_
> > to have a single device for the rendering instances.  Xorg has no way
> > to pass multiple render nodes to client over DRI2.
> 
> Just to correct... DRI3 has been available on gallium [at least in the
> context of st/mesa] basically since DRI3 was introduced. Not sure what
> issue you're fighting with, but it's definitely not a gallium
> limitation... could be something related to platform devices.

Well, my statement is based on the fact that there's nothing in
src/gallium/state-tracker/dri which hints at being DRI3.  Maybe it's
implemented differently, I don't know.  I never wanted to hack MESA -
I'm supposed to be the ARM kernel maintainer - and I'm still very new
to MESA.

I think it's a DRI3 limitation.  The issue with the DRI3 design is that:

* The client has access to the GPU render nodes only, but not the
  corresponding KMS node.
* Buffers in DRI3 are allocated from the GPU render nodes.
* The Xorg Present protocol is then used to manage the vblank
  synchronisation and page flips.

Now, the KMS scanout hardware typically does not support any kind of
scatter-gather: the buffers it has must be contiguous.  These can be
allocated from the KMS DRM device.

However, the DRI3 client has no access to the KMS DRM device to allocate
linear buffers from, and GPUs typically don't have dumb buffer support.
Hence, the client can't pass a suitable buffer to the present layer.

Hence, I can see no way for the resource_create() to be able to allocate
any kind of scanout capable buffer.
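
To make the missing piece concrete: the linear buffer would have to
come from the KMS node via a dumb-buffer allocation roughly like the
sketch below, and that is precisely the fd a DRI3 client does not hold
(width/height here are placeholders):

#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

#include <drm/drm.h>
#include <drm/drm_mode.h>

/* Sketch: allocate a linear scanout buffer on the KMS device.  kms_fd
 * is the KMS DRM node, which a client holding only the render node
 * cannot reach. */
static int example_alloc_scanout(int kms_fd, uint32_t width, uint32_t height,
                                 uint32_t *handle, uint32_t *pitch)
{
        struct drm_mode_create_dumb creq;

        memset(&creq, 0, sizeof(creq));
        creq.width = width;
        creq.height = height;
        creq.bpp = 32;  /* e.g. XRGB8888 */

        if (ioctl(kms_fd, DRM_IOCTL_MODE_CREATE_DUMB, &creq) < 0)
                return -1;

        *handle = creq.handle;
        *pitch = creq.pitch;
        return 0;
}

The resulting handle could then be PRIME-exported to the render node,
but only if the client can open the KMS node in the first place, which
is exactly the gap described above.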

That's a separate issue though: you've pointed out that you can select
which render node to use: what if we want to use multiple render nodes
simultaneously - eg, because we want to use multiple 3D GPU cores
together?  How does that work with stuff?

I think the idea that individual GPU cores should be exposed as
separate DRM devices is fundamentally flawed, and adds a hell of a
lot of complexity.

In any case, I've spent _way_ too much time on etnaviv during November -
quite literally almost every day (I worked out that I was producing 4.5
patches a day during November for Etnaviv MESA.)  I'm afraid that it's
now time that I switch my attention elsewhere, so if this means that
Etnaviv is rejected for merging, I'm afraid it'll be a few months before
I can get back to it.

It would have been nice if these issues had been brought up much earlier,
during the RFC posting of the patches.  These are nothing new, and I now
feel that this whole thing has been a total waste of time.
Ilia Mirkin Dec. 4, 2015, 10:43 p.m. UTC | #14
On Fri, Dec 4, 2015 at 5:05 PM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Fri, Dec 04, 2015 at 03:42:47PM -0500, Ilia Mirkin wrote:
>> On Fri, Dec 4, 2015 at 3:31 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>> > Moreover, DRI3 is not yet available for Gallium, so if we're talking
>> > about Xorg, then functional DRI2 is a requirement, and that _needs_
>> > to have a single device for the rendering instances.  Xorg has no way
>> > to pass multiple render nodes to client over DRI2.
>>
>> Just to correct... DRI3 has been available on gallium [at least in the
>> context of st/mesa] basically since DRI3 was introduced. Not sure what
>> issue you're fighting with, but it's definitely not a gallium
>> limitation... could be something related to platform devices.
>
> Well, my statement is based on the fact that there's nothing in
> src/gallium/state-tracker/dri which hints at being DRI3.  Maybe it's
> implemented differently, I don't know.  I never wanted to hack MESA -
> I'm supposed to be the ARM kernel maintainer - and I'm still very new
> to MESA.
>
> I think it's a DRI3 limitation.  The issue with the DRI3 design is that:
>
> * The client has access to the GPU render nodes only, but not the
>   corresponding KMS node.
> * Buffers in DRI3 are allocated from the GPU render nodes.
> * The Xorg Present protocol is then used to manage the vblank
>   synchonisation and page flips.
>
> Now, the KMS scanout hardware typically does not support any kind of
> scatter-gather: the buffers it has must be contiguous.  These can be
> allocated from the KMS DRM device.
>
> However, the DRI3 client has no access to the KMS DRM device to allocate
> linear buffers from, and GPUs typically don't have dumb buffer support.
> Hence, the client can't pass a suitable buffer to the present layer.
>
> Hence, I can see no way for the resource_create() to be able to allocate
> any kind of scanout capable buffer.
>
> That's a separate issue though: you've pointed out that you can select
> which render node to use: what if we want to use multiple render nodes
> simultaneously - eg, because we want to use multiple 3D GPU cores
> together?  How does that work with stuff?

This is a bit like the SLI question -- let's say you have 2 pricey
desktop GPUs, with a fast interconnect between them, which lets them
know about each other: how do you make use of that? Solution: unsolved
:)

In an ideal world, you'd have a single driver that knows how to
interact with multiple devices and make them do cool things. However
this a completely untrodden path. (Not to mention the problem of *how*
to break up a workload across 2 GPUs.)

>
> I think the idea that individual GPU cores should be exposed as
> separate DRM devices is fundamentally flawed, and adds a hell of a
> lot of complexity.
>
> In any case, I've spent _way_ too much time on etnaviv during November -
> quite literally almost every day (I worked out that I was producing 4.5
> patches a day during November for Etnaviv MESA.)  I'm afraid that it's
> now time that I switch my attention elsewhere, so if this means that
> Etnaviv is rejected for merging, I'm afraid it'll be a few months before
> I can get back to it.
>
> It would have been nice if these issues had been brought up much earlier,
> during the RFC posting of the patches.  These are nothing new, and I now
> feel that this whole thing has been a total waste of time.

The whole multi-gpu thing is a bit of an open question right now. It
works in theory, but in practice nobody's done it. Maarten tried to
get nouveau/gk20a + tegra/drm on Jetson TK1 to play nice with, e.g., X
2d accel, and it was an exercise in pain. Not sure that he ever
succeeded.

I think it's unfair to try to make new hardware enablement the time to
attempt to heap extra work onto those authors unfortunate enough to
have slightly unorthodox hardware that maps nicely onto some
desired-but-not-quite-there-yet usage model -- they have enough
problems.

The way everything works is one drm node can do everything. PRIME is a
barely-functional way to offload some things onto a (single) secondary
GPU. Everything beyond that is purely theoretical.

None of what's being done now prevents some future where these things
are broken up. No need to force it now.

  -ilia
Daniel Vetter Dec. 5, 2015, 10:12 a.m. UTC | #15
On Fri, Dec 04, 2015 at 05:43:33PM -0500, Ilia Mirkin wrote:
> On Fri, Dec 4, 2015 at 5:05 PM, Russell King - ARM Linux
> <linux@arm.linux.org.uk> wrote:
> > On Fri, Dec 04, 2015 at 03:42:47PM -0500, Ilia Mirkin wrote:
> >> On Fri, Dec 4, 2015 at 3:31 PM, Russell King - ARM Linux
> >> <linux@arm.linux.org.uk> wrote:
> >> > Moreover, DRI3 is not yet available for Gallium, so if we're talking
> >> > about Xorg, then functional DRI2 is a requirement, and that _needs_
> >> > to have a single device for the rendering instances.  Xorg has no way
> >> > to pass multiple render nodes to client over DRI2.
> >>
> >> Just to correct... DRI3 has been available on gallium [at least in the
> >> context of st/mesa] basically since DRI3 was introduced. Not sure what
> >> issue you're fighting with, but it's definitely not a gallium
> >> limitation... could be something related to platform devices.
> >
> > Well, my statement is based on the fact that there's nothing in
> > src/gallium/state-tracker/dri which hints at being DRI3.  Maybe it's
> > implemented differently, I don't know.  I never wanted to hack MESA -
> > I'm supposed to be the ARM kernel maintainer - and I'm still very new
> > to MESA.
> >
> > I think it's a DRI3 limitation.  The issue with the DRI3 design is that:
> >
> > * The client has access to the GPU render nodes only, but not the
> >   corresponding KMS node.
> > * Buffers in DRI3 are allocated from the GPU render nodes.
> > * The Xorg Present protocol is then used to manage the vblank
> >   synchonisation and page flips.
> >
> > Now, the KMS scanout hardware typically does not support any kind of
> > scatter-gather: the buffers it has must be contiguous.  These can be
> > allocated from the KMS DRM device.
> >
> > However, the DRI3 client has no access to the KMS DRM device to allocate
> > linear buffers from, and GPUs typically don't have dumb buffer support.
> > Hence, the client can't pass a suitable buffer to the present layer.

Oh right, buffer alloc if you have special constraints won't work with
DRI3 as-is. For that we probably need something like DRI2 for buffer alloc
+ Present (like DRI3 does) for flipping them.

> > Hence, I can see no way for the resource_create() to be able to allocate
> > any kind of scanout capable buffer.
> >
> > That's a separate issue though: you've pointed out that you can select
> > which render node to use: what if we want to use multiple render nodes
> > simultaneously - eg, because we want to use multiple 3D GPU cores
> > together?  How does that work with stuff?
> 
> This is a bit like the SLI question -- let's say you have 2 pricey
> desktop GPUs, with a fast interconnect between them, which lets them
> know about each other, how do you make use of that. Solution: unsolved
> :)
> 
> In an ideal world, you'd have a single driver that knows how to
> interact with multiple devices and make them do cool things. However
> this a completely untrodden path. (Not to mention the problem of *how*
> to break up a workload across 2 GPUs.)
> 
> >
> > I think the idea that individual GPU cores should be exposed as
> > separate DRM devices is fundamentally flawed, and adds a hell of a
> > lot of complexity.
> >
> > In any case, I've spent _way_ too much time on etnaviv during November -
> > quite literally almost every day (I worked out that I was producing 4.5
> > patches a day during November for Etnaviv MESA.)  I'm afraid that it's
> > now time that I switch my attention elsewhere, so if this means that
> > Etnaviv is rejected for merging, I'm afraid it'll be a few months before
> > I can get back to it.
> >
> > It would have been nice if these issues had been brought up much earlier,
> > during the RFC posting of the patches.  These are nothing new, and I now
> > feel that this whole thing has been a total waste of time.
> 
> The whole multi-gpu thing is a bit of an open question right now. It
> works in theory, but in practice nobody's done it. Maarten tried to
> get nouveau/gk20a + tegra/drm on Jetson TK1 to play nice with, e.g., X
> 2d accel, and it was an exercise in pain. Not sure that he ever
> succeeded.
> 
> I think it's unfair to try to make new hardware enablement the time to
> attempt to heap extra work onto those authors unfortunate enough to
> have slightly unorthodox hardware that maps nicely onto some
> desired-but-not-quite-there-yet usage model -- they have enough
> problems.
>
> The way everything works is one drm node can do everything. PRIME is a
> barely-functional way to offload some things onto a (single) secondary
> GPU. Everything beyond that is purely theoretical.

One thing that doesn't yet work with PRIME is synchronization. But at least
for the render->display combo Alex Goins from nvidia fixed it up, and it's
pretty much trivial to do so (2 patches on i915 since we needed to support
both atomic and legacy page_flip, would have been just 1 tiny patch
otherwise). i915->nvidia would have been more work because of some locking
fun, but the infrastructure is now all there (including patches for Xorg).

> None of what's being done now prevents some future where these things
> are broken up. No need to force it now.

See my other mail, but intel implemented the 2 drm drivers, one logical
thing model. Unfortunately didn't go upstream, but definitely possible.

But the real question is whether it makes sense to split drivers, and imo
for that question the only criteria should be:
- How different are the blocks? When there's a major architectural shift
  across everything from a vendor then splitting makes sense. Looking at
  etnaviv it seems like there's at least somewhat a unified frontend on
  top of these different blocks, so makes sense to keep them together in
  one driver.

- How likely is the IP block to be reused in a totally different SoC/GPU?
  That's why we've done the split in intel for libva since that additional
  core was a licensed IP core also used in other SoCs. Since the proposed
  etnaviv driver can already work together with 2 different display
  drivers we seem to have sufficient flexibility here. So if someone would
  combine e.g. etnaviv 2d core with some other 3d core it should pretty
  much just work. Well there's some fun maybe on the userspace side to
  glue things together, but not on the kernel.

Given that, I think the current etnaviv is a sound architecture. And I'm
not saying that because drm requires everything to be smashed into one
driver, since that's simply not the case.

Cheers, Daniel
Russell King - ARM Linux Dec. 5, 2015, 11:02 a.m. UTC | #16
On Sat, Dec 05, 2015 at 11:12:08AM +0100, Daniel Vetter wrote:
> Given that I think the current etnaviv is a sound architecture. And I'm
> not saying that because drm requires everything to be smashed into one
> driver, since that's simple not the case.

There are other reasons as well, mostly from the performance point of view.
Having separate DRM devices for each GPU means you need to use dmabuf to
share buffers across the GPUs.  This brings with it several kinds of
overhead:

1. having more fd usage in the client programs.
2. having more memory usage as a result.
3. having more locks, due to more object lists.
4. having the overhead from the DMA API when importing buffers between
   different GPU nodes.

From my performance measurements over the last month, the top hit is
currently from the DMA API, so having to export and import buffers
between different GPU devices is just going to make that worse.
Lucas Stach Dec. 5, 2015, 11:26 a.m. UTC | #17
Am Freitag, den 04.12.2015, 14:19 -0600 schrieb Rob Herring:
> On Fri, Dec 4, 2015 at 11:56 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > Am Freitag, den 04.12.2015, 11:33 -0600 schrieb Rob Herring:
> > > On Fri, Dec 4, 2015 at 10:41 AM, Lucas Stach <l.stach@pengutronix.de> wrote:
> > > > Am Freitag, den 04.12.2015, 10:29 -0600 schrieb Rob Herring:
> > > > > On Fri, Dec 04, 2015 at 02:59:54PM +0100, Lucas Stach wrote:
> > > > > > Etnaviv follows the same priciple as imx-drm to have a virtual
> > > > > > master device node to bind all the individual GPU cores together
> > > > > > into one DRM device.
> > > > > > 
> > > > > > Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
> > > > > > ---
> > > > > >  .../bindings/display/etnaviv/etnaviv-drm.txt       | 46 ++++++++++++++++++++++
> > > > > >  1 file changed, 46 insertions(+)
> > > > > >  create mode 100644 Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > > > > > 
> > > > > > diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > > > > > new file mode 100644
> > > > > > index 000000000000..19fde29dc1d7
> > > > > > --- /dev/null
> > > > > > +++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
> > > > > > @@ -0,0 +1,46 @@
> > > > > > +Etnaviv DRM master device
> > > > > > +================================
> > > > > > +
> > > > > > +The Etnaviv DRM master device is a virtual device needed to list all
> > > > > > +Vivante GPU cores that comprise the GPU subsystem.
> > > > > > +
> > > > > > +Required properties:
> > > > > > +- compatible: Should be one of
> > > > > > +    "fsl,imx-gpu-subsystem"
> > > > > > +    "marvell,dove-gpu-subsystem"
> > > > > > +- cores: Should contain a list of phandles pointing to Vivante GPU devices
> > > > > > +
> > > > > > +example:
> > > > > > +
> > > > > > +gpu-subsystem {
> > > > > > +   compatible = "fsl,imx-gpu-subsystem";
> > > > > > +   cores = <&gpu_2d>, <&gpu_3d>;
> > > > > > +};
> > > > > 
> > > > > Yeah, I'm not really a fan of doing this simply because DRM wants 1
> > > > > driver.
> > > > > 
> > > > I'm aware of that, but I don't see much value in kicking this discussion
> > > > around for every DRM driver submission. This is the binding that has
> > > > emerged from a lengthy discussion at KS 2013 in Edinburgh and at least
> > > > allows us to standardize on _something_. Also ALSA does a similar thing
> > > > to bind codecs and CPU interfaces together.
> > > 
> > > This case is quite different though I think. The ALSA case and other
> > > DRM cases are ones that have inter-dependencies between the blocks
> > > (e.g. some sort of h/w connection). What is the inter-dependency here?
> > > 
> > > Doing this way has also been found to be completely unnecessary and
> > > removed in recent DRM driver reviews. Admittedly, those are cases
> > > where one device can be the master of the others. For 2 parallel
> > > devices, I don't have an alternative other than question why they need
> > > to be a single driver.
> > > 
> > If you insist on doing things differently for this driver, we could add
> > a pass at driver registration that scans through the DT, looking for
> > nodes matching the GPU core compatible.
> 
> I've not insisted on anything. I'm only asking a question which didn't
> get answered. I'll ask another way. Why can't you have 2 instances of
> the same driver given they are only rendering nodes?
> 
> > I'm not sure if that makes things cleaner though and might bite us later
> > on. Also I'm not sure if moving away from the binding scheme already
> > established for other DRM drivers makes things better from a DT
> > perspective. Personally I would prefer DT binding consistency over
> > perfection for single drivers, segmenting the DT binding space.
> 
> This is the least of our issues in terms of consistency among
> drivers,
> but that is in fact what I'm pushing for. This is probably the first
> case of a render only driver (at least for DT). So this isn't a case
> of just follow what others are doing.
> 
> The h/w in this area can be quite different, so the DT bindings are
> going to reflect that to some extent. A virtual node makes sense in
> some cases, but for others it may not.
> 
I see where you are going here and I appreciate that this discussion
isn't an exercise in bikeshedding, but based on technical facts.

So let me try to explain things from the other way around:
We made the decision to have a single DRM device for all the Vivante
GPU nodes in a system based on technical merits, not because DRM wants
us to do this, but because it has practical upsides for the
implementation of the driver.

1. It makes buffer management and information sharing between the
cores that are likely to be used together vastly easier for the use-
cases seen today. Having one DRM device per core would be possible, but
would make things a lot harder implementation wise.

2. It will allow us to share resources such as the GPU page tables,
once we move to per-client address spaces, reducing the footprint of
memory that we need to allocate out of CMA.

3. It makes submit fencing look the same regardless of the core
configuration. There are configurations where a 2D and a 3D core are
sitting behind a shared frontend (Dove) and some where each engine has
its own frontend (i.MX6). Having a single DRM driver allows us to make
both configurations look the same to userspace from a fencing
perspective.

There are probably some more arguments that have escaped the top of my
head right now. Regardless of how the DT bindings end up, we won't move
away from the single DRM device design.

So the question is: given the above, are you opposed to having a
virtual node in DT to describe this master device?
I already sketched up the alternative of having the master driver scan
the DT for matching GPU nodes at probe time and binding them together
into a single device. But given that we end up with one master device
anyways, do you really prefer this over the virtual node, which is a
working and proven solution to this exact problem?

Regards,
Lucas
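The scanning alternative Lucas sketches above could look roughly like the
following with the component framework; compare_of() and the master ops are
assumed placeholders and node refcounting is elided, so this is illustrative
rather than the proposed etnaviv code:

#include <linux/component.h>
#include <linux/of.h>
#include <linux/platform_device.h>

static int compare_of(struct device *dev, void *data)
{
	return dev->of_node == data;
}

/* Walk the tree for GPU core nodes instead of reading a "cores"
 * property from a virtual master node. */
static int etnaviv_bind_cores(struct device *master,
			      const struct component_master_ops *ops)
{
	struct component_match *match = NULL;
	struct device_node *np;

	for_each_compatible_node(np, NULL, "vivante,gc") {
		if (!of_device_is_available(np))
			continue;
		component_match_add(master, &match, compare_of, np);
	}

	if (!match)
		return -ENODEV;

	return component_master_add_with_match(master, ops, match);
}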
Russell King - ARM Linux Dec. 5, 2015, 11:42 a.m. UTC | #18
On Sat, Dec 05, 2015 at 12:26:19PM +0100, Lucas Stach wrote:
> I see where you are going here and I appreciate that this discussion
> isn't an exercise in bikeshedding, but based on technical facts.

It would be nice (for the sake of this discussion not getting heated)
if Rob could show that he's been reading my replies to his questions.
My impression so far is that I'm being ignored.  Evidence being Rob's
assertion that his points have not been addressed, where I've replied
with the technical responses - and not one of those has seen any kind
of reply.
Russell King - ARM Linux Dec. 5, 2015, 12:17 p.m. UTC | #19
On Sat, Dec 05, 2015 at 12:26:19PM +0100, Lucas Stach wrote:
> I already sketched up the alternative of having the master driver scan
> the DT for matching GPU nodes at probe time and binding them together
> into a single device. But given that we end up with one master device
> anyways, do you really prefer this over the virtual node, which is a
> working and proven solution to this exact problem?

I really don't think that would be a sane approach for several reasons:

1. We end up with a load of platform devices which are just dangling.
2. We lose the ability to runtime PM manage each GPU core separately,
   along with the PM domain infrastructure to manage the separate GPU
   power domains, as we wouldn't be able to bind to the individual
   struct devices.
3. We still need some kind of master struct device to trigger the
   registration of devices, and to act as the struct device for DRM
   and DMA allocations.

I can't see any advantages to such an approach, only downsides.  Hence,
I'd say that such an approach is completely unsuitable - especially as
the 3D GPUs consume much more power than their 2D counterparts, so it's
very advantageous to have the individual GPU core power management that
the current structure elegantly gives us.

To do the above, we'd probably have to re-implement our own private
per-GPU runtime PM infrastructure, and somehow couple that into the PM
domains stuff - or even bring the SoC specific PM domain manipulation
into the GPU driver.

The only way I can see around that would be to keep the existing driver
structure, but have a non-driver shim which scans the DT and is
responsible for creating the platform devices - duplicating in effect
what the platform device code in drivers/of is already doing but
specifically for this driver.  It'd also be responsible for creating
the master device.  In other words, it would sit halfway between
DT and non-DT solutions: DT would be dealt with in the shim
layer and the etnaviv driver itself would effectively know nothing
about DT.
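As a rough illustration (not the actual etnaviv sources) of the structure
being argued for here: each core keeps its own platform driver, so runtime
PM and the PM domain code see a real struct device, while the component
framework glues the cores to the master DRM device. The bind/unbind bodies
are stubs standing in for the real wiring:

#include <linux/component.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>

static int gpu_core_bind(struct device *dev, struct device *master,
			 void *data)
{
	/* Here the core would be wired into the shared DRM device. */
	return 0;
}

static void gpu_core_unbind(struct device *dev, struct device *master,
			    void *data)
{
}

static const struct component_ops gpu_core_ops = {
	.bind   = gpu_core_bind,
	.unbind = gpu_core_unbind,
};

static int gpu_core_probe(struct platform_device *pdev)
{
	/* Runtime PM attaches to this core's own struct device, so its
	 * power domain can be gated independently of the other cores. */
	pm_runtime_enable(&pdev->dev);

	return component_add(&pdev->dev, &gpu_core_ops);
}

static int gpu_core_remove(struct platform_device *pdev)
{
	component_del(&pdev->dev, &gpu_core_ops);
	pm_runtime_disable(&pdev->dev);
	return 0;
}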
Daniel Vetter Dec. 5, 2015, 3:35 p.m. UTC | #20
On Sat, Dec 05, 2015 at 11:02:09AM +0000, Russell King - ARM Linux wrote:
> On Sat, Dec 05, 2015 at 11:12:08AM +0100, Daniel Vetter wrote:
> > Given that I think the current etnaviv is a sound architecture. And I'm
> > not saying that because drm requires everything to be smashed into one
> > driver, since that's simply not the case.
> 
> There's other reasons as well, mostly from the performance point of view.
> Having separate DRM devices for each GPU means you need to use dmabuf to
> share buffers across the GPUs.  This brings with it several kinds of
> overhead:
> 
> 1. having more fd usage in the client programs.
> 2. having more memory usage as a result.
> 3. having more locks, due to more object lists.
> 4. having the overhead from the DMA API when importing buffers between
>    different GPU nodes.
> 
> From my performance measurements over the last month, the top hit is
> currently from the DMA API, so having to export and import buffers
> between different GPU devices is just going to make that worse.

Yeah dma-api for shared buffers is currently not up to the challenge, we
had the same problem with the intel driver. Big one is that gpus tend to
do cache management themselves, or at least all need cpu caches to be
flushed. But if you have multiple drivers using the same memory nothing
currently keeps track of whether caches are flushed already or not, so you
end up with double or worse flushing of your buffers. Which totally kills
performance.

In theory dma-buf could keep track of who's flushed a buffer already, but
there's no implementation of that yet. And for a generic one we'd need to
violate the current dma api abstractions. So yeah, perf is going to tank
until that's solved, at least for some workloads. Video wasn't a problem
here since all you do is establish a set of shared buffers once, making
all the overhead just one-time. But dynamic workloads like GL can't
amortize setup cost that easily.
-Daniel
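A minimal kernel-side sketch of that import path, with error unwinding
trimmed; the point is that dma_buf_map_attachment() ends up in the
exporter's DMA API mapping, which has no record of whether another driver
already flushed CPU caches for this buffer:

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>

static struct sg_table *import_into_second_gpu(struct device *dev, int fd)
{
	struct dma_buf *buf;
	struct dma_buf_attachment *attach;

	buf = dma_buf_get(fd);
	if (IS_ERR(buf))
		return ERR_CAST(buf);

	attach = dma_buf_attach(buf, dev);
	if (IS_ERR(attach))
		return ERR_CAST(attach);

	/* Typically backed by dma_map_sg() in the exporter, i.e. another
	 * round of cache maintenance even for an already-clean buffer. */
	return dma_buf_map_attachment(attach, DMA_BIDIRECTIONAL);
}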
Russell King - ARM Linux Dec. 5, 2015, 4:05 p.m. UTC | #21
On Sat, Dec 05, 2015 at 04:35:11PM +0100, Daniel Vetter wrote:
> In theory dma-buf could keep track of who's flushed a buffer already, but
> there's no implementation of that yet. And for a generic one we'd need to
> violate the current dma api abstractions. So yeah, perf is going to tank
> until that's solved, at least for some workloads. Video wasn't a problem
> here since all you do is establish a set of shared buffers once, making
> all the overhead just one-time. But dynamic workloads like GL can't
> amortize setup cost that easily.

Indeed, and as the Vivante 3D core is not as capable as the 2D core at
blitting between two pixmaps, the cost of splitting them would be high.
(The early 3D cores such as those found in Dove seem only able to blit
from the 0,0 origin.)

So, we really need to do the 3D rendering on the 3D core, and then use
the 2D core as a blitter - preferably with the 3D core rendering the
next frame in parallel with the blit.

There are more reasons to do this when you discover that the i.MX6's memory bus
arbiter doesn't work quite right: it appears that each bus master has a
certain bandwidth reserved which is never given to any other master.
That means it's best to spread the memory workload over as many of the
masters as possible.
Rob Herring Dec. 5, 2015, 11:22 p.m. UTC | #22
On Sat, Dec 5, 2015 at 5:42 AM, Russell King - ARM Linux
<linux@arm.linux.org.uk> wrote:
> On Sat, Dec 05, 2015 at 12:26:19PM +0100, Lucas Stach wrote:
>> I see where you are going here and I appreciate that this discussion
>> isn't an exercise in bikeshedding, but based on technical facts.
>
> It would be nice (for the sake of this discussion not getting heated)
> if Rob could show that he's been reading my replies to his questions.
> My impression so far is that I'm being ignored.  Evidence being Rob's
> assertion that his points have not been addressed, where I've replied
> with the technical responses - and not one of those has seen any kind
> of reply.

Sorry, I had missed your reply before I sent mine.

In any case, I'm satisfied with the reasoning and the virtual node is
fine. I do want to see a note in the binding doc about why the
compatible string is not specific.

Rob
Michel Dänzer Dec. 7, 2015, 8:41 a.m. UTC | #23
On 05.12.2015 19:12, Daniel Vetter wrote:
> On Fri, Dec 04, 2015 at 05:43:33PM -0500, Ilia Mirkin wrote:
>> On Fri, Dec 4, 2015 at 5:05 PM, Russell King - ARM Linux
>> <linux@arm.linux.org.uk> wrote:
>>> On Fri, Dec 04, 2015 at 03:42:47PM -0500, Ilia Mirkin wrote:
>>>> On Fri, Dec 4, 2015 at 3:31 PM, Russell King - ARM Linux
>>>> <linux@arm.linux.org.uk> wrote:
>>>>> Moreover, DRI3 is not yet available for Gallium, so if we're talking
>>>>> about Xorg, then functional DRI2 is a requirement, and that _needs_
>>>>> to have a single device for the rendering instances.  Xorg has no way
>>>>> to pass multiple render nodes to client over DRI2.
>>>>
>>>> Just to correct... DRI3 has been available on gallium [at least in the
>>>> context of st/mesa] basically since DRI3 was introduced. Not sure what
>>>> issue you're fighting with, but it's definitely not a gallium
>>>> limitation... could be something related to platform devices.
>>>
>>> Well, my statement is based on the fact that there's nothing in
>>> src/gallium/state-tracker/dri which hints at being DRI3.  Maybe it's
>>> implemented differently, I don't know.

The DRI state tracker code in src/gallium/state_trackers/dri/ supports
DRI1-3. There's almost no explicit mention of DRI3 in it because DRI3
and DRI2 work mostly the same as far as this code is concerned.


>>> I think it's a DRI3 limitation.  The issue with the DRI3 design is that:
>>>
>>> * The client has access to the GPU render nodes only, but not the
>>>   corresponding KMS node.
>>> * Buffers in DRI3 are allocated from the GPU render nodes.
>>> * The Xorg Present protocol is then used to manage the vblank
>>>   synchonisation and page flips.
>>>
>>> Now, the KMS scanout hardware typically does not support any kind of
>>> scatter-gather: the buffers it has must be contiguous.  These can be
>>> allocated from the KMS DRM device.
>>>
>>> However, the DRI3 client has no access to the KMS DRM device to allocate
>>> linear buffers from, and GPUs typically don't have dumb buffer support.
>>> Hence, the client can't pass a suitable buffer to the present layer.
> 
> Oh right, buffer alloc if you have special constraints won't work with
> DRI3 as-is. For that we probably need something like DRI2 for buffer alloc
> + Present (like DRI3 does) for flipping them.

FWIW, that's basically possible already. E.g. the Gallium nine state
tracker can use Present even with DRI2.
Michel Dänzer Dec. 8, 2015, 9:12 a.m. UTC | #24
On 07.12.2015 17:41, Michel Dänzer wrote:
> On 05.12.2015 19:12, Daniel Vetter wrote:
>> On Fri, Dec 04, 2015 at 05:43:33PM -0500, Ilia Mirkin wrote:
>>> On Fri, Dec 4, 2015 at 5:05 PM, Russell King - ARM Linux
>>> <linux@arm.linux.org.uk> wrote:
>>>> On Fri, Dec 04, 2015 at 03:42:47PM -0500, Ilia Mirkin wrote:
>>>>> On Fri, Dec 4, 2015 at 3:31 PM, Russell King - ARM Linux
>>>>> <linux@arm.linux.org.uk> wrote:
>>>>>> Moreover, DRI3 is not yet available for Gallium, so if we're talking
>>>>>> about Xorg, then functional DRI2 is a requirement, and that _needs_
>>>>>> to have a single device for the rendering instances.  Xorg has no way
>>>>>> to pass multiple render nodes to client over DRI2.
>>>>>
>>>>> Just to correct... DRI3 has been available on gallium [at least in the
>>>>> context of st/mesa] basically since DRI3 was introduced. Not sure what
>>>>> issue you're fighting with, but it's definitely not a gallium
>>>>> limitation... could be something related to platform devices.
>>>>
>>>> Well, my statement is based on the fact that there's nothing in
>>>> src/gallium/state-tracker/dri which hints at being DRI3.  Maybe it's
>>>> implemented differently, I don't know.
> 
> The DRI state tracker code in src/gallium/state_trackers/dri/ supports
> DRI1-3. There's almost no explicit mention of DRI3 in it because DRI3
> and DRI2 work mostly the same as far as this code is concerned.

Also note that there's no requirement for the renderer device node to be
opened via the DRI3 protocol. E.g. the DRI_PRIME implementation for DRI3
in Mesa opens the renderer device itself, creates a shared scanout image
from it, gets a dma-buf file descriptor for that, creates a pixmap from
that via DRI3 and presents that pixmap via Present.
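Roughly, that client-side flow looks like the following sketch; the GEM
handle, geometry and depth/bpp values are assumptions for illustration, and
a real client also manages serials, fences and target MSC values instead of
passing zeros:

#include <xcb/dri3.h>
#include <xcb/present.h>
#include <xf86drm.h>

static xcb_pixmap_t share_render_buffer(xcb_connection_t *conn,
                                        xcb_window_t window,
                                        int render_fd, uint32_t gem_handle,
                                        uint16_t width, uint16_t height,
                                        uint16_t stride)
{
	xcb_pixmap_t pixmap = xcb_generate_id(conn);
	int prime_fd;

	/* Export from the render-only device the client opened itself. */
	drmPrimeHandleToFD(render_fd, gem_handle, DRM_CLOEXEC, &prime_fd);

	/* DRI3: wrap the dma-buf in a pixmap the X server can use
	 * (depth 24, bpp 32 assumed here). */
	xcb_dri3_pixmap_from_buffer(conn, pixmap, window,
	                            stride * height, width, height,
	                            stride, 24, 32, prime_fd);

	/* Present: flip or copy the pixmap to the window. */
	xcb_present_pixmap(conn, window, pixmap, 0, XCB_NONE, XCB_NONE,
	                   0, 0, XCB_NONE, XCB_NONE, XCB_NONE,
	                   XCB_PRESENT_OPTION_NONE, 0, 0, 0, 0, NULL);
	return pixmap;
}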

Patch

diff --git a/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
new file mode 100644
index 000000000000..19fde29dc1d7
--- /dev/null
+++ b/Documentation/devicetree/bindings/display/etnaviv/etnaviv-drm.txt
@@ -0,0 +1,46 @@ 
+Etnaviv DRM master device
+================================
+
+The Etnaviv DRM master device is a virtual device needed to list all
+Vivante GPU cores that comprise the GPU subsystem.
+
+Required properties:
+- compatible: Should be one of
+    "fsl,imx-gpu-subsystem"
+    "marvell,dove-gpu-subsystem"
+- cores: Should contain a list of phandles pointing to Vivante GPU devices
+
+example:
+
+gpu-subsystem {
+	compatible = "fsl,imx-gpu-subsystem";
+	cores = <&gpu_2d>, <&gpu_3d>;
+};
+
+
+Vivante GPU core devices
+====================
+
+Required properties:
+- compatible: Should be "vivante,gc"
+- reg: should be register base and length as documented in the
+  datasheet
+- interrupts: Should contain the cores interrupt line
+- clocks: should contain one clock for entry in clock-names
+  see Documentation/devicetree/bindings/clock/clock-bindings.txt
+- clock-names:
+   - "bus":    AXI/register clock
+   - "core":   GPU core clock
+   - "shader": Shader clock (only required if GPU has feature PIPE_3D)
+
+example:
+
+gpu_3d: gpu@00130000 {
+	compatible = "vivante,gc";
+	reg = <0x00130000 0x4000>;
+	interrupts = <0 9 IRQ_TYPE_LEVEL_HIGH>;
+	clocks = <&clks IMX6QDL_CLK_GPU3D_AXI>,
+	         <&clks IMX6QDL_CLK_GPU3D_CORE>,
+	         <&clks IMX6QDL_CLK_GPU3D_SHADER>;
+	clock-names = "bus", "core", "shader";
+};