mbox series

[v3,0/9] Add HANTRO G2/HEVC decoder support for IMX8MQ

Message ID 20210222122406.41782-1-benjamin.gaignard@collabora.com
Headers show
Series Add HANTRO G2/HEVC decoder support for IMX8MQ | expand

Message

Benjamin Gaignard Feb. 22, 2021, 12:23 p.m. UTC
The IMX8MQ got two VPUs but until now only G1 has been enabled.
This series aim to add the second VPU (aka G2) and provide basic 
HEVC decoding support.

To be able to decode HEVC it is needed to add/update some of the
structures in the uapi. In addition of them one HANTRO dedicated
control is required to inform the driver of the numbre of bits to skip
at the beginning of the slice header.
The hardware require to allocate few auxiliary buffers to store the
references frame or tile size data.

The driver has been tested with fluster test suite stream.
For example with this command: ./fluster.py run -ts JCT-VC-HEVC_V1 -d GStreamer-H.265-V4L2SL-Gst1.0
 
This series depends of the reset rework posted here: https://www.spinics.net/lists/arm-kernel/msg875766.html

Finally the both VPUs will have a node the device-tree and be
independent from v4l2 point of view.

A branch with all the dev is available here:
https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/upstream_g2_v2

version 3:
- Fix typo in Hantro v4l2 dedicated control
- Add documentation for the new structures and fields
- Rebased on top of media_tree for-linus-5.12-rc1 tag

version 2:
- remove all change related to scaling
- squash commits to a coherent split
- be more verbose about the added fields
- fix the comments done by Ezequiel about dma_alloc_coherent usage
- fix Dan's comments about control copy, reverse the test logic
in tile_buffer_reallocate, rework some goto and return cases.
- be more verbose about why I change the bindings
- remove all sign-off expect mime since it is confusing
- remove useless clocks in VPUs nodes

Benjamin

Benjamin Gaignard (9):
  media: hevc: Modify structures to follow H265 ITU spec
  media: hantro: Define HEVC codec profiles and supported features
  media: hantro: Add a field to distinguish the hardware versions
  media: uapi: Add a control for HANTRO driver
  media: hantro: Introduce G2/HEVC decoder
  media: hantro: handle V4L2_PIX_FMT_HEVC_SLICE control
  media: hantro: IMX8M: add variant for G2/HEVC codec
  dt-bindings: media: nxp,imx8mq-vpu: Update bindings
  arm64: dts: imx8mq: Add node to G2 hardware

 .../bindings/media/nxp,imx8mq-vpu.yaml        |  54 +-
 .../media/v4l/ext-ctrls-codec.rst             | 126 +++-
 .../media/v4l/vidioc-queryctrl.rst            |   6 +
 arch/arm64/boot/dts/freescale/imx8mq.dtsi     |  41 +-
 drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +-
 drivers/staging/media/hantro/Makefile         |   2 +
 drivers/staging/media/hantro/hantro.h         |  34 +-
 drivers/staging/media/hantro/hantro_drv.c     | 103 +++
 .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
 drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
 drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
 drivers/staging/media/hantro/hantro_hw.h      |  48 ++
 .../staging/media/hantro/hantro_postproc.c    |  17 +
 drivers/staging/media/hantro/hantro_v4l2.c    |   1 +
 drivers/staging/media/hantro/imx8m_vpu_hw.c   |  95 ++-
 drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
 drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
 .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
 .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
 include/media/hevc-ctrls.h                    |  45 +-
 include/uapi/linux/hantro-v4l2-controls.h     |  20 +
 include/uapi/linux/v4l2-controls.h            |   5 +
 22 files changed, 1674 insertions(+), 70 deletions(-)
 create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
 create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
 create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
 create mode 100644 include/uapi/linux/hantro-v4l2-controls.h

Comments

Ezequiel Garcia Feb. 24, 2021, 8:31 p.m. UTC | #1
Hi Benjamin,

On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> The IMX8MQ got two VPUs but until now only G1 has been enabled.
> This series aim to add the second VPU (aka G2) and provide basic 
> HEVC decoding support.
> 
> To be able to decode HEVC it is needed to add/update some of the
> structures in the uapi. In addition of them one HANTRO dedicated
> control is required to inform the driver of the numbre of bits to skip
> at the beginning of the slice header.
> The hardware require to allocate few auxiliary buffers to store the
> references frame or tile size data.
> 
> The driver has been tested with fluster test suite stream.
> For example with this command: ./fluster.py run -ts JCT-VC-HEVC_V1 -d GStreamer-H.265-V4L2SL-Gst1.0
>  
> This series depends of the reset rework posted here: https://www.spinics.net/lists/arm-kernel/msg875766.html
> 
> Finally the both VPUs will have a node the device-tree and be
> independent from v4l2 point of view.
> 
> A branch with all the dev is available here:
> https://gitlab.collabora.com/benjamin.gaignard/for-upstream/-/commits/upstream_g2_v2
> 
> version 3:
> - Fix typo in Hantro v4l2 dedicated control
> - Add documentation for the new structures and fields
> - Rebased on top of media_tree for-linus-5.12-rc1 tag
> 
> version 2:
> - remove all change related to scaling
> - squash commits to a coherent split
> - be more verbose about the added fields
> - fix the comments done by Ezequiel about dma_alloc_coherent usage
> - fix Dan's comments about control copy, reverse the test logic
> in tile_buffer_reallocate, rework some goto and return cases.
> - be more verbose about why I change the bindings
> - remove all sign-off expect mime since it is confusing

I don't think it's about having less confusing commits, but
trying to describe fairly accurately the origin of the work
and the authorship of changes.

In this series I believe we have three cases, please
correct me if I'm wrong:

* Changes that you have authored, for instance it would
seem to me that "dt-bindings: media: nxp,imx8mq-vpu: Update bindings",
and "media: uapi: Add a control for HANTRO driver" would be in
that category.

* Changes that Adrian and/or me have initially authored,
where both Adrian and me did further work, and also
where you did significant work on as well.

IOW, changes where the three of us did significant work.

I guess in this case, the original author could be the
author of the patch (git's Author tag), and additionally
authorship would be indicated with Co-developed-by tags.

commit 494adacd844e5656b570895a82bc343438b23023
Author: Adrian Ratiu <adrian.ratiu@collabora.com>
Date:   Thu Feb 18 12:17:37 2021 +0100

    media: hantro: Introduce G2/HEVC decoder

    ...

    Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
    Co-developed-by: Ezequiel Garcia <ezequiel@collabora.com>
    Signed-off-by: Ezequiel Garcia <ezequiel@collabora.com>
    Co-developed-by: Benjamin Gaignard <benjamin.gaignard@collabora.com> 
    Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>

* And finally changes with an original author (Adrian or me),
which you haven't changed, but you are just submitting
(or you did minor work).

Maybe with the current way of splitting commits we don't have
these type of patches, but just for the sake of it, let's say
there's a commit authored by Adrian that you are mostly picking up,
it would be:

commit 896776d3bcd032808e4d5772e6749da5dd4eec42
Author: Adrian Ratiu <adrian.ratiu@collabora.com>
Date:   Fri Feb 5 12:14:16 2021 +0100

    media: hantro: Some change
    
    ...

    Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com>
    Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>


Thanks a lot!
Ezequiel

> - remove useless clocks in VPUs nodes
> 
> Benjamin
> 
> Benjamin Gaignard (9):
>   media: hevc: Modify structures to follow H265 ITU spec
>   media: hantro: Define HEVC codec profiles and supported features
>   media: hantro: Add a field to distinguish the hardware versions
>   media: uapi: Add a control for HANTRO driver
>   media: hantro: Introduce G2/HEVC decoder
>   media: hantro: handle V4L2_PIX_FMT_HEVC_SLICE control
>   media: hantro: IMX8M: add variant for G2/HEVC codec
>   dt-bindings: media: nxp,imx8mq-vpu: Update bindings
>   arm64: dts: imx8mq: Add node to G2 hardware
> 
>  .../bindings/media/nxp,imx8mq-vpu.yaml        |  54 +-
>  .../media/v4l/ext-ctrls-codec.rst             | 126 +++-
>  .../media/v4l/vidioc-queryctrl.rst            |   6 +
>  arch/arm64/boot/dts/freescale/imx8mq.dtsi     |  41 +-
>  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +-
>  drivers/staging/media/hantro/Makefile         |   2 +
>  drivers/staging/media/hantro/hantro.h         |  34 +-
>  drivers/staging/media/hantro/hantro_drv.c     | 103 +++
>  .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>  drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |  48 ++
>  .../staging/media/hantro/hantro_postproc.c    |  17 +
>  drivers/staging/media/hantro/hantro_v4l2.c    |   1 +
>  drivers/staging/media/hantro/imx8m_vpu_hw.c   |  95 ++-
>  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
>  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
>  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
>  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
>  include/media/hevc-ctrls.h                    |  45 +-
>  include/uapi/linux/hantro-v4l2-controls.h     |  20 +
>  include/uapi/linux/v4l2-controls.h            |   5 +
>  22 files changed, 1674 insertions(+), 70 deletions(-)
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>  create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
>  create mode 100644 include/uapi/linux/hantro-v4l2-controls.h
>
Ezequiel Garcia Feb. 24, 2021, 8:39 p.m. UTC | #2
On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> Define which HEVC profiles (up to level 5.1) and features
> (no scaling, no 10 bits) are supported by the driver.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
>  drivers/staging/media/hantro/hantro.h     |  2 +
>  drivers/staging/media/hantro/hantro_drv.c | 58 +++++++++++++++++++++++
>  2 files changed, 60 insertions(+)
> 
> diff --git a/drivers/staging/media/hantro/hantro.h b/drivers/staging/media/hantro/hantro.h
> index 65f9f7ea7dcf..bde65231f22f 100644
> --- a/drivers/staging/media/hantro/hantro.h
> +++ b/drivers/staging/media/hantro/hantro.h
> @@ -99,6 +99,7 @@ struct hantro_variant {
>   * @HANTRO_MODE_H264_DEC: H264 decoder.
>   * @HANTRO_MODE_MPEG2_DEC: MPEG-2 decoder.
>   * @HANTRO_MODE_VP8_DEC: VP8 decoder.
> + * @HANTRO_MODE_HEVC_DEC: HEVC decoder.
>   */
>  enum hantro_codec_mode {
>         HANTRO_MODE_NONE = -1,
> @@ -106,6 +107,7 @@ enum hantro_codec_mode {
>         HANTRO_MODE_H264_DEC,
>         HANTRO_MODE_MPEG2_DEC,
>         HANTRO_MODE_VP8_DEC,
> +       HANTRO_MODE_HEVC_DEC,
>  };
>  
>  /*
> diff --git a/drivers/staging/media/hantro/hantro_drv.c b/drivers/staging/media/hantro/hantro_drv.c
> index e5f200e64993..d86e322a5980 100644
> --- a/drivers/staging/media/hantro/hantro_drv.c
> +++ b/drivers/staging/media/hantro/hantro_drv.c
> @@ -243,6 +243,18 @@ static int hantro_try_ctrl(struct v4l2_ctrl *ctrl)
>                 if (sps->bit_depth_luma_minus8 != 0)
>                         /* Only 8-bit is supported */
>                         return -EINVAL;
> +       } else if (ctrl->id == V4L2_CID_MPEG_VIDEO_HEVC_SPS) {
> +               const struct v4l2_ctrl_hevc_sps *sps = ctrl->p_new.p_hevc_sps;
> +
> +               if (sps->bit_depth_luma_minus8 != sps->bit_depth_chroma_minus8)
> +                       /* Luma and chroma bit depth mismatch */
> +                       return -EINVAL;
> +               if (sps->bit_depth_luma_minus8 != 0)
> +                       /* Only 8-bit is supported */
> +                       return -EINVAL;
> +               if (sps->flags & V4L2_HEVC_SPS_FLAG_SCALING_LIST_ENABLED)
> +                       /* No scaling support */
> +                       return -EINVAL;
>         }
>         return 0;
>  }
> @@ -349,6 +361,52 @@ static const struct hantro_ctrl controls[] = {
>                         .def = V4L2_MPEG_VIDEO_H264_PROFILE_MAIN,
>                 }
>         }, {
> +               .codec = HANTRO_HEVC_DECODER,

Silly nitpick. Looks like this is not defined yet?

I'm getting:

drivers/staging/media/hantro/hantro_drv.c:364:12: error: ‘HANTRO_HEVC_DECODER’ undeclared here (not in a function); did you mean
‘HANTRO_H264_DECODER’?
  364 |   .codec = HANTRO_HEVC_DECODER,
      |            ^~~~~~~~~~~~~~~~~~~
      |            HANTRO_H264_DECODER

I'll review the G2 driver soon :-)

Thanks,
Ezequiel

> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_MODE,
> +                       .min = V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_FRAME_BASED,
> +                       .max = V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_FRAME_BASED,
> +                       .def = V4L2_MPEG_VIDEO_HEVC_DECODE_MODE_FRAME_BASED,
> +               },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_START_CODE,
> +                       .min = V4L2_MPEG_VIDEO_HEVC_START_CODE_ANNEX_B,
> +                       .max = V4L2_MPEG_VIDEO_HEVC_START_CODE_ANNEX_B,
> +                       .def = V4L2_MPEG_VIDEO_HEVC_START_CODE_ANNEX_B,
> +               },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_PROFILE,
> +                       .min = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN,
> +                       .max = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN_10,
> +                       .def = V4L2_MPEG_VIDEO_HEVC_PROFILE_MAIN,
> +               },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_LEVEL,
> +                       .min = V4L2_MPEG_VIDEO_HEVC_LEVEL_1,
> +                       .max = V4L2_MPEG_VIDEO_HEVC_LEVEL_5_1,
> +               },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_SPS,
> +                       .ops = &hantro_ctrl_ops,
> +               },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_PPS,
> +               },
> +       }, {
> +               .codec = HANTRO_HEVC_DECODER,
> +               .cfg = {
> +                       .id = V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS,
> +               },
>         },
>  };
>
Ezequiel Garcia Feb. 25, 2021, 1:09 p.m. UTC | #3
Hi Benjamin,

Thanks for the good work.

On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> The H.265 ITU specification (section 7.4) define the general
> slice segment header semantics.
> Modified/added fields are:
> - video_parameter_set_id: (7.4.3.1) identifies the VPS for
> reference by other syntax elements.
> - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
> the vps_video_parameter_set_id of the active VPS.
> - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
>  relative to the luma sampling
> - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
> reference by other syntax elements
> - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
> the inferred value of num_ref_idx_l0_active_minus1
> - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
> the inferred value of num_ref_idx_l1_active_minus1
> - slice_segment_addr: (7.4.7.1) specifies the address of
> the first coding tree block in the slice segment
> - num_entry_point_offsets: (7.4.7.1) specifies the number of
> entry_point_offset_minus1[ i ] syntax elements in the slice header
> 
> Add HEVC decode params contains the information used in section
> "8.3 Slice decoding process" of the specification to let the hardware
> perform decoding of a slices.
> 
> Adapt Cedrus driver according to these changes.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 3:
> - Add documentation about the new structuers and fields.
> 
> version 2:
> - remove all change related to scaling
> - squash commits to a coherent split
> - be more verbose about the added fields
> 
>  .../media/v4l/ext-ctrls-codec.rst             | 126 +++++++++++++++---
>  .../media/v4l/vidioc-queryctrl.rst            |   6 +
>  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
>  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
>  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
>  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
>  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
>  include/media/hevc-ctrls.h                    |  45 +++++--
>  8 files changed, 186 insertions(+), 32 deletions(-)
> 
> diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> index 00944e97d638..5e6d77e858c0 100644
> --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> @@ -3109,6 +3109,15 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      :stub-columns: 0
>      :widths:       1 1 2
>  
> +    * - __u8
> +      - ``video_parameter_set_id``
> +      - Identifies the VPS for reference by other syntax elements
> +    * - __u8
> +      - ``seq_parameter_set_id̀``
> +      - Specifies the value of the vps_video_parameter_set_id of the active VPS
> +    * - __u8
> +      - ``chroma_format_idc``
> +      - Specifies the chroma sampling relative to the luma sampling

None of these fields seem needed for the Hantro G2 driver,
so I suggest you drop them for now.

>      * - __u16
>        - ``pic_width_in_luma_samples``
>        -
> @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      * - __u8
>        - ``chroma_format_idc``
>        -
> +    * - __u8
> +      - ``num_slices``
> +

Not used, but also doesn't seem part of the SPS syntax. If we have to
pass the number of slices, we'll need another mechanism.

>       -
>      * - __u64
>        - ``flags``
>        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
> @@ -3231,9 +3243,18 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      :stub-columns: 0
>      :widths:       1 1 2
>  
> +    * - __u8
> +      - ``pic_parameter_set_id``
> +      - Identifies the PPS for reference by other syntax elements

Not used.

>      * - __u8
>        - ``num_extra_slice_header_bits``
>        -
> +    * - __u8
> +      - ``num_ref_idx_l0_default_active_minus1``
> +      - Specifies the inferred value of num_ref_idx_l0_active_minus1
> +    * - __u8
> +      - ``num_ref_idx_l1_default_active_minus1``
> +      - Specifies the inferred value of num_ref_idx_l1_active_minus1
>      * - __s8
>        - ``init_qp_minus26``
>        -
> @@ -3342,6 +3363,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
>        - 0x00040000
>        -
> +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
> +      - 0x00080000
> +      -
> +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
> +      - 0x00100000
> +      -
>  

I suggest to do all the PPS control changes in a separate patch,
feels easier to review and cleaner as you can explain the
changes with more detail in the commit description.

Looking at the PPS syntax for tiles, I'm wondering if these
deserve their own control, which would be used if tiles are enabled,
i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.

        __u8    num_tile_columns_minus1;                                         
        __u8    num_tile_rows_minus1;                                            
        __u8    column_width_minus1[20];                                         
        __u8    row_height_minus1[22];    

Not something we necessarily have to tackle now.

>  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
>      Specifies various slice-specific parameters, especially from the NAL unit
> @@ -3366,6 +3393,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      * - __u32
>        - ``data_bit_offset``
>        - Offset (in bits) to the video data in the current slice data.
> +    * - __u32
> +      - ``slice_segment_addr``
> +      - Specifies the address of the first coding tree block in the slice segment

Not used.

> +    * - __u32
> +      - ``num_entry_point_offsets``
> +      - Specifies the number of entry_point_offset_minus1[ i ] syntax elements in the slice header

Not used.

>      * - __u8
>        - ``nal_unit_type``
>        -
> @@ -3422,28 +3455,20 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      * - __u8
>        - ``pic_struct``
>        -
> -    * - __u8
> -      - ``num_active_dpb_entries``
> -      - The number of entries in ``dpb``.

Need to explain in the commit description why this field is moved.

>      * - __u8
>        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>        - The list of L0 reference elements as indices in the DPB.
>      * - __u8
>        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>        - The list of L1 reference elements as indices in the DPB.
> +    * - __u16
> +      - ``short_term_ref_pic_set_size``
> +

Not used.

>       -
> +    * - __u16
> +      - ``long_term_ref_pic_set_size``
> +      -

Not used.

>      * - __u8
> -      - ``num_rps_poc_st_curr_before``
> -      - The number of reference pictures in the short-term set that come before
> -        the current frame.

If this matches NumPocStCurrBefore from section 8.3.2 "Decoding process for reference picture set"
then I would document that. And perhaps rename it to num_poc_st_curr_before.

> -    * - __u8
> -      - ``num_rps_poc_st_curr_after``
> -      - The number of reference pictures in the short-term set that come after
> -        the current frame.

Ditto.

> -    * - __u8
> -      - ``num_rps_poc_lt_curr``
> -      - The number of reference pictures in the long-term set.

Ditto.

Also, I'd like the changes that move fields from V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in their
patch.

That will allow us to put in the commit description a proper
explanation of why are fields being moved. Nothing fancy, simply
explaining that these variables come from section 8.3.2
"Decoding process for reference picture set", which describes
a process invoked once per picture, so they are not per-slice.

> -    * - __u8
> -      - ``padding[7]``
> +      - ``padding``
>        - Applications and drivers must set this to zero.
>      * - struct :c:type:`v4l2_hevc_dpb_entry`
>        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> @@ -3646,3 +3671,74 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>      so this has to come from client.
>      This is applicable to H264 and valid Range is from 0 to 63.
>      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
> +
> +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
> +    Specifies various decode parameters, especially the references picture order
> +    count (POC) for all the lists (short, long, before, current, after) and the
> +    number of entries for each of them.
> +    These parameters are defined according to :ref:`hevc`.
> +    They are described in section 8.3 "Slice decoding process" of the
> +    specification.
> +
> +.. c:type:: v4l2_ctrl_hevc_decode_params
> +
> +.. cssclass:: longtable
> +
> +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
> +    :header-rows:  0
> +    :stub-columns: 0
> +    :widths:       1 1 2
> +
> +    * - __s32
> +      - ``pic_order_cnt_val``
> +      -

Can be documented as:

"""
PicOrderCntVal as described in section 8.3.1 "Decoding process
for picture order count" of the specification.
"""

Note that snake case is used to match the kernel style,
but other than that we try to keep the HEVC spec variable
names.

> +    * - __u8
> +      - ``num_active_dpb_entries``
> +      - The number of entries in ``dpb``.
> +    * - struct :c:type:`v4l2_hevc_dpb_entry`
> +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> +      - The decoded picture buffer, for meta-data about reference frames.

The DPB is here, but it seems it's also in the slice control?

> +    * - __u8
> +      - ``num_rps_poc_st_curr_before``
> +      - The number of reference pictures in the short-term set that come before
> +        the current frame.
> +    * - __u8
> +      - ``num_rps_poc_st_curr_after``
> +      - The number of reference pictures in the short-term set that come after
> +        the current frame.
> +    * - __u8
> +      - ``num_rps_poc_lt_curr``
> +      - The number of reference pictures in the long-term set.
> +    * - __u8
> +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> +      -
> +    * - __u8
> +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> +      -
> +    * - __u8
> +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> +      -

Could you document these as well?

Thanks a lot,
Ezequiel
Ezequiel Garcia Feb. 25, 2021, 2:05 p.m. UTC | #4
Hi Benjamin,

On Mon, 2021-02-22 at 13:24 +0100, Benjamin Gaignard wrote:
> The HEVC HANTRO driver needs to know the number of bits to skip at

s/HANTRO/Hantro

> the beginning of the slice header.

As discussed in a different thread, we should describe exactly
what the hardware is expecting, so applications can parse that
and pass a correct value.

> That is a hardware specific requirement so create a dedicated control
> that this purpose.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 3:
> - Fix typo in field name
> 
>  include/uapi/linux/hantro-v4l2-controls.h | 20 ++++++++++++++++++++
>  include/uapi/linux/v4l2-controls.h        |  5 +++++
>  2 files changed, 25 insertions(+)
>  create mode 100644 include/uapi/linux/hantro-v4l2-controls.h
> 
> diff --git a/include/uapi/linux/hantro-v4l2-controls.h b/include/uapi/linux/hantro-v4l2-controls.h
> new file mode 100644
> index 000000000000..a8dfd6b1a2a9
> --- /dev/null
> +++ b/include/uapi/linux/hantro-v4l2-controls.h
> @@ -0,0 +1,20 @@
> +/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
> +
> +#ifndef __UAPI_HANTRO_V4L2_CONYTROLS_H__
> +#define __UAPI_HANTRO_V4L2_CONYTROLS_H__
> +
> +#include <linux/v4l2-controls.h>
> +#include <media/hevc-ctrls.h>
> +
> +#define V4L2_CID_HANTRO_HEVC_EXTRA_DECODE_PARAMS       (V4L2_CID_USER_HANTRO_BASE + 0)
> +
> +/**
> + * struct hantro_hevc_extra_decode_params - extra decode parameters for hantro driver
> + * @hevc_hdr_skip_length:      header first bits offset
> + */
> +struct hantro_hevc_extra_decode_params {
> +       __u32   hevc_hdr_skip_length;
> +       __u8    padding[4];
> +};
> +

I think we can get away with a simpler solution. Since it's just one integer
we need, there's no need for a compound control. Something like this:

                .codec = HANTRO_HEVC_DECODER,                                    
                .cfg = {                                                         
                        .id = V4L2_CID_HANTRO_HEVC_SLICE_HEADER_SKIP,            
                        .name = "Hantro HEVC slice header skip bytes",           
                        .type = V4L2_CTRL_TYPE_INTEGER,                          
                        .min = 0,                                                
                        .max = 0x7fffffff,                                       
                        .step = 1,                                               
                },     

Also see V4L2_CID_CODA_MB_ERR_CNT which is defined in drivers/media/platform/coda/coda.h.
The control is sufficiently special that it could be kept in an internal driver header.

> +#endif
> diff --git a/include/uapi/linux/v4l2-controls.h b/include/uapi/linux/v4l2-controls.h
> index 039c0d7add1b..ced7486c7f46 100644
> --- a/include/uapi/linux/v4l2-controls.h
> +++ b/include/uapi/linux/v4l2-controls.h
> @@ -209,6 +209,11 @@ enum v4l2_colorfx {
>   * We reserve 128 controls for this driver.
>   */
>  #define V4L2_CID_USER_CCS_BASE                 (V4L2_CID_USER_BASE + 0x10f0)
> +/*
> + * The base for HANTRO driver controls.
> + * We reserve 32 controls for this driver.
> + */
> +#define V4L2_CID_USER_HANTRO_BASE              (V4L2_CID_USER_BASE + 0x1170)
>  
>  /* MPEG-class control IDs */
>  /* The MPEG controls are applicable to all codec controls

Thanks,
Ezequiel
Jernej Škrabec Feb. 25, 2021, 5:01 p.m. UTC | #5
Hi Ezequiel,

Dne četrtek, 25. februar 2021 ob 14:09:52 CET je Ezequiel Garcia napisal(a):
> Hi Benjamin,
> 
> Thanks for the good work.
> 
> On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> > The H.265 ITU specification (section 7.4) define the general
> > slice segment header semantics.
> > Modified/added fields are:
> > - video_parameter_set_id: (7.4.3.1) identifies the VPS for
> > reference by other syntax elements.
> > - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
> > the vps_video_parameter_set_id of the active VPS.
> > - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
> >  relative to the luma sampling
> > - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
> > reference by other syntax elements
> > - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
> > the inferred value of num_ref_idx_l0_active_minus1
> > - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
> > the inferred value of num_ref_idx_l1_active_minus1
> > - slice_segment_addr: (7.4.7.1) specifies the address of
> > the first coding tree block in the slice segment
> > - num_entry_point_offsets: (7.4.7.1) specifies the number of
> > entry_point_offset_minus1[ i ] syntax elements in the slice header
> > 
> > Add HEVC decode params contains the information used in section
> > "8.3 Slice decoding process" of the specification to let the hardware
> > perform decoding of a slices.
> > 
> > Adapt Cedrus driver according to these changes.
> > 
> > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > ---
> > version 3:
> > - Add documentation about the new structuers and fields.
> > 
> > version 2:
> > - remove all change related to scaling
> > - squash commits to a coherent split
> > - be more verbose about the added fields
> > 
> >  .../media/v4l/ext-ctrls-codec.rst             | 126 +++++++++++++++---
> >  .../media/v4l/vidioc-queryctrl.rst            |   6 +
> >  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
> >  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
> >  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
> >  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
> >  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
> >  include/media/hevc-ctrls.h                    |  45 +++++--
> >  8 files changed, 186 insertions(+), 32 deletions(-)
> > 
> > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/
Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > index 00944e97d638..5e6d77e858c0 100644
> > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > @@ -3109,6 +3109,15 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      :stub-columns: 0
> >      :widths:       1 1 2
> >  
> > +    * - __u8
> > +      - ``video_parameter_set_id``
> > +      - Identifies the VPS for reference by other syntax elements
> > +    * - __u8
> > +      - ``seq_parameter_set_id̀``
> > +      - Specifies the value of the vps_video_parameter_set_id of the 
active VPS
> > +    * - __u8
> > +      - ``chroma_format_idc``
> > +      - Specifies the chroma sampling relative to the luma sampling
> 
> None of these fields seem needed for the Hantro G2 driver,
> so I suggest you drop them for now.
> 
> >      * - __u16
> >        - ``pic_width_in_luma_samples``
> >        -
> > @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - __u8
> >        - ``chroma_format_idc``
> >        -
> > +    * - __u8
> > +      - ``num_slices``
> > +
> 
> Not used, but also doesn't seem part of the SPS syntax. If we have to
> pass the number of slices, we'll need another mechanism.
> 
> >       -
> >      * - __u64
> >        - ``flags``
> >        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
> > @@ -3231,9 +3243,18 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      :stub-columns: 0
> >      :widths:       1 1 2
> >  
> > +    * - __u8
> > +      - ``pic_parameter_set_id``
> > +      - Identifies the PPS for reference by other syntax elements
> 
> Not used.
> 
> >      * - __u8
> >        - ``num_extra_slice_header_bits``
> >        -
> > +    * - __u8
> > +      - ``num_ref_idx_l0_default_active_minus1``
> > +      - Specifies the inferred value of num_ref_idx_l0_active_minus1
> > +    * - __u8
> > +      - ``num_ref_idx_l1_default_active_minus1``
> > +      - Specifies the inferred value of num_ref_idx_l1_active_minus1
> >      * - __s8
> >        - ``init_qp_minus26``
> >        -
> > @@ -3342,6 +3363,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
> >        - 0x00040000
> >        -
> > +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
> > +      - 0x00080000
> > +      -
> > +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
> > +      - 0x00100000
> > +      -
> >  
> 
> I suggest to do all the PPS control changes in a separate patch,
> feels easier to review and cleaner as you can explain the
> changes with more detail in the commit description.
> 
> Looking at the PPS syntax for tiles, I'm wondering if these
> deserve their own control, which would be used if tiles are enabled,
> i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.
> 
>         __u8    num_tile_columns_minus1;                                         
>         __u8    num_tile_rows_minus1;                                            
>         __u8    column_width_minus1[20];                                         
>         __u8    row_height_minus1[22];    
> 
> Not something we necessarily have to tackle now.
> 
> >  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
> >      Specifies various slice-specific parameters, especially from the NAL 
unit
> > @@ -3366,6 +3393,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - __u32
> >        - ``data_bit_offset``
> >        - Offset (in bits) to the video data in the current slice data.
> > +    * - __u32
> > +      - ``slice_segment_addr``
> > +      - Specifies the address of the first coding tree block in the slice 
segment
> 
> Not used.
> 
> > +    * - __u32
> > +      - ``num_entry_point_offsets``
> > +      - Specifies the number of entry_point_offset_minus1[ i ] syntax 
elements in the slice header
> 
> Not used.

While above two fields may not be used in Hantro, they are for sure useful for 
Cedrus and RPi4. I would like to keep them, otherwise with such approach HEVC 
will stay in staging for a long time. I'm still baffled why scaling matrix 
control was dropped. It would fit well in Cedrus and RPi4 driver and after a 
quick look, it seems that it was used in driver in later patch.

Best regards,
Jernej

> 
> >      * - __u8
> >        - ``nal_unit_type``
> >        -
> > @@ -3422,28 +3455,20 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - __u8
> >        - ``pic_struct``
> >        -
> > -    * - __u8
> > -      - ``num_active_dpb_entries``
> > -      - The number of entries in ``dpb``.
> 
> Need to explain in the commit description why this field is moved.
> 
> >      * - __u8
> >        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> >        - The list of L0 reference elements as indices in the DPB.
> >      * - __u8
> >        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> >        - The list of L1 reference elements as indices in the DPB.
> > +    * - __u16
> > +      - ``short_term_ref_pic_set_size``
> > +
> 
> Not used.
> 
> >       -
> > +    * - __u16
> > +      - ``long_term_ref_pic_set_size``
> > +      -
> 
> Not used.
> 
> >      * - __u8
> > -      - ``num_rps_poc_st_curr_before``
> > -      - The number of reference pictures in the short-term set that come 
before
> > -        the current frame.
> 
> If this matches NumPocStCurrBefore from section 8.3.2 "Decoding process for 
reference picture set"
> then I would document that. And perhaps rename it to num_poc_st_curr_before.
> 
> > -    * - __u8
> > -      - ``num_rps_poc_st_curr_after``
> > -      - The number of reference pictures in the short-term set that come 
after
> > -        the current frame.
> 
> Ditto.
> 
> > -    * - __u8
> > -      - ``num_rps_poc_lt_curr``
> > -      - The number of reference pictures in the long-term set.
> 
> Ditto.
> 
> Also, I'd like the changes that move fields from 
V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
> to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in their
> patch.
> 
> That will allow us to put in the commit description a proper
> explanation of why are fields being moved. Nothing fancy, simply
> explaining that these variables come from section 8.3.2
> "Decoding process for reference picture set", which describes
> a process invoked once per picture, so they are not per-slice.
> 
> > -    * - __u8
> > -      - ``padding[7]``
> > +      - ``padding``
> >        - Applications and drivers must set this to zero.
> >      * - struct :c:type:`v4l2_hevc_dpb_entry`
> >        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > @@ -3646,3 +3671,74 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      so this has to come from client.
> >      This is applicable to H264 and valid Range is from 0 to 63.
> >      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
> > +
> > +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
> > +    Specifies various decode parameters, especially the references picture 
order
> > +    count (POC) for all the lists (short, long, before, current, after) 
and the
> > +    number of entries for each of them.
> > +    These parameters are defined according to :ref:`hevc`.
> > +    They are described in section 8.3 "Slice decoding process" of the
> > +    specification.
> > +
> > +.. c:type:: v4l2_ctrl_hevc_decode_params
> > +
> > +.. cssclass:: longtable
> > +
> > +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
> > +    :header-rows:  0
> > +    :stub-columns: 0
> > +    :widths:       1 1 2
> > +
> > +    * - __s32
> > +      - ``pic_order_cnt_val``
> > +      -
> 
> Can be documented as:
> 
> """
> PicOrderCntVal as described in section 8.3.1 "Decoding process
> for picture order count" of the specification.
> """
> 
> Note that snake case is used to match the kernel style,
> but other than that we try to keep the HEVC spec variable
> names.
> 
> > +    * - __u8
> > +      - ``num_active_dpb_entries``
> > +      - The number of entries in ``dpb``.
> > +    * - struct :c:type:`v4l2_hevc_dpb_entry`
> > +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      - The decoded picture buffer, for meta-data about reference frames.
> 
> The DPB is here, but it seems it's also in the slice control?
> 
> > +    * - __u8
> > +      - ``num_rps_poc_st_curr_before``
> > +      - The number of reference pictures in the short-term set that come 
before
> > +        the current frame.
> > +    * - __u8
> > +      - ``num_rps_poc_st_curr_after``
> > +      - The number of reference pictures in the short-term set that come 
after
> > +        the current frame.
> > +    * - __u8
> > +      - ``num_rps_poc_lt_curr``
> > +      - The number of reference pictures in the long-term set.
> > +    * - __u8
> > +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      -
> > +    * - __u8
> > +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      -
> > +    * - __u8
> > +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      -
> 
> Could you document these as well?
> 
> Thanks a lot,
> Ezequiel
> 
>
Ezequiel Garcia Feb. 25, 2021, 5:34 p.m. UTC | #6
Hey Jernej,

On Thu, 2021-02-25 at 18:01 +0100, Jernej Škrabec wrote:
> Hi Ezequiel,
> 
> Dne četrtek, 25. februar 2021 ob 14:09:52 CET je Ezequiel Garcia napisal(a):
> > Hi Benjamin,
> > 
> > Thanks for the good work.
> > 
> > On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> > > The H.265 ITU specification (section 7.4) define the general
> > > slice segment header semantics.
> > > Modified/added fields are:
> > > - video_parameter_set_id: (7.4.3.1) identifies the VPS for
> > > reference by other syntax elements.
> > > - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
> > > the vps_video_parameter_set_id of the active VPS.
> > > - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
> > >  relative to the luma sampling
> > > - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
> > > reference by other syntax elements
> > > - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
> > > the inferred value of num_ref_idx_l0_active_minus1
> > > - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
> > > the inferred value of num_ref_idx_l1_active_minus1
> > > - slice_segment_addr: (7.4.7.1) specifies the address of
> > > the first coding tree block in the slice segment
> > > - num_entry_point_offsets: (7.4.7.1) specifies the number of
> > > entry_point_offset_minus1[ i ] syntax elements in the slice header
> > > 
> > > Add HEVC decode params contains the information used in section
> > > "8.3 Slice decoding process" of the specification to let the hardware
> > > perform decoding of a slices.
> > > 
> > > Adapt Cedrus driver according to these changes.
> > > 
> > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > ---
> > > version 3:
> > > - Add documentation about the new structuers and fields.
> > > 
> > > version 2:
> > > - remove all change related to scaling
> > > - squash commits to a coherent split
> > > - be more verbose about the added fields
> > > 
> > >  .../media/v4l/ext-ctrls-codec.rst             | 126 +++++++++++++++---
> > >  .../media/v4l/vidioc-queryctrl.rst            |   6 +
> > >  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
> > >  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
> > >  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
> > >  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
> > >  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
> > >  include/media/hevc-ctrls.h                    |  45 +++++--
> > >  8 files changed, 186 insertions(+), 32 deletions(-)
> > > 
> > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst b/
> Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > index 00944e97d638..5e6d77e858c0 100644
> > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > @@ -3109,6 +3109,15 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      :stub-columns: 0
> > >      :widths:       1 1 2
> > >  
> > > +    * - __u8
> > > +      - ``video_parameter_set_id``
> > > +      - Identifies the VPS for reference by other syntax elements
> > > +    * - __u8
> > > +      - ``seq_parameter_set_id̀``
> > > +      - Specifies the value of the vps_video_parameter_set_id of the 
> active VPS
> > > +    * - __u8
> > > +      - ``chroma_format_idc``
> > > +      - Specifies the chroma sampling relative to the luma sampling
> > 
> > None of these fields seem needed for the Hantro G2 driver,
> > so I suggest you drop them for now.
> > 
> > >      * - __u16
> > >        - ``pic_width_in_luma_samples``
> > >        -
> > > @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      * - __u8
> > >        - ``chroma_format_idc``
> > >        -
> > > +    * - __u8
> > > +      - ``num_slices``
> > > +
> > 
> > Not used, but also doesn't seem part of the SPS syntax. If we have to
> > pass the number of slices, we'll need another mechanism.
> > 
> > >       -
> > >      * - __u64
> > >        - ``flags``
> > >        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
> > > @@ -3231,9 +3243,18 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      :stub-columns: 0
> > >      :widths:       1 1 2
> > >  
> > > +    * - __u8
> > > +      - ``pic_parameter_set_id``
> > > +      - Identifies the PPS for reference by other syntax elements
> > 
> > Not used.
> > 
> > >      * - __u8
> > >        - ``num_extra_slice_header_bits``
> > >        -
> > > +    * - __u8
> > > +      - ``num_ref_idx_l0_default_active_minus1``
> > > +      - Specifies the inferred value of num_ref_idx_l0_active_minus1
> > > +    * - __u8
> > > +      - ``num_ref_idx_l1_default_active_minus1``
> > > +      - Specifies the inferred value of num_ref_idx_l1_active_minus1
> > >      * - __s8
> > >        - ``init_qp_minus26``
> > >        -
> > > @@ -3342,6 +3363,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
> > >        - 0x00040000
> > >        -
> > > +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
> > > +      - 0x00080000
> > > +      -
> > > +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
> > > +      - 0x00100000
> > > +      -
> > >  
> > 
> > I suggest to do all the PPS control changes in a separate patch,
> > feels easier to review and cleaner as you can explain the
> > changes with more detail in the commit description.
> > 
> > Looking at the PPS syntax for tiles, I'm wondering if these
> > deserve their own control, which would be used if tiles are enabled,
> > i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.
> > 
> >         __u8    num_tile_columns_minus1;                                         
> >         __u8    num_tile_rows_minus1;                                            
> >         __u8    column_width_minus1[20];                                         
> >         __u8    row_height_minus1[22];    
> > 
> > Not something we necessarily have to tackle now.
> > 
> > >  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
> > >      Specifies various slice-specific parameters, especially from the NAL 
> unit
> > > @@ -3366,6 +3393,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      * - __u32
> > >        - ``data_bit_offset``
> > >        - Offset (in bits) to the video data in the current slice data.
> > > +    * - __u32
> > > +      - ``slice_segment_addr``
> > > +      - Specifies the address of the first coding tree block in the slice 
> segment
> > 
> > Not used.
> > 
> > > +    * - __u32
> > > +      - ``num_entry_point_offsets``
> > > +      - Specifies the number of entry_point_offset_minus1[ i ] syntax 
> elements in the slice header
> > 
> > Not used.
> 
> While above two fields may not be used in Hantro, they are for sure useful for 
> Cedrus and RPi4. I would like to keep them, otherwise with such approach HEVC 
> will stay in staging for a long time. I'm still baffled why scaling matrix 
> control was dropped. It would fit well in Cedrus and RPi4 driver and after a 
> quick look, it seems that it was used in driver in later patch.
> 

I'd like to make sure each modification we are making to the uAPI
goes in the right direction, that is in the direction of moving
the API out of staging.

Since reviewing each field is quite hard, and opens some discussions,
I wanted to keep this patchset specific to what's needed for Hantro G2.

The Scaling matrix control is certainly a good one, as well as the ones
needed for Cedrus and RPi4. However, I feel it's better to discuss
them in their own "uAPI review" series so we can review all the changes
with an API hat.

This way we decouple the Hantro G2 discussion and work from the API work.

Also please feel free to submit RFC patches fo Cedrus and RPi4
(API and driver changes). We can certainly start the discussion around that,
with driver changes in context.

Hope I'm making sense here :)

Thanks,
Ezequiel

> Best regards,
> Jernej
> 
> > 
> > >      * - __u8
> > >        - ``nal_unit_type``
> > >        -
> > > @@ -3422,28 +3455,20 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      * - __u8
> > >        - ``pic_struct``
> > >        -
> > > -    * - __u8
> > > -      - ``num_active_dpb_entries``
> > > -      - The number of entries in ``dpb``.
> > 
> > Need to explain in the commit description why this field is moved.
> > 
> > >      * - __u8
> > >        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > >        - The list of L0 reference elements as indices in the DPB.
> > >      * - __u8
> > >        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > >        - The list of L1 reference elements as indices in the DPB.
> > > +    * - __u16
> > > +      - ``short_term_ref_pic_set_size``
> > > +
> > 
> > Not used.
> > 
> > >       -
> > > +    * - __u16
> > > +      - ``long_term_ref_pic_set_size``
> > > +      -
> > 
> > Not used.
> > 
> > >      * - __u8
> > > -      - ``num_rps_poc_st_curr_before``
> > > -      - The number of reference pictures in the short-term set that come 
> before
> > > -        the current frame.
> > 
> > If this matches NumPocStCurrBefore from section 8.3.2 "Decoding process for 
> reference picture set"
> > then I would document that. And perhaps rename it to num_poc_st_curr_before.
> > 
> > > -    * - __u8
> > > -      - ``num_rps_poc_st_curr_after``
> > > -      - The number of reference pictures in the short-term set that come 
> after
> > > -        the current frame.
> > 
> > Ditto.
> > 
> > > -    * - __u8
> > > -      - ``num_rps_poc_lt_curr``
> > > -      - The number of reference pictures in the long-term set.
> > 
> > Ditto.
> > 
> > Also, I'd like the changes that move fields from 
> V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
> > to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in their
> > patch.
> > 
> > That will allow us to put in the commit description a proper
> > explanation of why are fields being moved. Nothing fancy, simply
> > explaining that these variables come from section 8.3.2
> > "Decoding process for reference picture set", which describes
> > a process invoked once per picture, so they are not per-slice.
> > 
> > > -    * - __u8
> > > -      - ``padding[7]``
> > > +      - ``padding``
> > >        - Applications and drivers must set this to zero.
> > >      * - struct :c:type:`v4l2_hevc_dpb_entry`
> > >        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > @@ -3646,3 +3671,74 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > >      so this has to come from client.
> > >      This is applicable to H264 and valid Range is from 0 to 63.
> > >      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
> > > +
> > > +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
> > > +    Specifies various decode parameters, especially the references picture 
> order
> > > +    count (POC) for all the lists (short, long, before, current, after) 
> and the
> > > +    number of entries for each of them.
> > > +    These parameters are defined according to :ref:`hevc`.
> > > +    They are described in section 8.3 "Slice decoding process" of the
> > > +    specification.
> > > +
> > > +.. c:type:: v4l2_ctrl_hevc_decode_params
> > > +
> > > +.. cssclass:: longtable
> > > +
> > > +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
> > > +    :header-rows:  0
> > > +    :stub-columns: 0
> > > +    :widths:       1 1 2
> > > +
> > > +    * - __s32
> > > +      - ``pic_order_cnt_val``
> > > +      -
> > 
> > Can be documented as:
> > 
> > """
> > PicOrderCntVal as described in section 8.3.1 "Decoding process
> > for picture order count" of the specification.
> > """
> > 
> > Note that snake case is used to match the kernel style,
> > but other than that we try to keep the HEVC spec variable
> > names.
> > 
> > > +    * - __u8
> > > +      - ``num_active_dpb_entries``
> > > +      - The number of entries in ``dpb``.
> > > +    * - struct :c:type:`v4l2_hevc_dpb_entry`
> > > +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > +      - The decoded picture buffer, for meta-data about reference frames.
> > 
> > The DPB is here, but it seems it's also in the slice control?
> > 
> > > +    * - __u8
> > > +      - ``num_rps_poc_st_curr_before``
> > > +      - The number of reference pictures in the short-term set that come 
> before
> > > +        the current frame.
> > > +    * - __u8
> > > +      - ``num_rps_poc_st_curr_after``
> > > +      - The number of reference pictures in the short-term set that come 
> after
> > > +        the current frame.
> > > +    * - __u8
> > > +      - ``num_rps_poc_lt_curr``
> > > +      - The number of reference pictures in the long-term set.
> > > +    * - __u8
> > > +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > +      -
> > > +    * - __u8
> > > +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > +      -
> > > +    * - __u8
> > > +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > +      -
> > 
> > Could you document these as well?
> > 
> > Thanks a lot,
> > Ezequiel
> > 
> > 
> 
>
Ezequiel Garcia Feb. 25, 2021, 5:55 p.m. UTC | #7
Hi Benjamin,

Thanks for the good work!
I mostly have two concerns with this implementation,
the tiled output and the allocation path.

More below.

On Mon, 2021-02-22 at 13:24 +0100, Benjamin Gaignard wrote:
> Implement all the logic to get G2 hardware decoding HEVC frames.
> It support up level 5.1 HEVC stream.
> It doesn't support yet 10 bits formats or scaling feature.
> 
> Add HANTRO HEVC dedicated control to skip some bits at the beginning
> of the slice header. That is very specific to this hardware so can't
> go into uapi structures. Compute the needed value is complex and require
> information from the stream that only the userland knows so let it
> provide the correct value to the driver.
> 
> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> ---
> version 2:
> - squash multiple commits in this one.
> - fix the comments done by Ezequiel about dma_alloc_coherent usage
> - fix Dan's comments about control copy, reverse the test logic
> in tile_buffer_reallocate, rework some goto and return cases.
> 
>  drivers/staging/media/hantro/Makefile         |   2 +
>  drivers/staging/media/hantro/hantro.h         |  19 +
>  drivers/staging/media/hantro/hantro_drv.c     |  42 ++
>  .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>  drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>  drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>  drivers/staging/media/hantro/hantro_hw.h      |  47 ++
>  7 files changed, 1216 insertions(+)
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>  create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>  create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> 
> 
[snip]
> +
> +static void set_buffers(struct hantro_ctx *ctx)
> +{
> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> +       struct hantro_dev *vpu = ctx->dev;
> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> +       dma_addr_t src_dma, dst_dma;
> +       u32 src_len, src_buf_len;
> +
> +       src_buf = hantro_get_src_buf(ctx);
> +       dst_buf = hantro_get_dst_buf(ctx);
> +
> +       /* Source (stream) buffer. */
> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> +
> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> +
> +       /* Destination (decoded frame) buffer. */
> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> +
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);

The way this is done the raster-scan output is the only
output, and the tiled ouput it kept entirely internal, in hantro_hevc_dec_hw_ctx.ref_bufs.

This means there's no way to expose tiled NV12 in i.MX8M VPU tiled format
to userspace, which could be desirable for some use cases.

I'm not suggesting to actually expose tiled NV12 in this patch, but to prepare 
the driver to be able to support that easily in the future.

It should be possible to consider this detiling as
post-processing, adding some code in hantro_postproc.c
leveraging the existing post-proc support.

IOW, hantro_postproc_ctx.dec_q would hold the tiled output,
hantro_postproc_enable() writes the raster-scan
buffer destination address, and so on.

> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> +}
> +
> +void hantro_g2_check_idle(struct hantro_dev *vpu)
> +{
> +       int i;
> +
> +       for (i = 0; i < 3; i++) {
> +               u32 status;
> +
> +               /* Make sure the VPU is idle */
> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> +               }
> +       }
> +}
> +
> +void hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> +{
> +       struct hantro_dev *vpu = ctx->dev;
> +
> +       hantro_g2_check_idle(vpu);
> +
> +       /* Prepare HEVC decoder context. */
> +       if (hantro_hevc_dec_prepare_run(ctx))
> +               return;
> +
> +       /* Configure hardware registers. */
> +       set_params(ctx);
> +
> +       /* set reference pictures */
> +       if (set_ref(ctx))
> +               /* something goes wrong */
> +               hantro_irq_done(vpu, VB2_BUF_STATE_ERROR);
> +

I don't think we want to allow the _run() operation to fail like this.
In other words, I would avoid allocating buffers in the _run() path,
and doing all allocation at start() time.

That's why hantro_start_streaming() calls hantro_postproc_alloc() if needed.

> +       set_buffers(ctx);
> +       prepare_tile_info_buffer(ctx);
> +
> +       hantro_end_prepare_run(ctx);
> +
> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
> +
> +       /* Don't disable output */
> +       hantro_reg_write(vpu, hevc_out_dis, 0);
> +
> +       /* Don't compress buffers */
> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
> +
> +       /* use NV12 as output format */
> +       hantro_reg_write(vpu, hevc_tile_e, 0);

Unless I'm missing something, this ^

> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);

> +       hantro_reg_write(vpu, hevc_num_tile_rows, 1);
> +       hantro_reg_write(vpu, hevc_num_tile_cols, 1);
> +

and this ^ these shouldn't be here.

HEVC tiles are handled already. See prepare_tile_info_buffer().
Note that HEVC "tiles" are not related to NV12 "tiled" format.

Thanks!
Ezequiel
Jernej Škrabec Feb. 25, 2021, 6:05 p.m. UTC | #8
Dne četrtek, 25. februar 2021 ob 18:34:48 CET je Ezequiel Garcia napisal(a):
> Hey Jernej,
> 
> On Thu, 2021-02-25 at 18:01 +0100, Jernej Škrabec wrote:
> > Hi Ezequiel,
> > 
> > Dne četrtek, 25. februar 2021 ob 14:09:52 CET je Ezequiel Garcia 
napisal(a):
> > > Hi Benjamin,
> > > 
> > > Thanks for the good work.
> > > 
> > > On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> > > > The H.265 ITU specification (section 7.4) define the general
> > > > slice segment header semantics.
> > > > Modified/added fields are:
> > > > - video_parameter_set_id: (7.4.3.1) identifies the VPS for
> > > > reference by other syntax elements.
> > > > - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
> > > > the vps_video_parameter_set_id of the active VPS.
> > > > - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
> > > >  relative to the luma sampling
> > > > - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
> > > > reference by other syntax elements
> > > > - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
> > > > the inferred value of num_ref_idx_l0_active_minus1
> > > > - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
> > > > the inferred value of num_ref_idx_l1_active_minus1
> > > > - slice_segment_addr: (7.4.7.1) specifies the address of
> > > > the first coding tree block in the slice segment
> > > > - num_entry_point_offsets: (7.4.7.1) specifies the number of
> > > > entry_point_offset_minus1[ i ] syntax elements in the slice header
> > > > 
> > > > Add HEVC decode params contains the information used in section
> > > > "8.3 Slice decoding process" of the specification to let the hardware
> > > > perform decoding of a slices.
> > > > 
> > > > Adapt Cedrus driver according to these changes.
> > > > 
> > > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > > ---
> > > > version 3:
> > > > - Add documentation about the new structuers and fields.
> > > > 
> > > > version 2:
> > > > - remove all change related to scaling
> > > > - squash commits to a coherent split
> > > > - be more verbose about the added fields
> > > > 
> > > >  .../media/v4l/ext-ctrls-codec.rst             | 126 ++++++++++++++
+---
> > > >  .../media/v4l/vidioc-queryctrl.rst            |   6 +
> > > >  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
> > > >  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
> > > >  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
> > > >  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
> > > >  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
> > > >  include/media/hevc-ctrls.h                    |  45 +++++--
> > > >  8 files changed, 186 insertions(+), 32 deletions(-)
> > > > 
> > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst 
b/
> > Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > > index 00944e97d638..5e6d77e858c0 100644
> > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > > @@ -3109,6 +3109,15 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > > >      :stub-columns: 0
> > > >      :widths:       1 1 2
> > > >  
> > > > +    * - __u8
> > > > +      - ``video_parameter_set_id``
> > > > +      - Identifies the VPS for reference by other syntax elements
> > > > +    * - __u8
> > > > +      - ``seq_parameter_set_id̀``
> > > > +      - Specifies the value of the vps_video_parameter_set_id of the 
> > active VPS
> > > > +    * - __u8
> > > > +      - ``chroma_format_idc``
> > > > +      - Specifies the chroma sampling relative to the luma sampling
> > > 
> > > None of these fields seem needed for the Hantro G2 driver,
> > > so I suggest you drop them for now.
> > > 
> > > >      * - __u16
> > > >        - ``pic_width_in_luma_samples``
> > > >        -
> > > > @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > > >      * - __u8
> > > >        - ``chroma_format_idc``
> > > >        -
> > > > +    * - __u8
> > > > +      - ``num_slices``
> > > > +
> > > 
> > > Not used, but also doesn't seem part of the SPS syntax. If we have to
> > > pass the number of slices, we'll need another mechanism.
> > > 
> > > >       -
> > > >      * - __u64
> > > >        - ``flags``
> > > >        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
> > > > @@ -3231,9 +3243,18 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > > >      :stub-columns: 0
> > > >      :widths:       1 1 2
> > > >  
> > > > +    * - __u8
> > > > +      - ``pic_parameter_set_id``
> > > > +      - Identifies the PPS for reference by other syntax elements
> > > 
> > > Not used.
> > > 
> > > >      * - __u8
> > > >        - ``num_extra_slice_header_bits``
> > > >        -
> > > > +    * - __u8
> > > > +      - ``num_ref_idx_l0_default_active_minus1``
> > > > +      - Specifies the inferred value of num_ref_idx_l0_active_minus1
> > > > +    * - __u8
> > > > +      - ``num_ref_idx_l1_default_active_minus1``
> > > > +      - Specifies the inferred value of num_ref_idx_l1_active_minus1
> > > >      * - __s8
> > > >        - ``init_qp_minus26``
> > > >        -
> > > > @@ -3342,6 +3363,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > > >      * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
> > > >        - 0x00040000
> > > >        -
> > > > +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
> > > > +      - 0x00080000
> > > > +      -
> > > > +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
> > > > +      - 0x00100000
> > > > +      -
> > > >  
> > > 
> > > I suggest to do all the PPS control changes in a separate patch,
> > > feels easier to review and cleaner as you can explain the
> > > changes with more detail in the commit description.
> > > 
> > > Looking at the PPS syntax for tiles, I'm wondering if these
> > > deserve their own control, which would be used if tiles are enabled,
> > > i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.
> > > 
> > >         __u8    
num_tile_columns_minus1;                                         
> > >         __u8    
num_tile_rows_minus1;                                            
> > >         __u8    
column_width_minus1[20];                                         
> > >         __u8    row_height_minus1[22];    
> > > 
> > > Not something we necessarily have to tackle now.
> > > 
> > > >  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
> > > >      Specifies various slice-specific parameters, especially from the 
NAL 
> > unit
> > > > @@ -3366,6 +3393,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > > >      * - __u32
> > > >        - ``data_bit_offset``
> > > >        - Offset (in bits) to the video data in the current slice data.
> > > > +    * - __u32
> > > > +      - ``slice_segment_addr``
> > > > +      - Specifies the address of the first coding tree block in the 
slice 
> > segment
> > > 
> > > Not used.
> > > 
> > > > +    * - __u32
> > > > +      - ``num_entry_point_offsets``
> > > > +      - Specifies the number of entry_point_offset_minus1[ i ] syntax 
> > elements in the slice header
> > > 
> > > Not used.
> > 
> > While above two fields may not be used in Hantro, they are for sure useful 
for 
> > Cedrus and RPi4. I would like to keep them, otherwise with such approach 
HEVC 
> > will stay in staging for a long time. I'm still baffled why scaling matrix 
> > control was dropped. It would fit well in Cedrus and RPi4 driver and after 
a 
> > quick look, it seems that it was used in driver in later patch.
> > 
> 
> I'd like to make sure each modification we are making to the uAPI
> goes in the right direction, that is in the direction of moving
> the API out of staging.
> 
> Since reviewing each field is quite hard, and opens some discussions,
> I wanted to keep this patchset specific to what's needed for Hantro G2.
> 
> The Scaling matrix control is certainly a good one, as well as the ones
> needed for Cedrus and RPi4. However, I feel it's better to discuss
> them in their own "uAPI review" series so we can review all the changes
> with an API hat.
> 
> This way we decouple the Hantro G2 discussion and work from the API work.
> 
> Also please feel free to submit RFC patches fo Cedrus and RPi4
> (API and driver changes). We can certainly start the discussion around that,
> with driver changes in context.

I don't know much about RPi4 driver, only few implementation details, so 
you'll have to ping developer who wrote it. Regarding HEVC on Cedrus - it has 
one pain point - it needs entry point table which in turn needs support for 
variable arrays in order to be feasable AFAIK. I don't plan to develop that. 
Patches for scaling matrix and segment address were sent a bit more than a 
year ago but were turned down because they change control structures (among 
other things). Sorry to say, but I work on other things now, so Cedrus will 
have to wait. Alternatively, someone can take my patches from LibreELEC, 
update and submit them. They are in use for a long time.

Best regards,
Jernej

> 
> Hope I'm making sense here :)
> 
> Thanks,
> Ezequiel
> 
> > Best regards,
> > Jernej
> > 
> > > 
> > > >      * - __u8
> > > >        - ``nal_unit_type``
> > > >        -
> > > > @@ -3422,28 +3455,20 @@ enum v4l2_mpeg_video_hevc_size_of_length_field 
-
> > > >      * - __u8
> > > >        - ``pic_struct``
> > > >        -
> > > > -    * - __u8
> > > > -      - ``num_active_dpb_entries``
> > > > -      - The number of entries in ``dpb``.
> > > 
> > > Need to explain in the commit description why this field is moved.
> > > 
> > > >      * - __u8
> > > >        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > >        - The list of L0 reference elements as indices in the DPB.
> > > >      * - __u8
> > > >        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > >        - The list of L1 reference elements as indices in the DPB.
> > > > +    * - __u16
> > > > +      - ``short_term_ref_pic_set_size``
> > > > +
> > > 
> > > Not used.
> > > 
> > > >       -
> > > > +    * - __u16
> > > > +      - ``long_term_ref_pic_set_size``
> > > > +      -
> > > 
> > > Not used.
> > > 
> > > >      * - __u8
> > > > -      - ``num_rps_poc_st_curr_before``
> > > > -      - The number of reference pictures in the short-term set that 
come 
> > before
> > > > -        the current frame.
> > > 
> > > If this matches NumPocStCurrBefore from section 8.3.2 "Decoding process 
for 
> > reference picture set"
> > > then I would document that. And perhaps rename it to 
num_poc_st_curr_before.
> > > 
> > > > -    * - __u8
> > > > -      - ``num_rps_poc_st_curr_after``
> > > > -      - The number of reference pictures in the short-term set that 
come 
> > after
> > > > -        the current frame.
> > > 
> > > Ditto.
> > > 
> > > > -    * - __u8
> > > > -      - ``num_rps_poc_lt_curr``
> > > > -      - The number of reference pictures in the long-term set.
> > > 
> > > Ditto.
> > > 
> > > Also, I'd like the changes that move fields from 
> > V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
> > > to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in 
their
> > > patch.
> > > 
> > > That will allow us to put in the commit description a proper
> > > explanation of why are fields being moved. Nothing fancy, simply
> > > explaining that these variables come from section 8.3.2
> > > "Decoding process for reference picture set", which describes
> > > a process invoked once per picture, so they are not per-slice.
> > > 
> > > > -    * - __u8
> > > > -      - ``padding[7]``
> > > > +      - ``padding``
> > > >        - Applications and drivers must set this to zero.
> > > >      * - struct :c:type:`v4l2_hevc_dpb_entry`
> > > >        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > @@ -3646,3 +3671,74 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> > > >      so this has to come from client.
> > > >      This is applicable to H264 and valid Range is from 0 to 63.
> > > >      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
> > > > +
> > > > +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
> > > > +    Specifies various decode parameters, especially the references 
picture 
> > order
> > > > +    count (POC) for all the lists (short, long, before, current, 
after) 
> > and the
> > > > +    number of entries for each of them.
> > > > +    These parameters are defined according to :ref:`hevc`.
> > > > +    They are described in section 8.3 "Slice decoding process" of the
> > > > +    specification.
> > > > +
> > > > +.. c:type:: v4l2_ctrl_hevc_decode_params
> > > > +
> > > > +.. cssclass:: longtable
> > > > +
> > > > +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
> > > > +    :header-rows:  0
> > > > +    :stub-columns: 0
> > > > +    :widths:       1 1 2
> > > > +
> > > > +    * - __s32
> > > > +      - ``pic_order_cnt_val``
> > > > +      -
> > > 
> > > Can be documented as:
> > > 
> > > """
> > > PicOrderCntVal as described in section 8.3.1 "Decoding process
> > > for picture order count" of the specification.
> > > """
> > > 
> > > Note that snake case is used to match the kernel style,
> > > but other than that we try to keep the HEVC spec variable
> > > names.
> > > 
> > > > +    * - __u8
> > > > +      - ``num_active_dpb_entries``
> > > > +      - The number of entries in ``dpb``.
> > > > +    * - struct :c:type:`v4l2_hevc_dpb_entry`
> > > > +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > +      - The decoded picture buffer, for meta-data about reference 
frames.
> > > 
> > > The DPB is here, but it seems it's also in the slice control?
> > > 
> > > > +    * - __u8
> > > > +      - ``num_rps_poc_st_curr_before``
> > > > +      - The number of reference pictures in the short-term set that 
come 
> > before
> > > > +        the current frame.
> > > > +    * - __u8
> > > > +      - ``num_rps_poc_st_curr_after``
> > > > +      - The number of reference pictures in the short-term set that 
come 
> > after
> > > > +        the current frame.
> > > > +    * - __u8
> > > > +      - ``num_rps_poc_lt_curr``
> > > > +      - The number of reference pictures in the long-term set.
> > > > +    * - __u8
> > > > +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > +      -
> > > > +    * - __u8
> > > > +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > +      -
> > > > +    * - __u8
> > > > +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > +      -
> > > 
> > > Could you document these as well?
> > > 
> > > Thanks a lot,
> > > Ezequiel
> > > 
> > > 
> > 
> > 
> 
> 
>
John Cox Feb. 25, 2021, 6:34 p.m. UTC | #9
On Thu, 25 Feb 2021 19:05:55 +0100, you wrote:

>Dne ?etrtek, 25. februar 2021 ob 18:34:48 CET je Ezequiel Garcia napisal(a):
>> Hey Jernej,
>> 
>> On Thu, 2021-02-25 at 18:01 +0100, Jernej Škrabec wrote:
>> > Hi Ezequiel,
>> > 
>> > Dne ?etrtek, 25. februar 2021 ob 14:09:52 CET je Ezequiel Garcia 
>napisal(a):
>> > > Hi Benjamin,
>> > > 
>> > > Thanks for the good work.
>> > > 
>> > > On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
>> > > > The H.265 ITU specification (section 7.4) define the general
>> > > > slice segment header semantics.
>> > > > Modified/added fields are:
>> > > > - video_parameter_set_id: (7.4.3.1) identifies the VPS for
>> > > > reference by other syntax elements.
>> > > > - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
>> > > > the vps_video_parameter_set_id of the active VPS.
>> > > > - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
>> > > >  relative to the luma sampling
>> > > > - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
>> > > > reference by other syntax elements
>> > > > - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
>> > > > the inferred value of num_ref_idx_l0_active_minus1
>> > > > - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
>> > > > the inferred value of num_ref_idx_l1_active_minus1
>> > > > - slice_segment_addr: (7.4.7.1) specifies the address of
>> > > > the first coding tree block in the slice segment
>> > > > - num_entry_point_offsets: (7.4.7.1) specifies the number of
>> > > > entry_point_offset_minus1[ i ] syntax elements in the slice header
>> > > > 
>> > > > Add HEVC decode params contains the information used in section
>> > > > "8.3 Slice decoding process" of the specification to let the hardware
>> > > > perform decoding of a slices.
>> > > > 
>> > > > Adapt Cedrus driver according to these changes.
>> > > > 
>> > > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> > > > ---
>> > > > version 3:
>> > > > - Add documentation about the new structuers and fields.
>> > > > 
>> > > > version 2:
>> > > > - remove all change related to scaling
>> > > > - squash commits to a coherent split
>> > > > - be more verbose about the added fields
>> > > > 
>> > > >  .../media/v4l/ext-ctrls-codec.rst             | 126 ++++++++++++++
>+---
>> > > >  .../media/v4l/vidioc-queryctrl.rst            |   6 +
>> > > >  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
>> > > >  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
>> > > >  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
>> > > >  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
>> > > >  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
>> > > >  include/media/hevc-ctrls.h                    |  45 +++++--
>> > > >  8 files changed, 186 insertions(+), 32 deletions(-)
>> > > > 
>> > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst 
>b/
>> > Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> > > > index 00944e97d638..5e6d77e858c0 100644
>> > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
>> > > > @@ -3109,6 +3109,15 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > >      :stub-columns: 0
>> > > >      :widths:       1 1 2
>> > > >  
>> > > > +    * - __u8
>> > > > +      - ``video_parameter_set_id``
>> > > > +      - Identifies the VPS for reference by other syntax elements
>> > > > +    * - __u8
>> > > > +      - ``seq_parameter_set_id?``
>> > > > +      - Specifies the value of the vps_video_parameter_set_id of the 
>> > active VPS
>> > > > +    * - __u8
>> > > > +      - ``chroma_format_idc``
>> > > > +      - Specifies the chroma sampling relative to the luma sampling
>> > > 
>> > > None of these fields seem needed for the Hantro G2 driver,
>> > > so I suggest you drop them for now.
>> > > 
>> > > >      * - __u16
>> > > >        - ``pic_width_in_luma_samples``
>> > > >        -
>> > > > @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > >      * - __u8
>> > > >        - ``chroma_format_idc``
>> > > >        -
>> > > > +    * - __u8
>> > > > +      - ``num_slices``
>> > > > +
>> > > 
>> > > Not used, but also doesn't seem part of the SPS syntax. If we have to
>> > > pass the number of slices, we'll need another mechanism.
>> > > 
>> > > >       -
>> > > >      * - __u64
>> > > >        - ``flags``
>> > > >        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
>> > > > @@ -3231,9 +3243,18 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > >      :stub-columns: 0
>> > > >      :widths:       1 1 2
>> > > >  
>> > > > +    * - __u8
>> > > > +      - ``pic_parameter_set_id``
>> > > > +      - Identifies the PPS for reference by other syntax elements
>> > > 
>> > > Not used.
>> > > 
>> > > >      * - __u8
>> > > >        - ``num_extra_slice_header_bits``
>> > > >        -
>> > > > +    * - __u8
>> > > > +      - ``num_ref_idx_l0_default_active_minus1``
>> > > > +      - Specifies the inferred value of num_ref_idx_l0_active_minus1
>> > > > +    * - __u8
>> > > > +      - ``num_ref_idx_l1_default_active_minus1``
>> > > > +      - Specifies the inferred value of num_ref_idx_l1_active_minus1
>> > > >      * - __s8
>> > > >        - ``init_qp_minus26``
>> > > >        -
>> > > > @@ -3342,6 +3363,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > >      * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
>> > > >        - 0x00040000
>> > > >        -
>> > > > +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
>> > > > +      - 0x00080000
>> > > > +      -
>> > > > +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
>> > > > +      - 0x00100000
>> > > > +      -
>> > > >  
>> > > 
>> > > I suggest to do all the PPS control changes in a separate patch,
>> > > feels easier to review and cleaner as you can explain the
>> > > changes with more detail in the commit description.
>> > > 
>> > > Looking at the PPS syntax for tiles, I'm wondering if these
>> > > deserve their own control, which would be used if tiles are enabled,
>> > > i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.
>> > > 
>> > >         __u8    
>num_tile_columns_minus1;                                         
>> > >         __u8    
>num_tile_rows_minus1;                                            
>> > >         __u8    
>column_width_minus1[20];                                         
>> > >         __u8    row_height_minus1[22];    
>> > > 
>> > > Not something we necessarily have to tackle now.
>> > > 
>> > > >  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
>> > > >      Specifies various slice-specific parameters, especially from the 
>NAL 
>> > unit
>> > > > @@ -3366,6 +3393,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > >      * - __u32
>> > > >        - ``data_bit_offset``
>> > > >        - Offset (in bits) to the video data in the current slice data.
>> > > > +    * - __u32
>> > > > +      - ``slice_segment_addr``
>> > > > +      - Specifies the address of the first coding tree block in the 
>slice 
>> > segment
>> > > 
>> > > Not used.
>> > > 
>> > > > +    * - __u32
>> > > > +      - ``num_entry_point_offsets``
>> > > > +      - Specifies the number of entry_point_offset_minus1[ i ] syntax 
>> > elements in the slice header
>> > > 
>> > > Not used.
>> > 
>> > While above two fields may not be used in Hantro, they are for sure useful 
>for 
>> > Cedrus and RPi4. I would like to keep them, otherwise with such approach 
>HEVC 
>> > will stay in staging for a long time. I'm still baffled why scaling matrix 
>> > control was dropped. It would fit well in Cedrus and RPi4 driver and after 
>a 
>> > quick look, it seems that it was used in driver in later patch.
>> > 
>> 
>> I'd like to make sure each modification we are making to the uAPI
>> goes in the right direction, that is in the direction of moving
>> the API out of staging.
>> 
>> Since reviewing each field is quite hard, and opens some discussions,
>> I wanted to keep this patchset specific to what's needed for Hantro G2.
>> 
>> The Scaling matrix control is certainly a good one, as well as the ones
>> needed for Cedrus and RPi4. However, I feel it's better to discuss
>> them in their own "uAPI review" series so we can review all the changes
>> with an API hat.
>> 
>> This way we decouple the Hantro G2 discussion and work from the API work.
>> 
>> Also please feel free to submit RFC patches fo Cedrus and RPi4
>> (API and driver changes). We can certainly start the discussion around that,
>> with driver changes in context.
>
>I don't know much about RPi4 driver, only few implementation details, so 
>you'll have to ping developer who wrote it. Regarding HEVC on Cedrus - it has 
>one pain point - it needs entry point table which in turn needs support for 
>variable arrays in order to be feasable AFAIK. I don't plan to develop that. 
>Patches for scaling matrix and segment address were sent a bit more than a 
>year ago but were turned down because they change control structures (among 
>other things). Sorry to say, but I work on other things now, so Cedrus will 
>have to wait. Alternatively, someone can take my patches from LibreELEC, 
>update and submit them. They are in use for a long time.

It seems to me that H.265 should be the source for what fields are
needed in the uapi - not whatever happens to be supported by current
h/w. The standard defines the list of fields that can occur in the
parameter sets and headers, and everything that is needed to decode 
any slice_seqgment_data should be in the upai.  Eventually all those
fields are going to be needed (they aren't in the standard just to look
pretty) and given the reluctance I've observed to change the uapi once
released they should be there from the start.

Now some hardware requires more fields that aren't in the standard -
those can (and should) be added in h/w specific extensions.

Regards

JC

>Best regards,
>Jernej
>
>> 
>> Hope I'm making sense here :)
>> 
>> Thanks,
>> Ezequiel
>> 
>> > Best regards,
>> > Jernej
>> > 
>> > > 
>> > > >      * - __u8
>> > > >        - ``nal_unit_type``
>> > > >        -
>> > > > @@ -3422,28 +3455,20 @@ enum v4l2_mpeg_video_hevc_size_of_length_field 
>-
>> > > >      * - __u8
>> > > >        - ``pic_struct``
>> > > >        -
>> > > > -    * - __u8
>> > > > -      - ``num_active_dpb_entries``
>> > > > -      - The number of entries in ``dpb``.
>> > > 
>> > > Need to explain in the commit description why this field is moved.
>> > > 
>> > > >      * - __u8
>> > > >        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > >        - The list of L0 reference elements as indices in the DPB.
>> > > >      * - __u8
>> > > >        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > >        - The list of L1 reference elements as indices in the DPB.
>> > > > +    * - __u16
>> > > > +      - ``short_term_ref_pic_set_size``
>> > > > +
>> > > 
>> > > Not used.
>> > > 
>> > > >       -
>> > > > +    * - __u16
>> > > > +      - ``long_term_ref_pic_set_size``
>> > > > +      -
>> > > 
>> > > Not used.
>> > > 
>> > > >      * - __u8
>> > > > -      - ``num_rps_poc_st_curr_before``
>> > > > -      - The number of reference pictures in the short-term set that 
>come 
>> > before
>> > > > -        the current frame.
>> > > 
>> > > If this matches NumPocStCurrBefore from section 8.3.2 "Decoding process 
>for 
>> > reference picture set"
>> > > then I would document that. And perhaps rename it to 
>num_poc_st_curr_before.
>> > > 
>> > > > -    * - __u8
>> > > > -      - ``num_rps_poc_st_curr_after``
>> > > > -      - The number of reference pictures in the short-term set that 
>come 
>> > after
>> > > > -        the current frame.
>> > > 
>> > > Ditto.
>> > > 
>> > > > -    * - __u8
>> > > > -      - ``num_rps_poc_lt_curr``
>> > > > -      - The number of reference pictures in the long-term set.
>> > > 
>> > > Ditto.
>> > > 
>> > > Also, I'd like the changes that move fields from 
>> > V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
>> > > to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in 
>their
>> > > patch.
>> > > 
>> > > That will allow us to put in the commit description a proper
>> > > explanation of why are fields being moved. Nothing fancy, simply
>> > > explaining that these variables come from section 8.3.2
>> > > "Decoding process for reference picture set", which describes
>> > > a process invoked once per picture, so they are not per-slice.
>> > > 
>> > > > -    * - __u8
>> > > > -      - ``padding[7]``
>> > > > +      - ``padding``
>> > > >        - Applications and drivers must set this to zero.
>> > > >      * - struct :c:type:`v4l2_hevc_dpb_entry`
>> > > >        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > > @@ -3646,3 +3671,74 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
>> > > >      so this has to come from client.
>> > > >      This is applicable to H264 and valid Range is from 0 to 63.
>> > > >      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
>> > > > +
>> > > > +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
>> > > > +    Specifies various decode parameters, especially the references 
>picture 
>> > order
>> > > > +    count (POC) for all the lists (short, long, before, current, 
>after) 
>> > and the
>> > > > +    number of entries for each of them.
>> > > > +    These parameters are defined according to :ref:`hevc`.
>> > > > +    They are described in section 8.3 "Slice decoding process" of the
>> > > > +    specification.
>> > > > +
>> > > > +.. c:type:: v4l2_ctrl_hevc_decode_params
>> > > > +
>> > > > +.. cssclass:: longtable
>> > > > +
>> > > > +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
>> > > > +    :header-rows:  0
>> > > > +    :stub-columns: 0
>> > > > +    :widths:       1 1 2
>> > > > +
>> > > > +    * - __s32
>> > > > +      - ``pic_order_cnt_val``
>> > > > +      -
>> > > 
>> > > Can be documented as:
>> > > 
>> > > """
>> > > PicOrderCntVal as described in section 8.3.1 "Decoding process
>> > > for picture order count" of the specification.
>> > > """
>> > > 
>> > > Note that snake case is used to match the kernel style,
>> > > but other than that we try to keep the HEVC spec variable
>> > > names.
>> > > 
>> > > > +    * - __u8
>> > > > +      - ``num_active_dpb_entries``
>> > > > +      - The number of entries in ``dpb``.
>> > > > +    * - struct :c:type:`v4l2_hevc_dpb_entry`
>> > > > +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > > +      - The decoded picture buffer, for meta-data about reference 
>frames.
>> > > 
>> > > The DPB is here, but it seems it's also in the slice control?
>> > > 
>> > > > +    * - __u8
>> > > > +      - ``num_rps_poc_st_curr_before``
>> > > > +      - The number of reference pictures in the short-term set that 
>come 
>> > before
>> > > > +        the current frame.
>> > > > +    * - __u8
>> > > > +      - ``num_rps_poc_st_curr_after``
>> > > > +      - The number of reference pictures in the short-term set that 
>come 
>> > after
>> > > > +        the current frame.
>> > > > +    * - __u8
>> > > > +      - ``num_rps_poc_lt_curr``
>> > > > +      - The number of reference pictures in the long-term set.
>> > > > +    * - __u8
>> > > > +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > > +      -
>> > > > +    * - __u8
>> > > > +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > > +      -
>> > > > +    * - __u8
>> > > > +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
>> > > > +      -
>> > > 
>> > > Could you document these as well?
>> > > 
>> > > Thanks a lot,
>> > > Ezequiel
>> > > 
>> > > 
>> > 
>> > 
>> 
>> 
>> 
>
Nicolas Dufresne Feb. 25, 2021, 6:35 p.m. UTC | #10
Le jeudi 25 février 2021 à 10:09 -0300, Ezequiel Garcia a écrit :
> Hi Benjamin,
> 
> Thanks for the good work.
> 
> On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> > The H.265 ITU specification (section 7.4) define the general
> > slice segment header semantics.
> > Modified/added fields are:
> > - video_parameter_set_id: (7.4.3.1) identifies the VPS for
> > reference by other syntax elements.
> > - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
> > the vps_video_parameter_set_id of the active VPS.
> > - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
> >  relative to the luma sampling
> > - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
> > reference by other syntax elements
> > - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
> > the inferred value of num_ref_idx_l0_active_minus1
> > - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
> > the inferred value of num_ref_idx_l1_active_minus1
> > - slice_segment_addr: (7.4.7.1) specifies the address of
> > the first coding tree block in the slice segment
> > - num_entry_point_offsets: (7.4.7.1) specifies the number of
> > entry_point_offset_minus1[ i ] syntax elements in the slice header
> > 
> > Add HEVC decode params contains the information used in section
> > "8.3 Slice decoding process" of the specification to let the hardware
> > perform decoding of a slices.
> > 
> > Adapt Cedrus driver according to these changes.
> > 
> > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > ---
> > version 3:
> > - Add documentation about the new structuers and fields.
> > 
> > version 2:
> > - remove all change related to scaling
> > - squash commits to a coherent split
> > - be more verbose about the added fields
> > 
> >  .../media/v4l/ext-ctrls-codec.rst             | 126 +++++++++++++++---
> >  .../media/v4l/vidioc-queryctrl.rst            |   6 +
> >  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
> >  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
> >  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
> >  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
> >  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
> >  include/media/hevc-ctrls.h                    |  45 +++++--
> >  8 files changed, 186 insertions(+), 32 deletions(-)
> > 
> > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > index 00944e97d638..5e6d77e858c0 100644
> > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > @@ -3109,6 +3109,15 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      :stub-columns: 0
> >      :widths:       1 1 2
> >  
> > +    * - __u8
> > +      - ``video_parameter_set_id``
> > +      - Identifies the VPS for reference by other syntax elements
> > +    * - __u8
> > +      - ``seq_parameter_set_id̀``
> > +      - Specifies the value of the vps_video_parameter_set_id of the active
> > VPS
> > +    * - __u8
> > +      - ``chroma_format_idc``
> > +      - Specifies the chroma sampling relative to the luma sampling
> 
> None of these fields seem needed for the Hantro G2 driver,
> so I suggest you drop them for now.

You should be using chroma_format_idc in your validation. I think Hantro G2 does
not do YUV 4:4:4, which is what chroma_format_idc tells the driver. Our
implementation might also be missing 4:0:0 (black and white).

As this value will aftect the buffer size, please keep them, and ideally
validate them. Note that it feels rather problematic / unsafe situation as we
endup having to trust userspace, hopefully the HW validates it's buffer size ?
(do we tell the HW the buffer sizes ?)

> 
> >      * - __u16
> >        - ``pic_width_in_luma_samples``
> >        -
> > @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - __u8
> >        - ``chroma_format_idc``
> >        -
> > +    * - __u8
> > +      - ``num_slices``
> > +
> 
> Not used, but also doesn't seem part of the SPS syntax. If we have to
> pass the number of slices, we'll need another mechanism.
> 
> >       -
> >      * - __u64
> >        - ``flags``
> >        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
> > @@ -3231,9 +3243,18 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      :stub-columns: 0
> >      :widths:       1 1 2
> >  
> > +    * - __u8
> > +      - ``pic_parameter_set_id``
> > +      - Identifies the PPS for reference by other syntax elements
> 
> Not used.
> 
> >      * - __u8
> >        - ``num_extra_slice_header_bits``
> >        -
> > +    * - __u8
> > +      - ``num_ref_idx_l0_default_active_minus1``
> > +      - Specifies the inferred value of num_ref_idx_l0_active_minus1
> > +    * - __u8
> > +      - ``num_ref_idx_l1_default_active_minus1``
> > +      - Specifies the inferred value of num_ref_idx_l1_active_minus1
> >      * - __s8
> >        - ``init_qp_minus26``
> >        -
> > @@ -3342,6 +3363,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
> >        - 0x00040000
> >        -
> > +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
> > +      - 0x00080000
> > +      -
> > +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
> > +      - 0x00100000
> > +      -
> >  
> 
> I suggest to do all the PPS control changes in a separate patch,
> feels easier to review and cleaner as you can explain the
> changes with more detail in the commit description.
> 
> Looking at the PPS syntax for tiles, I'm wondering if these
> deserve their own control, which would be used if tiles are enabled,
> i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.
> 
>         __u8   
> num_tile_columns_minus1;                                         
>         __u8   
> num_tile_rows_minus1;                                            
>         __u8   
> column_width_minus1[20];                                         
>         __u8    row_height_minus1[22];    
> 
> Not something we necessarily have to tackle now.
> 
> >  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
> >      Specifies various slice-specific parameters, especially from the NAL
> > unit
> > @@ -3366,6 +3393,12 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - __u32
> >        - ``data_bit_offset``
> >        - Offset (in bits) to the video data in the current slice data.
> > +    * - __u32
> > +      - ``slice_segment_addr``
> > +      - Specifies the address of the first coding tree block in the slice
> > segment
> 
> Not used.
> 
> > +    * - __u32
> > +      - ``num_entry_point_offsets``
> > +      - Specifies the number of entry_point_offset_minus1[ i ] syntax
> > elements in the slice header
> 
> Not used.
> 
> >      * - __u8
> >        - ``nal_unit_type``
> >        -
> > @@ -3422,28 +3455,20 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      * - __u8
> >        - ``pic_struct``
> >        -
> > -    * - __u8
> > -      - ``num_active_dpb_entries``
> > -      - The number of entries in ``dpb``.
> 
> Need to explain in the commit description why this field is moved.
> 
> >      * - __u8
> >        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> >        - The list of L0 reference elements as indices in the DPB.
> >      * - __u8
> >        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> >        - The list of L1 reference elements as indices in the DPB.
> > +    * - __u16
> > +      - ``short_term_ref_pic_set_size``
> > +
> 
> Not used.
> 
> >       -
> > +    * - __u16
> > +      - ``long_term_ref_pic_set_size``
> > +      -
> 
> Not used.
> 
> >      * - __u8
> > -      - ``num_rps_poc_st_curr_before``
> > -      - The number of reference pictures in the short-term set that come
> > before
> > -        the current frame.
> 
> If this matches NumPocStCurrBefore from section 8.3.2 "Decoding process for
> reference picture set"
> then I would document that. And perhaps rename it to num_poc_st_curr_before.
> 
> > -    * - __u8
> > -      - ``num_rps_poc_st_curr_after``
> > -      - The number of reference pictures in the short-term set that come
> > after
> > -        the current frame.
> 
> Ditto.
> 
> > -    * - __u8
> > -      - ``num_rps_poc_lt_curr``
> > -      - The number of reference pictures in the long-term set.
> 
> Ditto.
> 
> Also, I'd like the changes that move fields from
> V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
> to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in their
> patch.
> 
> That will allow us to put in the commit description a proper
> explanation of why are fields being moved. Nothing fancy, simply
> explaining that these variables come from section 8.3.2
> "Decoding process for reference picture set", which describes
> a process invoked once per picture, so they are not per-slice.
> 
> > -    * - __u8
> > -      - ``padding[7]``
> > +      - ``padding``
> >        - Applications and drivers must set this to zero.
> >      * - struct :c:type:`v4l2_hevc_dpb_entry`
> >        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > @@ -3646,3 +3671,74 @@ enum v4l2_mpeg_video_hevc_size_of_length_field -
> >      so this has to come from client.
> >      This is applicable to H264 and valid Range is from 0 to 63.
> >      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
> > +
> > +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
> > +    Specifies various decode parameters, especially the references picture
> > order
> > +    count (POC) for all the lists (short, long, before, current, after) and
> > the
> > +    number of entries for each of them.
> > +    These parameters are defined according to :ref:`hevc`.
> > +    They are described in section 8.3 "Slice decoding process" of the
> > +    specification.
> > +
> > +.. c:type:: v4l2_ctrl_hevc_decode_params
> > +
> > +.. cssclass:: longtable
> > +
> > +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
> > +    :header-rows:  0
> > +    :stub-columns: 0
> > +    :widths:       1 1 2
> > +
> > +    * - __s32
> > +      - ``pic_order_cnt_val``
> > +      -
> 
> Can be documented as:
> 
> """
> PicOrderCntVal as described in section 8.3.1 "Decoding process
> for picture order count" of the specification.
> """
> 
> Note that snake case is used to match the kernel style,
> but other than that we try to keep the HEVC spec variable
> names.
> 
> > +    * - __u8
> > +      - ``num_active_dpb_entries``
> > +      - The number of entries in ``dpb``.
> > +    * - struct :c:type:`v4l2_hevc_dpb_entry`
> > +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      - The decoded picture buffer, for meta-data about reference frames.
> 
> The DPB is here, but it seems it's also in the slice control?
> 
> > +    * - __u8
> > +      - ``num_rps_poc_st_curr_before``
> > +      - The number of reference pictures in the short-term set that come
> > before
> > +        the current frame.
> > +    * - __u8
> > +      - ``num_rps_poc_st_curr_after``
> > +      - The number of reference pictures in the short-term set that come
> > after
> > +        the current frame.
> > +    * - __u8
> > +      - ``num_rps_poc_lt_curr``
> > +      - The number of reference pictures in the long-term set.
> > +    * - __u8
> > +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      -
> > +    * - __u8
> > +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      -
> > +    * - __u8
> > +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > +      -
> 
> Could you document these as well?
> 
> Thanks a lot,
> Ezequiel
> 
>
Nicolas Dufresne Feb. 25, 2021, 7:11 p.m. UTC | #11
Le jeudi 25 février 2021 à 18:34 +0000, John Cox a écrit :
> On Thu, 25 Feb 2021 19:05:55 +0100, you wrote:
> 
> > Dne ?etrtek, 25. februar 2021 ob 18:34:48 CET je Ezequiel Garcia napisal(a):
> > > Hey Jernej,
> > > 
> > > On Thu, 2021-02-25 at 18:01 +0100, Jernej Škrabec wrote:
> > > > Hi Ezequiel,
> > > > 
> > > > Dne ?etrtek, 25. februar 2021 ob 14:09:52 CET je Ezequiel Garcia 
> > napisal(a):
> > > > > Hi Benjamin,
> > > > > 
> > > > > Thanks for the good work.
> > > > > 
> > > > > On Mon, 2021-02-22 at 13:23 +0100, Benjamin Gaignard wrote:
> > > > > > The H.265 ITU specification (section 7.4) define the general
> > > > > > slice segment header semantics.
> > > > > > Modified/added fields are:
> > > > > > - video_parameter_set_id: (7.4.3.1) identifies the VPS for
> > > > > > reference by other syntax elements.
> > > > > > - seq_parameter_set_id: (7.4.3.2.1) specifies the value of
> > > > > > the vps_video_parameter_set_id of the active VPS.
> > > > > > - chroma_format_idc: (7.4.3.2.1) specifies the chroma sampling
> > > > > >  relative to the luma sampling
> > > > > > - pic_parameter_set_id: (7.4.3.3.1) identifies the PPS for
> > > > > > reference by other syntax elements
> > > > > > - num_ref_idx_l0_default_active_minus1: (7.4.3.3.1) specifies
> > > > > > the inferred value of num_ref_idx_l0_active_minus1
> > > > > > - num_ref_idx_l1_default_active_minus1: (7.4.3.3.1) specifies
> > > > > > the inferred value of num_ref_idx_l1_active_minus1
> > > > > > - slice_segment_addr: (7.4.7.1) specifies the address of
> > > > > > the first coding tree block in the slice segment
> > > > > > - num_entry_point_offsets: (7.4.7.1) specifies the number of
> > > > > > entry_point_offset_minus1[ i ] syntax elements in the slice header
> > > > > > 
> > > > > > Add HEVC decode params contains the information used in section
> > > > > > "8.3 Slice decoding process" of the specification to let the
> > > > > > hardware
> > > > > > perform decoding of a slices.
> > > > > > 
> > > > > > Adapt Cedrus driver according to these changes.
> > > > > > 
> > > > > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > > > > ---
> > > > > > version 3:
> > > > > > - Add documentation about the new structuers and fields.
> > > > > > 
> > > > > > version 2:
> > > > > > - remove all change related to scaling
> > > > > > - squash commits to a coherent split
> > > > > > - be more verbose about the added fields
> > > > > > 
> > > > > >  .../media/v4l/ext-ctrls-codec.rst             | 126 ++++++++++++++
> > +---
> > > > > >  .../media/v4l/vidioc-queryctrl.rst            |   6 +
> > > > > >  drivers/media/v4l2-core/v4l2-ctrls.c          |  26 +++-
> > > > > >  drivers/staging/media/sunxi/cedrus/cedrus.c   |   6 +
> > > > > >  drivers/staging/media/sunxi/cedrus/cedrus.h   |   1 +
> > > > > >  .../staging/media/sunxi/cedrus/cedrus_dec.c   |   2 +
> > > > > >  .../staging/media/sunxi/cedrus/cedrus_h265.c  |   6 +-
> > > > > >  include/media/hevc-ctrls.h                    |  45 +++++--
> > > > > >  8 files changed, 186 insertions(+), 32 deletions(-)
> > > > > > 
> > > > > > diff --git a/Documentation/userspace-api/media/v4l/ext-ctrls-
> > > > > > codec.rst 
> > b/
> > > > Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > > > > index 00944e97d638..5e6d77e858c0 100644
> > > > > > --- a/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > > > > +++ b/Documentation/userspace-api/media/v4l/ext-ctrls-codec.rst
> > > > > > @@ -3109,6 +3109,15 @@ enum
> > > > > > v4l2_mpeg_video_hevc_size_of_length_field -
> > > > > >      :stub-columns: 0
> > > > > >      :widths:       1 1 2
> > > > > >  
> > > > > > +    * - __u8
> > > > > > +      - ``video_parameter_set_id``
> > > > > > +      - Identifies the VPS for reference by other syntax elements
> > > > > > +    * - __u8
> > > > > > +      - ``seq_parameter_set_id?``
> > > > > > +      - Specifies the value of the vps_video_parameter_set_id of
> > > > > > the 
> > > > active VPS
> > > > > > +    * - __u8
> > > > > > +      - ``chroma_format_idc``
> > > > > > +      - Specifies the chroma sampling relative to the luma sampling
> > > > > 
> > > > > None of these fields seem needed for the Hantro G2 driver,
> > > > > so I suggest you drop them for now.
> > > > > 
> > > > > >      * - __u16
> > > > > >        - ``pic_width_in_luma_samples``
> > > > > >        -
> > > > > > @@ -3172,6 +3181,9 @@ enum v4l2_mpeg_video_hevc_size_of_length_field
> > > > > > -
> > > > > >      * - __u8
> > > > > >        - ``chroma_format_idc``
> > > > > >        -
> > > > > > +    * - __u8
> > > > > > +      - ``num_slices``
> > > > > > +
> > > > > 
> > > > > Not used, but also doesn't seem part of the SPS syntax. If we have to
> > > > > pass the number of slices, we'll need another mechanism.
> > > > > 
> > > > > >       -
> > > > > >      * - __u64
> > > > > >        - ``flags``
> > > > > >        - See :ref:`Sequence Parameter Set Flags <hevc_sps_flags>`
> > > > > > @@ -3231,9 +3243,18 @@ enum
> > > > > > v4l2_mpeg_video_hevc_size_of_length_field -
> > > > > >      :stub-columns: 0
> > > > > >      :widths:       1 1 2
> > > > > >  
> > > > > > +    * - __u8
> > > > > > +      - ``pic_parameter_set_id``
> > > > > > +      - Identifies the PPS for reference by other syntax elements
> > > > > 
> > > > > Not used.
> > > > > 
> > > > > >      * - __u8
> > > > > >        - ``num_extra_slice_header_bits``
> > > > > >        -
> > > > > > +    * - __u8
> > > > > > +      - ``num_ref_idx_l0_default_active_minus1``
> > > > > > +      - Specifies the inferred value of
> > > > > > num_ref_idx_l0_active_minus1
> > > > > > +    * - __u8
> > > > > > +      - ``num_ref_idx_l1_default_active_minus1``
> > > > > > +      - Specifies the inferred value of
> > > > > > num_ref_idx_l1_active_minus1
> > > > > >      * - __s8
> > > > > >        - ``init_qp_minus26``
> > > > > >        -
> > > > > > @@ -3342,6 +3363,12 @@ enum
> > > > > > v4l2_mpeg_video_hevc_size_of_length_field -
> > > > > >      * -
> > > > > > ``V4L2_HEVC_PPS_FLAG_SLICE_SEGMENT_HEADER_EXTENSION_PRESENT``
> > > > > >        - 0x00040000
> > > > > >        -
> > > > > > +    * - ``V4L2_HEVC_PPS_FLAG_DEBLOCKING_FILTER_CONTROL_PRESENT``
> > > > > > +      - 0x00080000
> > > > > > +      -
> > > > > > +    * - ``V4L2_HEVC_PPS_FLAG_UNIFORM_SPACING``
> > > > > > +      - 0x00100000
> > > > > > +      -
> > > > > >  
> > > > > 
> > > > > I suggest to do all the PPS control changes in a separate patch,
> > > > > feels easier to review and cleaner as you can explain the
> > > > > changes with more detail in the commit description.
> > > > > 
> > > > > Looking at the PPS syntax for tiles, I'm wondering if these
> > > > > deserve their own control, which would be used if tiles are enabled,
> > > > > i.e. V4L2_HEVC_PPS_FLAG_TILES_ENABLED is set.
> > > > > 
> > > > >         __u8    
> > num_tile_columns_minus1;                                         
> > > > >         __u8    
> > num_tile_rows_minus1;                                            
> > > > >         __u8    
> > column_width_minus1[20];                                         
> > > > >         __u8    row_height_minus1[22];    
> > > > > 
> > > > > Not something we necessarily have to tackle now.
> > > > > 
> > > > > >  ``V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS (struct)``
> > > > > >      Specifies various slice-specific parameters, especially from
> > > > > > the 
> > NAL 
> > > > unit
> > > > > > @@ -3366,6 +3393,12 @@ enum
> > > > > > v4l2_mpeg_video_hevc_size_of_length_field -
> > > > > >      * - __u32
> > > > > >        - ``data_bit_offset``
> > > > > >        - Offset (in bits) to the video data in the current slice
> > > > > > data.
> > > > > > +    * - __u32
> > > > > > +      - ``slice_segment_addr``
> > > > > > +      - Specifies the address of the first coding tree block in the
> > slice 
> > > > segment
> > > > > 
> > > > > Not used.
> > > > > 
> > > > > > +    * - __u32
> > > > > > +      - ``num_entry_point_offsets``
> > > > > > +      - Specifies the number of entry_point_offset_minus1[ i ]
> > > > > > syntax 
> > > > elements in the slice header
> > > > > 
> > > > > Not used.
> > > > 
> > > > While above two fields may not be used in Hantro, they are for sure
> > > > useful 
> > for 
> > > > Cedrus and RPi4. I would like to keep them, otherwise with such approach
> > HEVC 
> > > > will stay in staging for a long time. I'm still baffled why scaling
> > > > matrix 
> > > > control was dropped. It would fit well in Cedrus and RPi4 driver and
> > > > after 
> > a 
> > > > quick look, it seems that it was used in driver in later patch.
> > > > 
> > > 
> > > I'd like to make sure each modification we are making to the uAPI
> > > goes in the right direction, that is in the direction of moving
> > > the API out of staging.
> > > 
> > > Since reviewing each field is quite hard, and opens some discussions,
> > > I wanted to keep this patchset specific to what's needed for Hantro G2.
> > > 
> > > The Scaling matrix control is certainly a good one, as well as the ones
> > > needed for Cedrus and RPi4. However, I feel it's better to discuss
> > > them in their own "uAPI review" series so we can review all the changes
> > > with an API hat.
> > > 
> > > This way we decouple the Hantro G2 discussion and work from the API work.
> > > 
> > > Also please feel free to submit RFC patches fo Cedrus and RPi4
> > > (API and driver changes). We can certainly start the discussion around
> > > that,
> > > with driver changes in context.
> > 
> > I don't know much about RPi4 driver, only few implementation details, so 
> > you'll have to ping developer who wrote it. Regarding HEVC on Cedrus - it
> > has 
> > one pain point - it needs entry point table which in turn needs support for 
> > variable arrays in order to be feasable AFAIK. I don't plan to develop that.
> > Patches for scaling matrix and segment address were sent a bit more than a 
> > year ago but were turned down because they change control structures (among 
> > other things). Sorry to say, but I work on other things now, so Cedrus will 
> > have to wait. Alternatively, someone can take my patches from LibreELEC, 
> > update and submit them. They are in use for a long time.
> 
> It seems to me that H.265 should be the source for what fields are
> needed in the uapi - not whatever happens to be supported by current
> h/w. The standard defines the list of fields that can occur in the
> parameter sets and headers, and everything that is needed to decode 
> any slice_seqgment_data should be in the upai.  Eventually all those
> fields are going to be needed (they aren't in the standard just to look
> pretty) and given the reluctance I've observed to change the uapi once
> released they should be there from the start.
> 
> Now some hardware requires more fields that aren't in the standard -
> those can (and should) be added in h/w specific extensions.

While this is not wrong, there is obvious bitstream parameters that are not of
the HW accelerator domain. We did the work of deciding what to omit in the H264
uAPI, and we'll do that for HEVC for sure. But now is not the time yet.

Now the point that you seem to be missing is that Benjamin is trying to stage a
driver, not a uAPI. With this third driver, we can have more folks contributing
and testing toward our goal of unstaging the uAPI.

After this, we'll be able to iteratively add features (like scaling list) to
this driver along with the required uAPI and discuss each of these needed API
separatly. Note that previous iteration, the driver implementation for the
scaling list wasn't correct, and for this reason, the uAPI wasn't being
validated.

When this is getting stable enough, there will be a final stage were someone
will have to pick the uAPI and fill in the blank with remaining field that has a
strong probability to be used by later HW WITHIN the supported HEVC profiles et
have defined.

I emphasis on that because there is a lot of HEVC (and H264) extensions we
haven't covered or mentionned, and these will be added with dedicated controls
when we have a use case for it of course. Remember that adding controls is not a
problem, we need these basic control of the basic decoding, and for sure, over
the decade to come we will have to introduce new controls for other stuff we
haven't supported yet.

All this may look like taking forever, but as Jernej states, not everyone can
afford contributing upstream. So unless someone step in, this pace is about what
we can offer, and we strongly believe that it will converge. I do personnally
expect HEVC to take more time as the codec rationales are mostly kept secret,
and our answers burried into a mountain of reference code. It's also, if you
defocus from the film industry, just one CODEC. Notably, for the web, VP9 is
more important, and this will likely puts some shadow on this work.

Meawhile, you are strongly encouraged if you can afford it to open the
discussion around the RPi4 multi-stage HEVC decoder. While doing this, you could
send a patched for staging the RPi4 driver. The benefit would be that we get to
update your drivers when we update the uAPI, saving you times, and also giving
everyone more data on what is strictly needed.

> 
> Regards
> 
> JC
> 
> > Best regards,
> > Jernej
> > 
> > > 
> > > Hope I'm making sense here :)
> > > 
> > > Thanks,
> > > Ezequiel
> > > 
> > > > Best regards,
> > > > Jernej
> > > > 
> > > > > 
> > > > > >      * - __u8
> > > > > >        - ``nal_unit_type``
> > > > > >        -
> > > > > > @@ -3422,28 +3455,20 @@ enum
> > > > > > v4l2_mpeg_video_hevc_size_of_length_field 
> > -
> > > > > >      * - __u8
> > > > > >        - ``pic_struct``
> > > > > >        -
> > > > > > -    * - __u8
> > > > > > -      - ``num_active_dpb_entries``
> > > > > > -      - The number of entries in ``dpb``.
> > > > > 
> > > > > Need to explain in the commit description why this field is moved.
> > > > > 
> > > > > >      * - __u8
> > > > > >        - ``ref_idx_l0[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > >        - The list of L0 reference elements as indices in the DPB.
> > > > > >      * - __u8
> > > > > >        - ``ref_idx_l1[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > >        - The list of L1 reference elements as indices in the DPB.
> > > > > > +    * - __u16
> > > > > > +      - ``short_term_ref_pic_set_size``
> > > > > > +
> > > > > 
> > > > > Not used.
> > > > > 
> > > > > >       -
> > > > > > +    * - __u16
> > > > > > +      - ``long_term_ref_pic_set_size``
> > > > > > +      -
> > > > > 
> > > > > Not used.
> > > > > 
> > > > > >      * - __u8
> > > > > > -      - ``num_rps_poc_st_curr_before``
> > > > > > -      - The number of reference pictures in the short-term set that
> > come 
> > > > before
> > > > > > -        the current frame.
> > > > > 
> > > > > If this matches NumPocStCurrBefore from section 8.3.2 "Decoding
> > > > > process 
> > for 
> > > > reference picture set"
> > > > > then I would document that. And perhaps rename it to 
> > num_poc_st_curr_before.
> > > > > 
> > > > > > -    * - __u8
> > > > > > -      - ``num_rps_poc_st_curr_after``
> > > > > > -      - The number of reference pictures in the short-term set that
> > come 
> > > > after
> > > > > > -        the current frame.
> > > > > 
> > > > > Ditto.
> > > > > 
> > > > > > -    * - __u8
> > > > > > -      - ``num_rps_poc_lt_curr``
> > > > > > -      - The number of reference pictures in the long-term set.
> > > > > 
> > > > > Ditto.
> > > > > 
> > > > > Also, I'd like the changes that move fields from 
> > > > V4L2_CID_MPEG_VIDEO_HEVC_SLICE_PARAMS
> > > > > to the new V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS control, to be in 
> > their
> > > > > patch.
> > > > > 
> > > > > That will allow us to put in the commit description a proper
> > > > > explanation of why are fields being moved. Nothing fancy, simply
> > > > > explaining that these variables come from section 8.3.2
> > > > > "Decoding process for reference picture set", which describes
> > > > > a process invoked once per picture, so they are not per-slice.
> > > > > 
> > > > > > -    * - __u8
> > > > > > -      - ``padding[7]``
> > > > > > +      - ``padding``
> > > > > >        - Applications and drivers must set this to zero.
> > > > > >      * - struct :c:type:`v4l2_hevc_dpb_entry`
> > > > > >        - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > > @@ -3646,3 +3671,74 @@ enum
> > > > > > v4l2_mpeg_video_hevc_size_of_length_field -
> > > > > >      so this has to come from client.
> > > > > >      This is applicable to H264 and valid Range is from 0 to 63.
> > > > > >      Source Rec. ITU-T H.264 (06/2019); G.7.4.1.1, G.8.8.1.
> > > > > > +
> > > > > > +``V4L2_CID_MPEG_VIDEO_HEVC_DECODE_PARAMS (struct)``
> > > > > > +    Specifies various decode parameters, especially the references 
> > picture 
> > > > order
> > > > > > +    count (POC) for all the lists (short, long, before, current, 
> > after) 
> > > > and the
> > > > > > +    number of entries for each of them.
> > > > > > +    These parameters are defined according to :ref:`hevc`.
> > > > > > +    They are described in section 8.3 "Slice decoding process" of
> > > > > > the
> > > > > > +    specification.
> > > > > > +
> > > > > > +.. c:type:: v4l2_ctrl_hevc_decode_params
> > > > > > +
> > > > > > +.. cssclass:: longtable
> > > > > > +
> > > > > > +.. flat-table:: struct v4l2_ctrl_hevc_decode_params
> > > > > > +    :header-rows:  0
> > > > > > +    :stub-columns: 0
> > > > > > +    :widths:       1 1 2
> > > > > > +
> > > > > > +    * - __s32
> > > > > > +      - ``pic_order_cnt_val``
> > > > > > +      -
> > > > > 
> > > > > Can be documented as:
> > > > > 
> > > > > """
> > > > > PicOrderCntVal as described in section 8.3.1 "Decoding process
> > > > > for picture order count" of the specification.
> > > > > """
> > > > > 
> > > > > Note that snake case is used to match the kernel style,
> > > > > but other than that we try to keep the HEVC spec variable
> > > > > names.
> > > > > 
> > > > > > +    * - __u8
> > > > > > +      - ``num_active_dpb_entries``
> > > > > > +      - The number of entries in ``dpb``.
> > > > > > +    * - struct :c:type:`v4l2_hevc_dpb_entry`
> > > > > > +      - ``dpb[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > > +      - The decoded picture buffer, for meta-data about reference 
> > frames.
> > > > > 
> > > > > The DPB is here, but it seems it's also in the slice control?
> > > > > 
> > > > > > +    * - __u8
> > > > > > +      - ``num_rps_poc_st_curr_before``
> > > > > > +      - The number of reference pictures in the short-term set that
> > come 
> > > > before
> > > > > > +        the current frame.
> > > > > > +    * - __u8
> > > > > > +      - ``num_rps_poc_st_curr_after``
> > > > > > +      - The number of reference pictures in the short-term set that
> > come 
> > > > after
> > > > > > +        the current frame.
> > > > > > +    * - __u8
> > > > > > +      - ``num_rps_poc_lt_curr``
> > > > > > +      - The number of reference pictures in the long-term set.
> > > > > > +    * - __u8
> > > > > > +      - ``rps_st_curr_before[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > > +      -
> > > > > > +    * - __u8
> > > > > > +      - ``rps_st_curr_after[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > > +      -
> > > > > > +    * - __u8
> > > > > > +      - ``rps_lt_curr[V4L2_HEVC_DPB_ENTRIES_NUM_MAX]``
> > > > > > +      -
> > > > > 
> > > > > Could you document these as well?
> > > > > 
> > > > > Thanks a lot,
> > > > > Ezequiel
> > > > > 
> > > > > 
> > > > 
> > > > 
> > > 
> > > 
> > > 
> > 
>
Benjamin Gaignard Feb. 26, 2021, 9:47 a.m. UTC | #12
Le 25/02/2021 à 18:55, Ezequiel Garcia a écrit :
> Hi Benjamin,
>
> Thanks for the good work!
> I mostly have two concerns with this implementation,
> the tiled output and the allocation path.
>
> More below.
>
> On Mon, 2021-02-22 at 13:24 +0100, Benjamin Gaignard wrote:
>> Implement all the logic to get G2 hardware decoding HEVC frames.
>> It support up level 5.1 HEVC stream.
>> It doesn't support yet 10 bits formats or scaling feature.
>>
>> Add HANTRO HEVC dedicated control to skip some bits at the beginning
>> of the slice header. That is very specific to this hardware so can't
>> go into uapi structures. Compute the needed value is complex and require
>> information from the stream that only the userland knows so let it
>> provide the correct value to the driver.
>>
>> Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
>> ---
>> version 2:
>> - squash multiple commits in this one.
>> - fix the comments done by Ezequiel about dma_alloc_coherent usage
>> - fix Dan's comments about control copy, reverse the test logic
>> in tile_buffer_reallocate, rework some goto and return cases.
>>
>>   drivers/staging/media/hantro/Makefile         |   2 +
>>   drivers/staging/media/hantro/hantro.h         |  19 +
>>   drivers/staging/media/hantro/hantro_drv.c     |  42 ++
>>   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
>>   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
>>   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
>>   drivers/staging/media/hantro/hantro_hw.h      |  47 ++
>>   7 files changed, 1216 insertions(+)
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
>>   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
>>   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
>>
>>
> [snip]
>> +
>> +static void set_buffers(struct hantro_ctx *ctx)
>> +{
>> +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
>> +       struct hantro_dev *vpu = ctx->dev;
>> +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
>> +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
>> +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
>> +       dma_addr_t src_dma, dst_dma;
>> +       u32 src_len, src_buf_len;
>> +
>> +       src_buf = hantro_get_src_buf(ctx);
>> +       dst_buf = hantro_get_dst_buf(ctx);
>> +
>> +       /* Source (stream) buffer. */
>> +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
>> +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
>> +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
>> +
>> +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
>> +       hantro_reg_write(vpu, hevc_stream_len, src_len);
>> +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
>> +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
>> +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
>> +
>> +       /* Destination (decoded frame) buffer. */
>> +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
>> +
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
>> +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> The way this is done the raster-scan output is the only
> output, and the tiled ouput it kept entirely internal, in hantro_hevc_dec_hw_ctx.ref_bufs.
>
> This means there's no way to expose tiled NV12 in i.MX8M VPU tiled format
> to userspace, which could be desirable for some use cases.
>
> I'm not suggesting to actually expose tiled NV12 in this patch, but to prepare
> the driver to be able to support that easily in the future.
>
> It should be possible to consider this detiling as
> post-processing, adding some code in hantro_postproc.c
> leveraging the existing post-proc support.
>
> IOW, hantro_postproc_ctx.dec_q would hold the tiled output,
> hantro_postproc_enable() writes the raster-scan
> buffer destination address, and so on.

Well the G2 can't use the post-processor so I'm reluctant to do as if it was the case.

To support tiled NV12 for me the solution is to create a hantro_hevc_add_ref_buf() function
to add the destination buffer in hevc_dec->ref_bufs.
We can test destination format to set hevc_out_rs_e bit and HEVC_RASTER_SCAN register.

>
>> +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
>> +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
>> +}
>> +
>> +void hantro_g2_check_idle(struct hantro_dev *vpu)
>> +{
>> +       int i;
>> +
>> +       for (i = 0; i < 3; i++) {
>> +               u32 status;
>> +
>> +               /* Make sure the VPU is idle */
>> +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
>> +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
>> +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
>> +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
>> +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
>> +               }
>> +       }
>> +}
>> +
>> +void hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
>> +{
>> +       struct hantro_dev *vpu = ctx->dev;
>> +
>> +       hantro_g2_check_idle(vpu);
>> +
>> +       /* Prepare HEVC decoder context. */
>> +       if (hantro_hevc_dec_prepare_run(ctx))
>> +               return;
>> +
>> +       /* Configure hardware registers. */
>> +       set_params(ctx);
>> +
>> +       /* set reference pictures */
>> +       if (set_ref(ctx))
>> +               /* something goes wrong */
>> +               hantro_irq_done(vpu, VB2_BUF_STATE_ERROR);
>> +
> I don't think we want to allow the _run() operation to fail like this.
> In other words, I would avoid allocating buffers in the _run() path,
> and doing all allocation at start() time.
>
> That's why hantro_start_streaming() calls hantro_postproc_alloc() if needed.

The error could also be that the reference picture isn't found in the list so
we can't perform the decode operation.
If we do the allocation at start time we will allocate all the reference frames
buffers even if we don't know if we will use them. That could increase a lot
memory usage. Note that buffers allocated once and re-used until the end of the
session.

>
>> +       set_buffers(ctx);
>> +       prepare_tile_info_buffer(ctx);
>> +
>> +       hantro_end_prepare_run(ctx);
>> +
>> +       hantro_reg_write(vpu, hevc_mode, HEVC_DEC_MODE);
>> +       hantro_reg_write(vpu, hevc_clk_gate_e, 1);
>> +
>> +       /* Don't disable output */
>> +       hantro_reg_write(vpu, hevc_out_dis, 0);
>> +
>> +       /* Don't compress buffers */
>> +       hantro_reg_write(vpu, hevc_ref_compress_bypass, 1);
>> +
>> +       /* use NV12 as output format */
>> +       hantro_reg_write(vpu, hevc_tile_e, 0);
> Unless I'm missing something, this ^
>
>> +       hantro_reg_write(vpu, hevc_out_rs_e, 1);
>> +       hantro_reg_write(vpu, hevc_num_tile_rows, 1);
>> +       hantro_reg_write(vpu, hevc_num_tile_cols, 1);
>> +
> and this ^ these shouldn't be here.
>
> HEVC tiles are handled already. See prepare_tile_info_buffer().

You are right, I will remove that, thanks.

Benjamin

> Note that HEVC "tiles" are not related to NV12 "tiled" format.
>
> Thanks!
> Ezequiel
>
>
Ezequiel Garcia Feb. 26, 2021, 9:08 p.m. UTC | #13
On Fri, 2021-02-26 at 10:47 +0100, Benjamin Gaignard wrote:
> 
> Le 25/02/2021 à 18:55, Ezequiel Garcia a écrit :
> > Hi Benjamin,
> > 
> > Thanks for the good work!
> > I mostly have two concerns with this implementation,
> > the tiled output and the allocation path.
> > 
> > More below.
> > 
> > On Mon, 2021-02-22 at 13:24 +0100, Benjamin Gaignard wrote:
> > > Implement all the logic to get G2 hardware decoding HEVC frames.
> > > It support up level 5.1 HEVC stream.
> > > It doesn't support yet 10 bits formats or scaling feature.
> > > 
> > > Add HANTRO HEVC dedicated control to skip some bits at the beginning
> > > of the slice header. That is very specific to this hardware so can't
> > > go into uapi structures. Compute the needed value is complex and require
> > > information from the stream that only the userland knows so let it
> > > provide the correct value to the driver.
> > > 
> > > Signed-off-by: Benjamin Gaignard <benjamin.gaignard@collabora.com>
> > > ---
> > > version 2:
> > > - squash multiple commits in this one.
> > > - fix the comments done by Ezequiel about dma_alloc_coherent usage
> > > - fix Dan's comments about control copy, reverse the test logic
> > > in tile_buffer_reallocate, rework some goto and return cases.
> > > 
> > >   drivers/staging/media/hantro/Makefile         |   2 +
> > >   drivers/staging/media/hantro/hantro.h         |  19 +
> > >   drivers/staging/media/hantro/hantro_drv.c     |  42 ++
> > >   .../staging/media/hantro/hantro_g2_hevc_dec.c | 587 ++++++++++++++++++
> > >   drivers/staging/media/hantro/hantro_g2_regs.h | 198 ++++++
> > >   drivers/staging/media/hantro/hantro_hevc.c    | 321 ++++++++++
> > >   drivers/staging/media/hantro/hantro_hw.h      |  47 ++
> > >   7 files changed, 1216 insertions(+)
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_hevc_dec.c
> > >   create mode 100644 drivers/staging/media/hantro/hantro_g2_regs.h
> > >   create mode 100644 drivers/staging/media/hantro/hantro_hevc.c
> > > 
> > > 
> > [snip]
> > > +
> > > +static void set_buffers(struct hantro_ctx *ctx)
> > > +{
> > > +       struct vb2_v4l2_buffer *src_buf, *dst_buf;
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +       const struct hantro_hevc_dec_ctrls *ctrls = &ctx->hevc_dec.ctrls;
> > > +       const struct v4l2_ctrl_hevc_sps *sps = ctrls->sps;
> > > +       size_t cr_offset = hantro_hevc_chroma_offset(sps);
> > > +       dma_addr_t src_dma, dst_dma;
> > > +       u32 src_len, src_buf_len;
> > > +
> > > +       src_buf = hantro_get_src_buf(ctx);
> > > +       dst_buf = hantro_get_dst_buf(ctx);
> > > +
> > > +       /* Source (stream) buffer. */
> > > +       src_dma = vb2_dma_contig_plane_dma_addr(&src_buf->vb2_buf, 0);
> > > +       src_len = vb2_get_plane_payload(&src_buf->vb2_buf, 0);
> > > +       src_buf_len = vb2_plane_size(&src_buf->vb2_buf, 0);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_ADDR_STR, src_dma);
> > > +       hantro_reg_write(vpu, hevc_stream_len, src_len);
> > > +       hantro_reg_write(vpu, hevc_strm_buffer_len, src_buf_len);
> > > +       hantro_reg_write(vpu, hevc_strm_start_offset, 0);
> > > +       hantro_reg_write(vpu, hevc_write_mvs_e, 1);
> > > +
> > > +       /* Destination (decoded frame) buffer. */
> > > +       dst_dma = hantro_get_dec_buf_addr(ctx, &dst_buf->vb2_buf);
> > > +
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN, dst_dma);
> > > +       hantro_write_addr(vpu, HEVC_RASTER_SCAN_CHR, dst_dma + cr_offset);
> > The way this is done the raster-scan output is the only
> > output, and the tiled ouput it kept entirely internal, in hantro_hevc_dec_hw_ctx.ref_bufs.
> > 
> > This means there's no way to expose tiled NV12 in i.MX8M VPU tiled format
> > to userspace, which could be desirable for some use cases.
> > 
> > I'm not suggesting to actually expose tiled NV12 in this patch, but to prepare
> > the driver to be able to support that easily in the future.
> > 
> > It should be possible to consider this detiling as
> > post-processing, adding some code in hantro_postproc.c
> > leveraging the existing post-proc support.
> > 
> > IOW, hantro_postproc_ctx.dec_q would hold the tiled output,
> > hantro_postproc_enable() writes the raster-scan
> > buffer destination address, and so on.
> 
> Well the G2 can't use the post-processor so I'm reluctant to do as if it was the case.
> 

I don't really see a big difference between G1 post-processing, and G2 raster-scan
detiling. As a user of the device, both features offer a post-processing.

In any case, it's not a big deal, it's fine as-is if you are inclined to this implementation.

> To support tiled NV12 for me the solution is to create a hantro_hevc_add_ref_buf() function
> to add the destination buffer in hevc_dec->ref_bufs.
> We can test destination format to set hevc_out_rs_e bit and HEVC_RASTER_SCAN register.
> 

Is it worth adding some boilerplate and/or refactoring things a bit, so it's
easier to expose tiled NV12 once that format is exposable?

> > 
> > > +       hantro_write_addr(vpu, HEVC_ADDR_TILE_SIZE, ctx->hevc_dec.tile_sizes.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_FILTER, ctx->hevc_dec.tile_filter.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_SAO, ctx->hevc_dec.tile_sao.dma);
> > > +       hantro_write_addr(vpu, HEVC_TILE_BSD, ctx->hevc_dec.tile_bsd.dma);
> > > +}
> > > +
> > > +void hantro_g2_check_idle(struct hantro_dev *vpu)
> > > +{
> > > +       int i;
> > > +
> > > +       for (i = 0; i < 3; i++) {
> > > +               u32 status;
> > > +
> > > +               /* Make sure the VPU is idle */
> > > +               status = vdpu_read(vpu, HEVC_REG_INTERRUPT);
> > > +               if (status & HEVC_REG_INTERRUPT_DEC_E) {
> > > +                       pr_warn("%s: still enabled!!! resetting.\n", __func__);
> > > +                       status |= HEVC_REG_INTERRUPT_DEC_ABORT_E | HEVC_REG_INTERRUPT_DEC_IRQ_DIS;
> > > +                       vdpu_write(vpu, status, HEVC_REG_INTERRUPT);
> > > +               }
> > > +       }
> > > +}
> > > +
> > > +void hantro_g2_hevc_dec_run(struct hantro_ctx *ctx)
> > > +{
> > > +       struct hantro_dev *vpu = ctx->dev;
> > > +
> > > +       hantro_g2_check_idle(vpu);
> > > +
> > > +       /* Prepare HEVC decoder context. */
> > > +       if (hantro_hevc_dec_prepare_run(ctx))
> > > +               return;
> > > +
> > > +       /* Configure hardware registers. */
> > > +       set_params(ctx);
> > > +
> > > +       /* set reference pictures */
> > > +       if (set_ref(ctx))
> > > +               /* something goes wrong */
> > > +               hantro_irq_done(vpu, VB2_BUF_STATE_ERROR);
> > > +
> > I don't think we want to allow the _run() operation to fail like this.
> > In other words, I would avoid allocating buffers in the _run() path,
> > and doing all allocation at start() time.
> > 
> > That's why hantro_start_streaming() calls hantro_postproc_alloc() if needed.
> 
> The error could also be that the reference picture isn't found in the list so
> we can't perform the decode operation.
> If we do the allocation at start time we will allocate all the reference frames
> buffers even if we don't know if we will use them. That could increase a lot
> memory usage. Note that buffers allocated once and re-used until the end of the
> session.
> 

Improving memory usage is a good point. I wonder if it makes sense to release
some buffers under some condition, instead of keeping then allocated (and unused)
until the end of the session.

(BTW, the same can be done with the post-processor, as it's basically
the same process. It would be interesting to explore.)

I'm still unsure why we'd call hantro_irq_done() (which cancels the timeout
handler and returns buffers to userspace), but we would keep programming
the operation. 

Thanks,
Ezequiel