[v1,00/14] Using component framework to support multi hardware decode

Message ID	20210707062157.21176-1-yunfei.dong@mediatek.com
From: Yunfei Dong <yunfei.dong@mediatek.com>
To: Yunfei Dong <yunfei.dong@mediatek.com>, Alexandre Courbot <acourbot@chromium.org>, Hans Verkuil <hverkuil-cisco@xs4all.nl>, Tzung-Bi Shih <tzungbi@chromium.org>, Tiffany Lin <tiffany.lin@mediatek.com>, Andrew-CT Chen <andrew-ct.chen@mediatek.com>, Mauro Carvalho Chehab <mchehab@kernel.org>, Rob Herring <robh+dt@kernel.org>, Matthias Brugger <matthias.bgg@gmail.com>, Tomasz Figa <tfiga@google.com>
CC: Hsin-Yi Wang <hsinyi@chromium.org>, Fritz Koenig <frkoenig@chromium.org>, Irui Wang <irui.wang@mediatek.com>, <linux-media@vger.kernel.org>, <devicetree@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <linux-arm-kernel@lists.infradead.org>, <srv_heupstream@mediatek.com>, <linux-mediatek@lists.infradead.org>, <Project_Global_Chrome_Upstream_Group@mediatek.com>
Subject: [PATCH v1, 00/14] Using component framework to support multi hardware decode
Date: Wed, 7 Jul 2021 14:21:43 +0800

Yunfei Dong July 7, 2021, 6:21 a.m. UTC

This series adds support for multi hardware decode to mtk-vcodec. First, it
adds the component framework to manage the information of each hardware
block: interrupt, clock, register bases and power. Second, it adds a core
thread to handle core hardware messages, together with a msg queue that lets
the different hardware blocks share messages. Lastly, because the
architecture differs between specs, it uses the spec type to separate them.

This series has been tested with both MT8183 and MT8173. Decoding was working
for both chips.

Patches 1-2 rewrite the interfaces for getting register bases and for power
on/off.

Patches 3-5 add the component framework to support multi hardware.

Patches 6-14 add interfaces to support core hardware.
----
This series depends on "media: mtk-vcodec: support for MT8183 decoder"[1].

Multi hardware decode is based on the stateless decoder, and MT8183 is the
first chip to add a stateless decoder, so applying this series without [1]
will cause conflicts. Please accept this series together with [1].

[1]https://lore.kernel.org/patchwork/project/lkml/list/?series=507084
----

Yunfei Dong (14):
  media: mtk-vcodec: Get numbers of register bases from DT
  media: mtk-vcodec: Refactor vcodec pm interface
  media: mtk-vcodec: Use component framework to manage each hardware
    information
  dt-bindings: media: mtk-vcodec: Separate video encoder and decoder
    dt-bindings
  media: mtk-vcodec: Use pure single core for MT8183
  media: mtk-vcodec: Add irq interface for core hardware
  media: mtk-vcodec: Add msg queue feature for lat and core architecture
  media: mtk-vcodec: Generalize power and clock on/off interfaces
  media: mtk-vcodec: Add new interface to lock different hardware
  media: mtk-vcodec: Add core thread
  media: mtk-vcodec: Support 34bits dma address for vdec
  dt-bindings: media: mtk-vcodec: Adds decoder dt-bindings for mt8192
  media: mtk-vcodec: Add core dec and dec end ipi msg
  media: mtk-vcodec: Use codec type to separate different hardware

 .../media/mediatek-vcodec-comp-decoder.txt    |  93 ++++++
 .../media/mediatek-vcodec-decoder.txt         | 169 +++++++++++
 .../media/mediatek-vcodec-encoder.txt         |  73 +++++
 drivers/media/platform/mtk-vcodec/Makefile    |   2 +
 .../platform/mtk-vcodec/mtk_vcodec_dec.c      |   4 +-
 .../platform/mtk-vcodec/mtk_vcodec_dec.h      |   4 +
 .../platform/mtk-vcodec/mtk_vcodec_dec_drv.c  | 286 +++++++++++++++---
 .../platform/mtk-vcodec/mtk_vcodec_dec_hw.c   | 193 ++++++++++++
 .../platform/mtk-vcodec/mtk_vcodec_dec_hw.h   |  51 ++++
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   |  98 ++++--
 .../platform/mtk-vcodec/mtk_vcodec_dec_pm.h   |  13 +-
 .../mtk-vcodec/mtk_vcodec_dec_stateful.c      |   1 +
 .../mtk-vcodec/mtk_vcodec_dec_stateless.c     |   1 +
 .../platform/mtk-vcodec/mtk_vcodec_drv.h      |  69 ++++-
 .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   |   1 -
 .../platform/mtk-vcodec/mtk_vcodec_intr.c     |  30 ++
 .../platform/mtk-vcodec/mtk_vcodec_intr.h     |   2 +
 .../platform/mtk-vcodec/mtk_vcodec_util.c     |  87 +++++-
 .../platform/mtk-vcodec/mtk_vcodec_util.h     |   8 +-
 .../media/platform/mtk-vcodec/vdec_drv_if.c   |  21 +-
 .../media/platform/mtk-vcodec/vdec_ipi_msg.h  |  16 +-
 .../platform/mtk-vcodec/vdec_msg_queue.c      | 266 ++++++++++++++++
 .../platform/mtk-vcodec/vdec_msg_queue.h      | 136 +++++++++
 .../media/platform/mtk-vcodec/vdec_vpu_if.c   |  46 ++-
 .../media/platform/mtk-vcodec/vdec_vpu_if.h   |  22 ++
 25 files changed, 1582 insertions(+), 110 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/media/mediatek-vcodec-comp-decoder.txt
 create mode 100644 Documentation/devicetree/bindings/media/mediatek-vcodec-decoder.txt
 create mode 100644 Documentation/devicetree/bindings/media/mediatek-vcodec-encoder.txt
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.c
 create mode 100644 drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_hw.h
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec_msg_queue.c
 create mode 100644 drivers/media/platform/mtk-vcodec/vdec_msg_queue.h

Tzung-Bi Shih July 8, 2021, 10:04 a.m. UTC | #1

On Wed, Jul 7, 2021 at 2:22 PM Yunfei Dong <yunfei.dong@mediatek.com> wrote:
> +#include "mtk_vcodec_util.h"
> +
>  #include <media/videobuf2-core.h>
> +#include <media/v4l2-ctrls.h>
>  #include <media/v4l2-mem2mem.h>
The changes look like independent ones.  If any .c files need the
headers, include them in the .c files instead of here.

> +               comp_node = of_find_compatible_node(NULL, NULL,
> +                       mtk_vdec_drv_ids[i].compatible);
> +               if (!comp_node)
> +                       continue;
> +
> +               if (!of_device_is_available(comp_node)) {
> +                       of_node_put(comp_node);
> +                       dev_err(&pdev->dev, "Fail to get MMSYS node\n");
> +                       continue;
> +               }
> +
> +               of_id = of_match_node(mtk_vdec_drv_ids, comp_node);
> +               if (!of_id) {
Doesn't it need to call of_node_put(comp_node)?

> +static int mtk_vcodec_init_master(struct mtk_vcodec_dev *dev)
> +{
> +       struct platform_device *pdev = dev->plat_dev;
> +       struct component_match *match;
> +       int ret = 0;
ret doesn't need to be initialized.

> +       match = mtk_vcodec_match_add(dev);
> +       if (IS_ERR_OR_NULL(match))
> +               return -EINVAL;
> +
> +       platform_set_drvdata(pdev, dev);
Why does platform_set_drvdata() need to be here?  The function neither
creates pdev nor dev.

> +static int mtk_vcodec_init_dec_params(struct mtk_vcodec_dev *dev)
> +{
> +       struct platform_device *pdev = dev->plat_dev;
> +       struct resource *res;
> +       int ret = 0;
ret doesn't need to be initialized.

> +       if (!dev->is_support_comp) {
> +               res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
> +               if (res == NULL) {
!res, res is not used BTW.

> +               dev->dec_irq = platform_get_irq(dev->plat_dev, 0);
Check return value.

> +               irq_set_status_flags(dev->dec_irq, IRQ_NOAUTOEN);
> +               ret = devm_request_irq(&pdev->dev, dev->dec_irq,
> +                               mtk_vcodec_dec_irq_handler, 0, pdev->name, dev);
> +               if (ret) {
> +                       dev_err(&pdev->dev, "failed to install dev->dec_irq %d (%d)",
> +                               dev->dec_irq,
> +                               ret);
Can join to previous line.

> +       if (!of_find_compatible_node(NULL, NULL, "mediatek,mtk-vcodec-core"))
> +               dev->is_support_comp = false;
> +       else
> +               dev->is_support_comp = true;
Need a DT binding document patch for the attribute.

Does it really need to call of_find_compatible_node() for parsing an
attribute?  If so, it needs to call of_node_put() afterward.

> @@ -319,7 +434,6 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
>                 MTK_VCODEC_DEC_NAME);
>         video_set_drvdata(vfd_dec, dev);
>         dev->vfd_dec = vfd_dec;
> -       platform_set_drvdata(pdev, dev);
Why does it need to remove platform_set_drvdata()?

> @@ -370,8 +484,17 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
>         mtk_v4l2_debug(0, "decoder registered as /dev/video%d",
>                 vfd_dec->num);
>
> -       return 0;
> +       if (dev->is_support_comp) {
> +               ret = mtk_vcodec_init_master(dev);
> +               if (ret < 0)
> +                       goto err_component_match;
> +       } else {
> +               platform_set_drvdata(pdev, dev);
> +       }
mtk_vcodec_init_master() also calls platform_set_drvdata().  What is
the difference?

> +       /* clear interrupt */
> +       writel((readl(vdec_misc_addr) | VDEC_IRQ_CFG), vdec_misc_addr);
> +       writel((readl(vdec_misc_addr) & ~VDEC_IRQ_CLR), vdec_misc_addr);
Can remove one pair of parentheses.

> +static int mtk_vdec_comp_probe(struct platform_device *pdev)
> +{
> +       struct mtk_vdec_comp_dev *dev;
> +       int ret;
> +
> +       dev = devm_kzalloc(&pdev->dev, sizeof(*dev), GFP_KERNEL);
> +       if (!dev)
> +               return -ENOMEM;
> +
> +       dev->plat_dev = pdev;
> +       spin_lock_init(&dev->irqlock);
> +
> +       ret = mtk_vcodec_init_dec_pm(dev->plat_dev, &dev->pm);
To be concise, use pdev.

> +       dev->reg_base[VDEC_COMP_MISC] =
> +               devm_platform_ioremap_resource(pdev, 0);
The index 0 is confusing, given that:
VDEC_COMP_SYS = 0
VDEC_COMP_MISC = 1

> +#ifndef _MTK_VCODEC_DEC_HW_H_
> +#define _MTK_VCODEC_DEC_HW_H_
> +
> +#include <linux/component.h>
Does it really need to include component.h?

> +/**
> + * enum mtk_comp_hw_reg_idx - component register base index
> + */
> +enum mtk_comp_hw_reg_idx {
> +       VDEC_COMP_SYS,
> +       VDEC_COMP_MISC,
> +       NUM_MAX_COMP_VCODEC_REG_BASE
The name is suboptimal.  How about VDEC_COMP_MAX or VDEC_COMP_LAST?

> +#include <linux/component.h>
> +#include <linux/io.h>
>  #include <linux/platform_device.h>
>  #include <linux/videodev2.h>
>  #include <media/v4l2-ctrls.h>
The newly added code in the file doesn't look like it needs anything
from component.h and io.h.

> @@ -404,6 +422,7 @@ struct mtk_vcodec_enc_pdata {
>   *
>   * @fw_handler: used to communicate with the firmware.
>   * @id_counter: used to identify current opened instance
> + * @is_support_comp: 1: using compoent framework, 0: not support
is_support_comp is a boolean.  Use true and false instead of 1 and 0.

> @@ -422,6 +441,10 @@ struct mtk_vcodec_enc_pdata {
>   * @pm: power management control
>   * @dec_capability: used to identify decode capability, ex: 4k
>   * @enc_capability: used to identify encode capability
> + *
> + * comp_dev: component hardware device
> + * component_node: component node
> + * comp_idx: component index
To be neat, add the missing "@" before each symbol name.

Tzung-Bi Shih July 9, 2021, 7:59 a.m. UTC | #2

On Wed, Jul 7, 2021 at 2:22 PM Yunfei Dong <yunfei.dong@mediatek.com> wrote:
> +static int mtk_vcodec_get_hw_count(struct mtk_vcodec_dev *dev)
> +{
> +       if (dev->vdec_pdata->hw_arch == MTK_VDEC_PURE_SINGLE_CORE)
> +               return 1;
> +       else if (dev->vdec_pdata->hw_arch == MTK_VDEC_LAT_SINGLE_CORE)
> +               return 2;
> +       else
> +               return 0;
> +}
Using switch ... case would be easier to read.

Would it be better to use some macro or enums for the magic numbers?

> @@ -113,8 +114,7 @@ static int mtk_vdec_comp_init_irq(struct mtk_vdec_comp_dev *dev)
>         }
>
>         ret = devm_request_irq(&pdev->dev, dev->dec_irq,
> -                               mtk_vdec_comp_irq_handler,
> -                               0, pdev->name, dev);
> +                               mtk_vdec_comp_irq_handler, 0, pdev->name, dev);
The change is irrelevant to this patch.

> @@ -154,8 +154,10 @@ static int mtk_vdec_comp_probe(struct platform_device *pdev)
>                 dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(34));
>
>         ret = mtk_vdec_comp_init_irq(dev);
> -       if (ret)
> +       if (ret) {
> +               dev_err(&pdev->dev, "Failed to register irq handler.\n");
>                 goto err;
> +       }
The change shouldn't be in this patch.  Instead, it belongs in the patch
that adds the mtk_vdec_comp_init_irq() invocation.

> +int mtk_vcodec_wait_for_comp_done_ctx(struct mtk_vcodec_ctx  *ctx,
Remove the extra space before "*ctx".

Tzung-Bi Shih July 9, 2021, 9:39 a.m. UTC | #3

On Wed, Jul 7, 2021 at 2:22 PM Yunfei Dong <yunfei.dong@mediatek.com> wrote:
> @@ -464,6 +469,11 @@ struct mtk_vcodec_enc_pdata {
>   * comp_dev: component hardware device
>   * component_node: component node
>   * comp_idx: component index
> + *
> + * core_read: Wait queue used to signalize when core get useful lat buffer
> + * core_queue: List of V4L2 lat_buf
To be neat, replace "Wait" with "wait" and "List" with "list".

> +int vdec_msg_queue_init(
> +       struct mtk_vcodec_ctx *ctx,
> +       struct vdec_msg_queue *msg_queue,
> +       core_decode_cb_t core_decode,
> +       int private_size)
> +{
> +       struct vdec_lat_buf *lat_buf;
> +       int i, err;
> +
> +       init_waitqueue_head(&msg_queue->lat_read);
> +       INIT_LIST_HEAD(&msg_queue->lat_queue);
> +       spin_lock_init(&msg_queue->lat_lock);
> +       msg_queue->num_lat = 0;
> +
> +       msg_queue->wdma_addr.size = vde_msg_queue_get_trans_size(
> +               ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> +
> +       err = mtk_vcodec_mem_alloc(ctx, &msg_queue->wdma_addr);
> +       if (err) {
> +               mtk_v4l2_err("failed to allocate wdma_addr buf");
> +               return -ENOMEM;
> +       }
> +       msg_queue->wdma_rptr_addr = msg_queue->wdma_addr.dma_addr;
> +       msg_queue->wdma_wptr_addr = msg_queue->wdma_addr.dma_addr;
> +
> +       for (i = 0; i < NUM_BUFFER_COUNT; i++) {
> +               lat_buf = &msg_queue->lat_buf[i];
> +
> +               lat_buf->wdma_err_addr.size = VDEC_ERR_MAP_SZ_AVC;
> +               err = mtk_vcodec_mem_alloc(ctx, &lat_buf->wdma_err_addr);
> +               if (err) {
> +                       mtk_v4l2_err("failed to allocate wdma_err_addr buf[%d]", i);
> +                       return -ENOMEM;
> +               }
> +
> +               lat_buf->slice_bc_addr.size = VDEC_LAT_SLICE_HEADER_SZ;
> +               err = mtk_vcodec_mem_alloc(ctx, &lat_buf->slice_bc_addr);
> +               if (err) {
> +                       mtk_v4l2_err("failed to allocate wdma_addr buf[%d]", i);
> +                       return -ENOMEM;
> +               }
> +
> +               lat_buf->private_data = kzalloc(private_size, GFP_KERNEL);
> +               if (!lat_buf->private_data) {
> +                       mtk_v4l2_err("failed to allocate private_data[%d]", i);
> +                       return -ENOMEM;
> +               }
> +
> +               lat_buf->ctx = ctx;
> +               lat_buf->core_decode = core_decode;
> +               vdec_msg_queue_buf_to_lat(lat_buf);
> +       }
Doesn't it need to call mtk_vcodec_mem_free() and kfree() for any failure paths?

> +struct vdec_lat_buf *vdec_msg_queue_get_core_buf(
> +       struct mtk_vcodec_dev *dev)
> +{
> +       struct vdec_lat_buf *buf;
> +       int ret;
> +
> +       spin_lock(&dev->core_lock);
> +       if (list_empty(&dev->core_queue)) {
> +               mtk_v4l2_debug(3, "core queue is NULL, num_core = %d", dev->num_core);
> +               spin_unlock(&dev->core_lock);
> +               ret = wait_event_freezable(dev->core_read,
> +                       !list_empty(&dev->core_queue));
> +               if (ret)
> +                       return NULL;
Should be !ret?

> +void vdec_msg_queue_buf_to_core(struct mtk_vcodec_dev *dev,
> +       struct vdec_lat_buf *buf)
> +{
> +       spin_lock(&dev->core_lock);
> +       list_add_tail(&buf->core_list, &dev->core_queue);
> +       dev->num_core++;
> +       wake_up_all(&dev->core_read);
> +       mtk_v4l2_debug(3, "queu buf addr: (0x%p)", buf);
Typo.

> +bool vdec_msg_queue_wait_lat_buf_full(struct vdec_msg_queue *msg_queue)
> +{
> +       long timeout_jiff;
> +       int ret, i;
> +
> +       for (i = 0; i < NUM_BUFFER_COUNT + 2; i++) {
> +              timeout_jiff = msecs_to_jiffies(1000);
> +              ret = wait_event_timeout(msg_queue->lat_read,
> +                    msg_queue->num_lat == NUM_BUFFER_COUNT, timeout_jiff);
> +              if (ret) {
> +                     mtk_v4l2_debug(3, "success to get lat buf: %d",
> +                            msg_queue->num_lat);
> +                     return true;
> +              }
> +       }
Why does it need the loop?  i is unused.

> +void vdec_msg_queue_deinit(
> +       struct mtk_vcodec_ctx *ctx,
> +       struct vdec_msg_queue *msg_queue)
> +{
> +       struct vdec_lat_buf *lat_buf;
> +       struct mtk_vcodec_mem *mem;
> +       int i;
> +
> +       mem = &msg_queue->wdma_addr;
> +       if (mem->va)
> +               mtk_vcodec_mem_free(ctx, mem);
> +       for (i = 0; i < NUM_BUFFER_COUNT; i++) {
> +               lat_buf = &msg_queue->lat_buf[i];
> +
> +               mem = &lat_buf->wdma_err_addr;
> +               if (mem->va)
> +                       mtk_vcodec_mem_free(ctx, mem);
> +
> +               mem = &lat_buf->slice_bc_addr;
> +               if (mem->va)
> +                       mtk_vcodec_mem_free(ctx, mem);
> +
> +               if (lat_buf->private_data)
> +                       kfree(lat_buf->private_data);
> +       }
> +
> +       msg_queue->init_done = false;
Have no idea what init_done does in the code.  It is not included in
any branch condition.

> +/**
> + * vdec_msg_queue_init - init lat buffer information.
> + * @ctx: v4l2 ctx
> + * @msg_queue: used to store the lat buffer information
> + * @core_decode: core decode callback for each codec
> + * @private_size: the private data size used to share with core
> + */
> +int vdec_msg_queue_init(
> +       struct mtk_vcodec_ctx *ctx,
> +       struct vdec_msg_queue *msg_queue,
> +       core_decode_cb_t core_decode,
> +       int private_size);
Would prefer to have *msg_queue as the first argument (also applies to
all operators of vdec_msg_queue).

> +/**
> + * vdec_msg_queue_get_core_buf - get used core buffer for lat decode.
> + * @dev: mtk vcodec device
> + */
> +struct vdec_lat_buf *vdec_msg_queue_get_core_buf(
> +       struct mtk_vcodec_dev *dev);
This is weird: vdec_msg_queue's operator but manipulating mtk_vcodec_dev?

> +
> +/**
> + * vdec_msg_queue_buf_to_core - queue buf to the core for core decode.
> + * @dev: mtk vcodec device
> + * @buf: current lat buffer
> + */
> +void vdec_msg_queue_buf_to_core(struct mtk_vcodec_dev *dev,
> +       struct vdec_lat_buf *buf);
Also weird.

> +/**
> + * vdec_msg_queue_buf_to_lat - queue buf to lat for lat decode.
> + * @buf: current lat buffer
> + */
> +void vdec_msg_queue_buf_to_lat(struct vdec_lat_buf *buf);
It should at least accept a struct vdec_msg_queue argument (or which
msg queue should the buf put into?).

> +/**
> + * vdec_msg_queue_update_ube_rptr - used to updata the ube read point.
Typo.

> +/**
> + * vdec_msg_queue_update_ube_wptr - used to updata the ube write point.
Typo.

> +/**
> + * vdec_msg_queue_deinit - deinit lat buffer information.
> + * @ctx: v4l2 ctx
> + * @msg_queue: used to store the lat buffer information
> + */
> +void vdec_msg_queue_deinit(
> +       struct mtk_vcodec_ctx *ctx,
> +       struct vdec_msg_queue *msg_queue);
Would prefer to have *msg_queue as the first argument.


The position of struct vdec_msg_queue is weird.  It looks like the msg
queue is only for struct vdec_lat_buf.  If so, would vdec_msg_queue be
better named vdec_lat_queue or something similar?

It shouldn't touch the core queue in mtk_vcodec_dev anyway.  Is it
possible to generalize the queue-related code for both lat and core
queues?
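
One possible shape for such a generalized queue, sketched in user-space C.
All names here (vdec_buf_queue and its helpers) are hypothetical and do not
come from the patch; the "next" pointer stands in for struct list_head, and
the spinlock and wait queue that the driver would need are left out.  The
only point is that the same operators, taking the queue as the first
argument, can serve both the per-instance lat queue and the device-wide core
queue.

```c
#include <stddef.h>

struct vdec_lat_buf {
	int id;				/* placeholder payload */
	struct vdec_lat_buf *next;	/* stand-in for struct list_head */
};

/*
 * Hypothetical generic queue: one instance per decoder context for the
 * lat side, one instance per device for the core side.
 */
struct vdec_buf_queue {
	struct vdec_lat_buf *head;
	struct vdec_lat_buf *tail;
	int num;
};

static void vdec_buf_queue_init(struct vdec_buf_queue *q)
{
	q->head = NULL;
	q->tail = NULL;
	q->num = 0;
}

/* Append buf to the tail of q. */
static void vdec_buf_queue_put(struct vdec_buf_queue *q,
			       struct vdec_lat_buf *buf)
{
	buf->next = NULL;
	if (q->tail)
		q->tail->next = buf;
	else
		q->head = buf;
	q->tail = buf;
	q->num++;
}

/* Remove and return the buffer at the head of q, or NULL if q is empty. */
static struct vdec_lat_buf *vdec_buf_queue_get(struct vdec_buf_queue *q)
{
	struct vdec_lat_buf *buf = q->head;

	if (!buf)
		return NULL;

	q->head = buf->next;
	if (!q->head)
		q->tail = NULL;
	buf->next = NULL;
	q->num--;
	return buf;
}
```

With this layout, moving a buffer from lat to core is a get on one queue
followed by a put on the other, and no operator needs to reach into
struct mtk_vcodec_dev.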

Yunfei Dong July 12, 2021, 7:27 a.m. UTC | #4

Hi Tzung-Bi,

Thanks for your detailed feedback.
I have added a reply to each of your comments.

On Fri, 2021-07-09 at 17:39 +0800, Tzung-Bi Shih wrote:
> On Wed, Jul 7, 2021 at 2:22 PM Yunfei Dong <yunfei.dong@mediatek.com> wrote:
> > @@ -464,6 +469,11 @@ struct mtk_vcodec_enc_pdata {
> >   * comp_dev: component hardware device
> >   * component_node: component node
> >   * comp_idx: component index
> > + *
> > + * core_read: Wait queue used to signalize when core get useful lat buffer
> > + * core_queue: List of V4L2 lat_buf
> To be neat, replace "Wait" with "wait" and "List" with "list".
Will fix.
> > +int vdec_msg_queue_init(
> > +       struct mtk_vcodec_ctx *ctx,
> > +       struct vdec_msg_queue *msg_queue,
> > +       core_decode_cb_t core_decode,
> > +       int private_size)
> > +{
> > +       struct vdec_lat_buf *lat_buf;
> > +       int i, err;
> > +
> > +       init_waitqueue_head(&msg_queue->lat_read);
> > +       INIT_LIST_HEAD(&msg_queue->lat_queue);
> > +       spin_lock_init(&msg_queue->lat_lock);
> > +       msg_queue->num_lat = 0;
> > +
> > +       msg_queue->wdma_addr.size = vde_msg_queue_get_trans_size(
> > +               ctx->picinfo.buf_w, ctx->picinfo.buf_h);
> > +
> > +       err = mtk_vcodec_mem_alloc(ctx, &msg_queue->wdma_addr);
> > +       if (err) {
> > +               mtk_v4l2_err("failed to allocate wdma_addr buf");
> > +               return -ENOMEM;
> > +       }
> > +       msg_queue->wdma_rptr_addr = msg_queue->wdma_addr.dma_addr;
> > +       msg_queue->wdma_wptr_addr = msg_queue->wdma_addr.dma_addr;
> > +
> > +       for (i = 0; i < NUM_BUFFER_COUNT; i++) {
> > +               lat_buf = &msg_queue->lat_buf[i];
> > +
> > +               lat_buf->wdma_err_addr.size = VDEC_ERR_MAP_SZ_AVC;
> > +               err = mtk_vcodec_mem_alloc(ctx, &lat_buf->wdma_err_addr);
> > +               if (err) {
> > +                       mtk_v4l2_err("failed to allocate wdma_err_addr buf[%d]", i);
> > +                       return -ENOMEM;
> > +               }
> > +
> > +               lat_buf->slice_bc_addr.size = VDEC_LAT_SLICE_HEADER_SZ;
> > +               err = mtk_vcodec_mem_alloc(ctx, &lat_buf->slice_bc_addr);
> > +               if (err) {
> > +                       mtk_v4l2_err("failed to allocate wdma_addr buf[%d]", i);
> > +                       return -ENOMEM;
> > +               }
> > +
> > +               lat_buf->private_data = kzalloc(private_size, GFP_KERNEL);
> > +               if (!lat_buf->private_data) {
> > +                       mtk_v4l2_err("failed to allocate private_data[%d]", i);
> > +                       return -ENOMEM;
> > +               }
> > +
> > +               lat_buf->ctx = ctx;
> > +               lat_buf->core_decode = core_decode;
> > +               vdec_msg_queue_buf_to_lat(lat_buf);
> > +       }
> Doesn't it need to call mtk_vcodec_mem_free() and kfree() for any failure paths?
When memory allocation fails, the deinit function is called automatically and frees all allocated memory.
> > +struct vdec_lat_buf *vdec_msg_queue_get_core_buf(
> > +       struct mtk_vcodec_dev *dev)
> > +{
> > +       struct vdec_lat_buf *buf;
> > +       int ret;
> > +
> > +       spin_lock(&dev->core_lock);
> > +       if (list_empty(&dev->core_queue)) {
> > +               mtk_v4l2_debug(3, "core queue is NULL, num_core = %d", dev->num_core);
> > +               spin_unlock(&dev->core_lock);
> > +               ret = wait_event_freezable(dev->core_read,
> > +                       !list_empty(&dev->core_queue));
> > +               if (ret)
> > +                       return NULL;
> Should be !ret?
According to the definition, when the condition is true, the return value is 0.
#define wait_event_freezable(wq_head, condition)				\
({										\
	int __ret = 0;								\
	might_sleep();								\
	if (!(condition))							\
		__ret = __wait_event_freezable(wq_head, condition);		\
	__ret;									\
}) 
> > +void vdec_msg_queue_buf_to_core(struct mtk_vcodec_dev *dev,
> > +       struct vdec_lat_buf *buf)
> > +{
> > +       spin_lock(&dev->core_lock);
> > +       list_add_tail(&buf->core_list, &dev->core_queue);
> > +       dev->num_core++;
> > +       wake_up_all(&dev->core_read);
> > +       mtk_v4l2_debug(3, "queu buf addr: (0x%p)", buf);
> Typo.
> 
> > +bool vdec_msg_queue_wait_lat_buf_full(struct vdec_msg_queue *msg_queue)
> > +{
> > +       long timeout_jiff;
> > +       int ret, i;
> > +
> > +       for (i = 0; i < NUM_BUFFER_COUNT + 2; i++) {
> > +              timeout_jiff = msecs_to_jiffies(1000);
> > +              ret = wait_event_timeout(msg_queue->lat_read,
> > +                    msg_queue->num_lat == NUM_BUFFER_COUNT, timeout_jiff);
> > +              if (ret) {
> > +                     mtk_v4l2_debug(3, "success to get lat buf: %d",
> > +                            msg_queue->num_lat);
> > +                     return true;
> > +              }
> > +       }
> Why does it need the loop?  i is unused.
The core may hit a decode timeout, so we need to wait until all core buffers
have been processed completely.
> > +void vdec_msg_queue_deinit(
> > +       struct mtk_vcodec_ctx *ctx,
> > +       struct vdec_msg_queue *msg_queue)
> > +{
> > +       struct vdec_lat_buf *lat_buf;
> > +       struct mtk_vcodec_mem *mem;
> > +       int i;
> > +
> > +       mem = &msg_queue->wdma_addr;
> > +       if (mem->va)
> > +               mtk_vcodec_mem_free(ctx, mem);
> > +       for (i = 0; i < NUM_BUFFER_COUNT; i++) {
> > +               lat_buf = &msg_queue->lat_buf[i];
> > +
> > +               mem = &lat_buf->wdma_err_addr;
> > +               if (mem->va)
> > +                       mtk_vcodec_mem_free(ctx, mem);
> > +
> > +               mem = &lat_buf->slice_bc_addr;
> > +               if (mem->va)
> > +                       mtk_vcodec_mem_free(ctx, mem);
> > +
> > +               if (lat_buf->private_data)
> > +                       kfree(lat_buf->private_data);
> > +       }
> > +
> > +       msg_queue->init_done = false;
> Have no idea what init_done does in the code.  It is not included in
> any branch condition.
This flag is set to true when vdec_msg_queue_init() is called.
> > +/**
> > + * vdec_msg_queue_init - init lat buffer information.
> > + * @ctx: v4l2 ctx
> > + * @msg_queue: used to store the lat buffer information
> > + * @core_decode: core decode callback for each codec
> > + * @private_size: the private data size used to share with core
> > + */
> > +int vdec_msg_queue_init(
> > +       struct mtk_vcodec_ctx *ctx,
> > +       struct vdec_msg_queue *msg_queue,
> > +       core_decode_cb_t core_decode,
> > +       int private_size);
> Would prefer to have *msg_queue as the first argument (also applies to
> all operators of vdec_msg_queue).
Can fix.
> > +/**
> > + * vdec_msg_queue_get_core_buf - get used core buffer for lat decode.
> > + * @dev: mtk vcodec device
> > + */
> > +struct vdec_lat_buf *vdec_msg_queue_get_core_buf(
> > +       struct mtk_vcodec_dev *dev);
> This is weird: vdec_msg_queue's operator but manipulating mtk_vcodec_dev?
vdec_msg_queue is used to share messages between lat and core. Each
instance has its own lat msg queue list, but all instances share one core
msg queue list. A core buffer has to be fetched from the core queue list,
and it is queued back to the lat queue list when core decode is done.
> > +
> > +/**
> > + * vdec_msg_queue_buf_to_core - queue buf to the core for core decode.
> > + * @dev: mtk vcodec device
> > + * @buf: current lat buffer
> > + */
> > +void vdec_msg_queue_buf_to_core(struct mtk_vcodec_dev *dev,
> > +       struct vdec_lat_buf *buf);
> Also weird.
> 
> > +/**
> > + * vdec_msg_queue_buf_to_lat - queue buf to lat for lat decode.
> > + * @buf: current lat buffer
> > + */
> > +void vdec_msg_queue_buf_to_lat(struct vdec_lat_buf *buf);
> It should at least accept a struct vdec_msg_queue argument (or which
> msg queue should the buf put into?).
All buffers are struct vdec_lat_buf, which is used to share info between the lat and core queue lists.
> > +/**
> > + * vdec_msg_queue_update_ube_rptr - used to updata the ube read point.
> Typo.
> 
> > +/**
> > + * vdec_msg_queue_update_ube_wptr - used to updata the ube write point.
> Typo.
> 
> > +/**
> > + * vdec_msg_queue_deinit - deinit lat buffer information.
> > + * @ctx: v4l2 ctx
> > + * @msg_queue: used to store the lat buffer information
> > + */
> > +void vdec_msg_queue_deinit(
> > +       struct mtk_vcodec_ctx *ctx,
> > +       struct vdec_msg_queue *msg_queue);
> Would prefer to have *msg_queue as the first argument.
Yes, can fix.
> 
> The position of struct vdec_msg_queue is weird.  It looks like the msg
> queue is only for struct vdec_lat_buf.  If so, would vdec_msg_queue be
> better named vdec_lat_queue or something similar?
> 
> It shouldn't touch the core queue in mtk_vcodec_dev anyway.  Is it
> possible to generalize the queue-related code for both lat and core
> queues?
Each instance has its own lat queue list, but there is only one core
queue list.

Yunfei Dong July 12, 2021, 8:07 a.m. UTC | #5

Hi Tzung-Bi,

Thanks for your detailed feedback.
I have added a reply to each of your comments.

On Fri, 2021-07-09 at 15:59 +0800, Tzung-Bi Shih wrote:
> On Wed, Jul 7, 2021 at 2:22 PM Yunfei Dong <yunfei.dong@mediatek.com> wrote:
> > +static int mtk_vcodec_get_hw_count(struct mtk_vcodec_dev *dev)
> > +{
> > +       if (dev->vdec_pdata->hw_arch == MTK_VDEC_PURE_SINGLE_CORE)
> > +               return 1;
> > +       else if (dev->vdec_pdata->hw_arch == MTK_VDEC_LAT_SINGLE_CORE)
> > +               return 2;
> > +       else
> > +               return 0;
> > +}
> Using switch ... case would be easier to read.
Yes
> Would it be better to use some macro or enums for the magic numbers?
Yes, I will add an enum for the magic numbers:
enum mtk_vdec_hw_count {
	MTK_VDEC_NO_HW = 0,
	MTK_VDEC_ONE_CORE,
	MTK_VDEC_ONE_LAT_ONE_CORE,
	MTK_VDEC_MAX_HW_COUNT,
};
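
A minimal sketch of how the switch-based version might look with the enum
above.  The enum mtk_vdec_hw_arch declaration and the simplified function
signature (taking the architecture directly rather than struct
mtk_vcodec_dev) are assumptions made so that the example compiles outside
the driver; only the enumerator names come from this thread.

```c
/* Assumed declaration: the thread only shows the two enumerator names. */
enum mtk_vdec_hw_arch {
	MTK_VDEC_PURE_SINGLE_CORE,
	MTK_VDEC_LAT_SINGLE_CORE,
};

/* As proposed in the reply above. */
enum mtk_vdec_hw_count {
	MTK_VDEC_NO_HW = 0,
	MTK_VDEC_ONE_CORE,
	MTK_VDEC_ONE_LAT_ONE_CORE,
	MTK_VDEC_MAX_HW_COUNT,
};

/*
 * Hypothetical stand-alone variant of mtk_vcodec_get_hw_count(): one
 * case per known architecture, anything else reports no hardware.
 */
static int mtk_vcodec_get_hw_count(enum mtk_vdec_hw_arch hw_arch)
{
	switch (hw_arch) {
	case MTK_VDEC_PURE_SINGLE_CORE:
		return MTK_VDEC_ONE_CORE;
	case MTK_VDEC_LAT_SINGLE_CORE:
		return MTK_VDEC_ONE_LAT_ONE_CORE;
	default:
		return MTK_VDEC_NO_HW;
	}
}
```
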
> > @@ -113,8 +114,7 @@ static int mtk_vdec_comp_init_irq(struct mtk_vdec_comp_dev *dev)
> >         }
> >
> >         ret = devm_request_irq(&pdev->dev, dev->dec_irq,
> > -                               mtk_vdec_comp_irq_handler,
> > -                               0, pdev->name, dev);
> > +                               mtk_vdec_comp_irq_handler, 0, pdev->name, dev);
> The change is irrelevant to this patch.
Will fix.
> > @@ -154,8 +154,10 @@ static int mtk_vdec_comp_probe(struct platform_device *pdev)
> >                 dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(34));
> >
> >         ret = mtk_vdec_comp_init_irq(dev);
> > -       if (ret)
> > +       if (ret) {
> > +               dev_err(&pdev->dev, "Failed to register irq handler.\n");
> >                 goto err;
> > +       }
> The change shouldn't be in this patch.  Instead, it belongs in the patch
> that adds the mtk_vdec_comp_init_irq() invocation.
> 
> > +int mtk_vcodec_wait_for_comp_done_ctx(struct mtk_vcodec_ctx  *ctx,
> Remove the extra space before "*ctx".
Will fix.

Tzung-Bi Shih July 13, 2021, 8:55 a.m. UTC | #6

On Mon, Jul 12, 2021 at 3:28 PM mtk12024 <yunfei.dong@mediatek.com> wrote:
> On Fri, 2021-07-09 at 17:39 +0800, Tzung-Bi Shih wrote:
> > On Wed, Jul 7, 2021 at 2:22 PM Yunfei Dong <yunfei.dong@mediatek.com> wrote:
> > Doesn't it need to call mtk_vcodec_mem_free() and kfree() for any failure paths?
> When memory allocation fails, the deinit function is called automatically and frees all allocated memory.
I guess you mean: if vdec_msg_queue_init() fails,
vdec_msg_queue_deinit() should be called?

If so:
- It is not "auto".  It depends on callers to invoke _deinit() if _init() fails.
- The API usage would be a bit weird: if the object hasn't been
initialized, shall we de-initialize it?
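
One way to avoid relying on the caller is to have the init function unwind
its own partial work before returning.  The following is a user-space sketch
of that pattern, not the driver code: the structures are simplified
stand-ins, NUM_BUFFER_COUNT is given an assumed value, calloc()/free()
replace mtk_vcodec_mem_alloc()/mtk_vcodec_mem_free(), and the fail_at
counter is a hypothetical hook that exists only to exercise the error path.

```c
#include <errno.h>
#include <stdlib.h>

#define NUM_BUFFER_COUNT 3	/* assumed value, for illustration only */

/* Simplified stand-ins for the driver structures. */
struct lat_buf {
	void *wdma_err;
	void *slice_bc;
	void *private_data;
};

struct msg_queue {
	void *wdma;
	struct lat_buf lat_buf[NUM_BUFFER_COUNT];
};

/* Hypothetical test hook: fail the Nth allocation; 0 disables it. */
static int fail_at;
static int alloc_calls;
static int live_allocs;

static void *buf_alloc(size_t size)
{
	void *p;

	if (fail_at && ++alloc_calls == fail_at)
		return NULL;
	p = calloc(1, size);
	if (p)
		live_allocs++;
	return p;
}

static void buf_free(void *p)
{
	if (p) {
		free(p);
		live_allocs--;
	}
}

static void lat_buf_release(struct lat_buf *buf)
{
	buf_free(buf->private_data);
	buf_free(buf->slice_bc);
	buf_free(buf->wdma_err);
	buf->private_data = NULL;
	buf->slice_bc = NULL;
	buf->wdma_err = NULL;
}

static int msg_queue_init(struct msg_queue *q, size_t private_size)
{
	int i;

	q->wdma = buf_alloc(64);
	if (!q->wdma)
		return -ENOMEM;

	for (i = 0; i < NUM_BUFFER_COUNT; i++) {
		struct lat_buf *buf = &q->lat_buf[i];

		/* Assign all three fields so the unwind sees NULL or valid. */
		buf->wdma_err = buf_alloc(64);
		buf->slice_bc = buf_alloc(64);
		buf->private_data = buf_alloc(private_size);
		if (!buf->wdma_err || !buf->slice_bc || !buf->private_data)
			goto err_unwind;
	}
	return 0;

err_unwind:
	/* Release the partially built entry i and every entry before it. */
	for (; i >= 0; i--)
		lat_buf_release(&q->lat_buf[i]);
	buf_free(q->wdma);
	q->wdma = NULL;
	return -ENOMEM;
}
```

With this structure a failed init leaves nothing allocated, so the caller
never has to de-initialize an object that was never fully initialized.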

> > > +struct vdec_lat_buf *vdec_msg_queue_get_core_buf(
> > > +       struct mtk_vcodec_dev *dev)
> > > +{
> > > +       struct vdec_lat_buf *buf;
> > > +       int ret;
> > > +
> > > +       spin_lock(&dev->core_lock);
> > > +       if (list_empty(&dev->core_queue)) {
> > > +               mtk_v4l2_debug(3, "core queue is NULL, num_core = %d", dev->num_core);
> > > +               spin_unlock(&dev->core_lock);
> > > +               ret = wait_event_freezable(dev->core_read,
> > > +                       !list_empty(&dev->core_queue));
> > > +               if (ret)
> > > +                       return NULL;
> > Should be !ret?
> According to the definition, the return value is 0 when the condition is true.
Yeah, you're right.  I was confused a bit with wait_event_timeout().

> > > +bool vdec_msg_queue_wait_lat_buf_full(struct vdec_msg_queue *msg_queue)
> > > +{
> > > +       long timeout_jiff;
> > > +       int ret, i;
> > > +
> > > +       for (i = 0; i < NUM_BUFFER_COUNT + 2; i++) {
> > > +              timeout_jiff = msecs_to_jiffies(1000);
> > > +              ret = wait_event_timeout(msg_queue->lat_read,
> > > +                    msg_queue->num_lat == NUM_BUFFER_COUNT, timeout_jiff);
> > > +              if (ret) {
> > > +                     mtk_v4l2_debug(3, "success to get lat buf: %d",
> > > +                            msg_queue->num_lat);
> > > +                     return true;
> > > +              }
> > > +       }
> > Why does it need the loop?  i is unused.
> The core may hit a decode timeout, so it needs to wait until all core
> buffers have been processed completely.
The point is: the i is unused.  If it needs more time to complete,
could it just wait for (NUM_BUFFER_COUNT + 2) * 1000 msecs?

> > > +       msg_queue->init_done = false;
> > Have no idea what init_done does in the code.  It is not included in
> > any branch condition.
> Calling vdec_msg_queue_init() sets this flag to true.
The point is: if init_done is just a flag that doesn't affect any code
branch, is it really needed?

Example usages where the flag would matter:
- If msg_queue->init_done is already true when vdec_msg_queue_init() is
called, return an error.
- If msg_queue->init_done is already false when vdec_msg_queue_deinit() is
called, return an error.

Even in those cases, I believe it brings very limited benefit (i.e. the
msg_queue is likely to be initialized and de-initialized only once).

> > > +/**
> > > + * vdec_msg_queue_get_core_buf - get used core buffer for lat decode.
> > > + * @dev: mtk vcodec device
> > > + */
> > > +struct vdec_lat_buf *vdec_msg_queue_get_core_buf(
> > > +       struct mtk_vcodec_dev *dev);
> > This is weird: vdec_msg_queue's operator but manipulating mtk_vcodec_dev?
> vdec_msg_queue is used to share messages between lat and core. Each
> instance has its own lat msg queue list, but all instances share one core
> msg queue list. A core buffer has to be taken from the core queue list,
> and it is queued to the lat queue list when core decode is done.
I guess you mean: during runtime, it has n lat queues and 1 core queue.

If so, would it be more intuitive and simpler to use:

msg_queue *core_q;
msg_queue *lat_q[LAT_N];

vdec_msg_queue_dequeue(core_q) if it wants to get from core queue.
vdec_msg_queue_enqueue(lat_q[X], data) if it wants to put data to lat queue X.
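
A rough userspace sketch of that generalized interface, to make the idea
concrete. This is not code from the series: the struct layouts, the
vdec_msg_queue_init_empty() helper and the plain singly linked list (instead
of list_head plus a spinlock and wait queue) are assumptions for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical buffer node; the real vdec_lat_buf carries decode state. */
struct vdec_lat_buf {
	int id;
	struct vdec_lat_buf *next;
};

/* One queue type used for the core queue and for every lat queue. */
struct vdec_msg_queue {
	struct vdec_lat_buf *head;
	struct vdec_lat_buf *tail;
	int count;
};

/* Hypothetical helper: put a queue into the empty state. */
void vdec_msg_queue_init_empty(struct vdec_msg_queue *q)
{
	assert(q);
	q->head = NULL;
	q->tail = NULL;
	q->count = 0;
}

/* Append buf to the tail of the queue the caller names explicitly. */
void vdec_msg_queue_enqueue(struct vdec_msg_queue *q, struct vdec_lat_buf *buf)
{
	assert(q && buf);
	buf->next = NULL;
	if (q->tail)
		q->tail->next = buf;
	else
		q->head = buf;
	q->tail = buf;
	q->count++;
}

/* Remove and return the oldest buffer, or NULL if the queue is empty. */
struct vdec_lat_buf *vdec_msg_queue_dequeue(struct vdec_msg_queue *q)
{
	struct vdec_lat_buf *buf;

	assert(q);
	buf = q->head;
	if (!buf)
		return NULL;
	q->head = buf->next;
	if (!q->head)
		q->tail = NULL;
	buf->next = NULL;
	q->count--;
	return buf;
}
```

The same two operations then serve both directions: dequeue from core_q,
decode, and enqueue the buffer back to the lat_q[X] of the owning instance,
without the queue code touching mtk_vcodec_dev directly.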

> > > +/**
> > > + * vdec_msg_queue_buf_to_lat - queue buf to lat for lat decode.
> > > + * @buf: current lat buffer
> > > + */
> > > +void vdec_msg_queue_buf_to_lat(struct vdec_lat_buf *buf);
> > It should at least accept a struct vdec_msg_queue argument (or which
> > msg queue should the buf be put into?).
> All buffers are struct vdec_lat_buf, which is used to share info between the lat and core queue lists.
The API semantics need to provide a way to specify which msg_queue the
buf would be put into.

> > The position of struct vdec_msg_queue is weird.  It looks like the msg
> > queue is only for struct vdec_lat_buf.  If so, would vdec_msg_queue be
> > better to call vdec_lat_queue or something similar?
> >
> > It shouldn't touch the core queue in mtk_vcodec_dev anyway.  Is it
> > possible to generalize the queue-related code for both lat and core
> > queues?
> Each instance has its own lat queue list, but there is only one core
> queue list.
I suggest generalizing vdec_msg_queue to handle both lat and core queues
(and possibly more).  See the comment above.
