diff mbox series

media: staging: tegra-vde: fix runtime pm imbalance on error

Message ID 20200520095148.10995-1-dinghao.liu@zju.edu.cn
State New
Headers show
Series media: staging: tegra-vde: fix runtime pm imbalance on error | expand

Commit Message

Dinghao Liu May 20, 2020, 9:51 a.m. UTC
pm_runtime_get_sync() increments the runtime PM usage counter even
it returns an error code. Thus a pairing decrement is needed on
the error handling path to keep the counter balanced.

Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
---
 drivers/staging/media/tegra-vde/vde.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Dmitry Osipenko May 20, 2020, 10:15 a.m. UTC | #1
20.05.2020 12:51, Dinghao Liu пишет:
> pm_runtime_get_sync() increments the runtime PM usage counter even
> it returns an error code. Thus a pairing decrement is needed on
> the error handling path to keep the counter balanced.
> 
> Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> ---
>  drivers/staging/media/tegra-vde/vde.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/staging/media/tegra-vde/vde.c b/drivers/staging/media/tegra-vde/vde.c
> index d3e63512a765..dd134a3a15c7 100644
> --- a/drivers/staging/media/tegra-vde/vde.c
> +++ b/drivers/staging/media/tegra-vde/vde.c
> @@ -777,7 +777,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
>  
>  	ret = pm_runtime_get_sync(dev);
>  	if (ret < 0)
> -		goto unlock;
> +		goto put_runtime_pm;
>  
>  	/*
>  	 * We rely on the VDE registers reset value, otherwise VDE
> 

Hello Dinghao,

Thank you for the patch. I sent out a similar patch a week ago [1].

[1]
https://patchwork.ozlabs.org/project/linux-tegra/patch/20200514210847.9269-2-digetx@gmail.com/

The pm_runtime_put_noidle() should have the same effect as yours
variant, although my variant won't change the last_busy RPM time, which
I think is a bit more appropriate behavior.
Dan Carpenter May 20, 2020, 3:02 p.m. UTC | #2
On Wed, May 20, 2020 at 01:15:44PM +0300, Dmitry Osipenko wrote:
> 20.05.2020 12:51, Dinghao Liu пишет:
> > pm_runtime_get_sync() increments the runtime PM usage counter even
> > it returns an error code. Thus a pairing decrement is needed on
> > the error handling path to keep the counter balanced.
> > 
> > Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> > ---
> >  drivers/staging/media/tegra-vde/vde.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/staging/media/tegra-vde/vde.c b/drivers/staging/media/tegra-vde/vde.c
> > index d3e63512a765..dd134a3a15c7 100644
> > --- a/drivers/staging/media/tegra-vde/vde.c
> > +++ b/drivers/staging/media/tegra-vde/vde.c
> > @@ -777,7 +777,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
> >  
> >  	ret = pm_runtime_get_sync(dev);
> >  	if (ret < 0)
> > -		goto unlock;
> > +		goto put_runtime_pm;
> >  
> >  	/*
> >  	 * We rely on the VDE registers reset value, otherwise VDE
> > 
> 
> Hello Dinghao,
> 
> Thank you for the patch. I sent out a similar patch a week ago [1].
> 
> [1]
> https://patchwork.ozlabs.org/project/linux-tegra/patch/20200514210847.9269-2-digetx@gmail.com/
> 
> The pm_runtime_put_noidle() should have the same effect as yours
> variant, although my variant won't change the last_busy RPM time, which
> I think is a bit more appropriate behavior.

I don't think either patch is correct.  The right thing to do is to fix
__pm_runtime_resume() so it doesn't leak a reference count on error.

The problem is that a lot of functions don't check the return so
possibly we are relying on that behavior.  We may need to introduce a
new function which cleans up properly instead of leaking reference
counts?

Also it's not documented that pm_runtime_get_sync() returns 1 sometimes
on success so it leads to a few bugs.

drivers/gpu/drm/stm/ltdc.c:             ret = pm_runtime_get_sync(ddev->dev);
drivers/gpu/drm/stm/ltdc.c-             if (ret) {
--
drivers/gpu/drm/stm/ltdc.c:             ret = pm_runtime_get_sync(ddev->dev);
drivers/gpu/drm/stm/ltdc.c-             if (ret) {

drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c:  ret = pm_runtime_get_sync(pm->dev);
drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c-  if (ret)

drivers/media/platform/ti-vpe/cal.c:    ret = pm_runtime_get_sync(&pdev->dev);
drivers/media/platform/ti-vpe/cal.c-    if (ret)

drivers/mfd/arizona-core.c:                     ret = pm_runtime_get_sync(arizona->dev);
drivers/mfd/arizona-core.c-                     if (ret != 0)

drivers/remoteproc/qcom_q6v5_adsp.c:    ret = pm_runtime_get_sync(adsp->dev);
drivers/remoteproc/qcom_q6v5_adsp.c-    if (ret)

drivers/spi/spi-img-spfi.c:     ret = pm_runtime_get_sync(dev);
drivers/spi/spi-img-spfi.c-     if (ret)

drivers/usb/dwc3/dwc3-pci.c:    ret = pm_runtime_get_sync(&dwc3->dev);
drivers/usb/dwc3/dwc3-pci.c-    if (ret)

drivers/watchdog/rti_wdt.c:     ret = pm_runtime_get_sync(dev);
drivers/watchdog/rti_wdt.c-     if (ret) {

regards,
dan carpenter

diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 99c7da112c95..e280991a977d 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1082,6 +1082,9 @@ int __pm_runtime_resume(struct device *dev, int rpmflags)
 	retval = rpm_resume(dev, rpmflags);
 	spin_unlock_irqrestore(&dev->power.lock, flags);
 
+	if (retval < 0 && rpmflags & RPM_GET_PUT)
+		atomic_dec(&dev->power.usage_count);
+
 	return retval;
 }
 EXPORT_SYMBOL_GPL(__pm_runtime_resume);
kernel test robot May 20, 2020, 8:15 p.m. UTC | #3
Hi Dinghao,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on staging/staging-testing]
[also build test WARNING on linuxtv-media/master v5.7-rc6 next-20200519]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Dinghao-Liu/media-staging-tegra-vde-fix-runtime-pm-imbalance-on-error/20200521-010315
base:   https://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git 8da047603bbcfddda6eb6ebbee0dc52c34badc6e
config: x86_64-allyesconfig (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project e6658079aca6d971b4e9d7137a3a2ecbc9c34aec)
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install x86_64 cross compiling tool for clang build
        # apt-get install binutils-x86-64-linux-gnu
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=x86_64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> drivers/staging/media/tegra-vde/vde.c:847:1: warning: unused label 'unlock' [-Wunused-label]
unlock:
^~~~~~~
1 warning generated.

vim +/unlock +847 drivers/staging/media/tegra-vde/vde.c

cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  691  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  692  static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  693  				       unsigned long vaddr)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  694  {
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  695  	struct device *dev = vde->miscdev.parent;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  696  	struct tegra_vde_h264_decoder_ctx ctx;
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  697  	struct tegra_vde_h264_frame *frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  698  	struct tegra_vde_h264_frame __user *frames_user;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  699  	struct video_frame *dpb_frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  700  	struct dma_buf_attachment *bitstream_data_dmabuf_attachment;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  701  	enum dma_data_direction dma_dir;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  702  	dma_addr_t bitstream_data_addr;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  703  	dma_addr_t bsev_ptr;
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  704  	size_t lsize, csize;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  705  	size_t bitstream_data_size;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  706  	unsigned int macroblocks_nb;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  707  	unsigned int read_bytes;
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  708  	unsigned int cstride;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  709  	unsigned int i;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  710  	long timeout;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  711  	int ret, err;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  712  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  713  	if (copy_from_user(&ctx, (void __user *)vaddr, sizeof(ctx)))
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  714  		return -EFAULT;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  715  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  716  	ret = tegra_vde_validate_h264_ctx(dev, &ctx);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  717  	if (ret)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  718  		return ret;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  719  
b301f8de192504 drivers/staging/media/tegra-vde/vde.c       Dmitry Osipenko 2019-06-23  720  	ret = tegra_vde_attach_dmabuf(vde, ctx.bitstream_data_fd,
17aed9041d5ba3 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  721  				      ctx.bitstream_data_offset,
17aed9041d5ba3 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  722  				      SZ_16K, SZ_16K,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  723  				      &bitstream_data_dmabuf_attachment,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  724  				      &bitstream_data_addr,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  725  				      &bitstream_data_size,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  726  				      DMA_TO_DEVICE);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  727  	if (ret)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  728  		return ret;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  729  
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  730  	frames = kmalloc_array(ctx.dpb_frames_nb, sizeof(*frames), GFP_KERNEL);
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  731  	if (!frames) {
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  732  		ret = -ENOMEM;
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  733  		goto release_bitstream_dmabuf;
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  734  	}
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  735  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  736  	dpb_frames = kcalloc(ctx.dpb_frames_nb, sizeof(*dpb_frames),
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  737  			     GFP_KERNEL);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  738  	if (!dpb_frames) {
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  739  		ret = -ENOMEM;
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  740  		goto free_frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  741  	}
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  742  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  743  	macroblocks_nb = ctx.pic_width_in_mbs * ctx.pic_height_in_mbs;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  744  	frames_user = u64_to_user_ptr(ctx.dpb_frames_ptr);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  745  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  746  	if (copy_from_user(frames, frames_user,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  747  			   ctx.dpb_frames_nb * sizeof(*frames))) {
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  748  		ret = -EFAULT;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  749  		goto free_dpb_frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  750  	}
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  751  
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  752  	cstride = ALIGN(ctx.pic_width_in_mbs * 8, 16);
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  753  	csize = cstride * ctx.pic_height_in_mbs * 8;
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  754  	lsize = macroblocks_nb * 256;
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  755  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  756  	for (i = 0; i < ctx.dpb_frames_nb; i++) {
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  757  		ret = tegra_vde_validate_frame(dev, &frames[i]);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  758  		if (ret)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  759  			goto release_dpb_frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  760  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  761  		dpb_frames[i].flags = frames[i].flags;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  762  		dpb_frames[i].frame_num = frames[i].frame_num;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  763  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  764  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  765  
b301f8de192504 drivers/staging/media/tegra-vde/vde.c       Dmitry Osipenko 2019-06-23  766  		ret = tegra_vde_attach_dmabufs_to_frame(vde, &dpb_frames[i],
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  767  							&frames[i], dma_dir,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  768  							ctx.baseline_profile,
3830e4f2a907ca drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  769  							lsize, csize);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  770  		if (ret)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  771  			goto release_dpb_frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  772  	}
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  773  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  774  	ret = mutex_lock_interruptible(&vde->lock);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  775  	if (ret)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  776  		goto release_dpb_frames;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  777  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  778  	ret = pm_runtime_get_sync(dev);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  779  	if (ret < 0)
bd3ff732806023 drivers/staging/media/tegra-vde/vde.c       Dinghao Liu     2020-05-20  780  		goto put_runtime_pm;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  781  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  782  	/*
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  783  	 * We rely on the VDE registers reset value, otherwise VDE
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  784  	 * causes bus lockup.
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  785  	 */
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  786  	ret = reset_control_assert(vde->rst_mc);
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  787  	if (ret) {
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  788  		dev_err(dev, "DEC start: Failed to assert MC reset: %d\n",
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  789  			ret);
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  790  		goto put_runtime_pm;
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  791  	}
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  792  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  793  	ret = reset_control_reset(vde->rst);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  794  	if (ret) {
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  795  		dev_err(dev, "DEC start: Failed to reset HW: %d\n", ret);
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  796  		goto put_runtime_pm;
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  797  	}
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  798  
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  799  	ret = reset_control_deassert(vde->rst_mc);
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  800  	if (ret) {
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  801  		dev_err(dev, "DEC start: Failed to deassert MC reset: %d\n",
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  802  			ret);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  803  		goto put_runtime_pm;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  804  	}
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  805  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  806  	ret = tegra_vde_setup_hw_context(vde, &ctx, dpb_frames,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  807  					 bitstream_data_addr,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  808  					 bitstream_data_size,
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  809  					 macroblocks_nb);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  810  	if (ret)
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  811  		goto put_runtime_pm;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  812  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  813  	tegra_vde_decode_frame(vde, macroblocks_nb);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  814  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  815  	timeout = wait_for_completion_interruptible_timeout(
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  816  			&vde->decode_completion, msecs_to_jiffies(1000));
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  817  	if (timeout == 0) {
91dc5e91edf767 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-11-19  818  		bsev_ptr = tegra_vde_readl(vde, vde->bsev, 0x10);
91dc5e91edf767 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-11-19  819  		macroblocks_nb = tegra_vde_readl(vde, vde->sxe, 0xC8) & 0x1FFF;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  820  		read_bytes = bsev_ptr ? bsev_ptr - bitstream_data_addr : 0;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  821  
f072c44f36590b drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-03-17  822  		dev_err(dev, "Decoding failed: read 0x%X bytes, %u macroblocks parsed\n",
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  823  			read_bytes, macroblocks_nb);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  824  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  825  		ret = -EIO;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  826  	} else if (timeout < 0) {
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  827  		ret = timeout;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  828  	}
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  829  
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  830  	/*
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  831  	 * At first reset memory client to avoid resetting VDE HW in the
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  832  	 * middle of DMA which could result into memory corruption or hang
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  833  	 * the whole system.
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  834  	 */
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  835  	err = reset_control_assert(vde->rst_mc);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  836  	if (err)
f68bbb23259e59 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-20  837  		dev_err(dev, "DEC end: Failed to assert MC reset: %d\n", err);
f956aec08d2b98 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-29  838  
f956aec08d2b98 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-29  839  	err = reset_control_assert(vde->rst);
f956aec08d2b98 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-29  840  	if (err)
f956aec08d2b98 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2018-05-29  841  		dev_err(dev, "DEC end: Failed to assert HW reset: %d\n", err);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  842  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  843  put_runtime_pm:
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  844  	pm_runtime_mark_last_busy(dev);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  845  	pm_runtime_put_autosuspend(dev);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  846  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11 @847  unlock:
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  848  	mutex_unlock(&vde->lock);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  849  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  850  release_dpb_frames:
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  851  	while (i--) {
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  852  		dma_dir = (i == 0) ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  853  
b301f8de192504 drivers/staging/media/tegra-vde/vde.c       Dmitry Osipenko 2019-06-23  854  		tegra_vde_release_frame_dmabufs(vde, &dpb_frames[i], dma_dir,
92cd14408be363 drivers/staging/media/tegra-vde/vde.c       Dmitry Osipenko 2019-06-23  855  						ctx.baseline_profile, ret != 0);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  856  	}
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  857  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  858  free_dpb_frames:
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  859  	kfree(dpb_frames);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  860  
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  861  free_frames:
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  862  	kfree(frames);
b1b9b7bee370b5 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2019-06-02  863  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  864  release_bitstream_dmabuf:
92cd14408be363 drivers/staging/media/tegra-vde/vde.c       Dmitry Osipenko 2019-06-23  865  	tegra_vde_dmabuf_cache_unmap(vde, bitstream_data_dmabuf_attachment,
92cd14408be363 drivers/staging/media/tegra-vde/vde.c       Dmitry Osipenko 2019-06-23  866  				     ret != 0);
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  867  
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  868  	return ret;
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  869  }
cd6c56feb591f6 drivers/staging/media/tegra-vde/tegra-vde.c Dmitry Osipenko 2017-12-11  870  

:::::: The code at line 847 was first introduced by commit
:::::: cd6c56feb591f6fe66bebcbeb43ecc0e2acdcffa media: staging: media: Introduce NVIDIA Tegra video decoder driver

:::::: TO: Dmitry Osipenko <digetx@gmail.com>
:::::: CC: Mauro Carvalho Chehab <mchehab@s-opensource.com>

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
Dinghao Liu May 21, 2020, 3:42 a.m. UTC | #4
Hi, Dan,

I agree the best solution is to fix __pm_runtime_resume(). But there are also 
many cases that assume pm_runtime_get_sync() will change PM usage 
counter on error. According to my static analysis results, the number of these 
"right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce 
more new bugs. Therefore I think we should resolve the "bug" cases individually.

I think that Dmitry's patch is more reasonable than mine. 

Dinghao

&quot;Dan Carpenter&quot; &lt;dan.carpenter@oracle.com&gt;写道:
> On Wed, May 20, 2020 at 01:15:44PM +0300, Dmitry Osipenko wrote:
> > 20.05.2020 12:51, Dinghao Liu пишет:
> > > pm_runtime_get_sync() increments the runtime PM usage counter even
> > > it returns an error code. Thus a pairing decrement is needed on
> > > the error handling path to keep the counter balanced.
> > > 
> > > Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> > > ---
> > >  drivers/staging/media/tegra-vde/vde.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/staging/media/tegra-vde/vde.c b/drivers/staging/media/tegra-vde/vde.c
> > > index d3e63512a765..dd134a3a15c7 100644
> > > --- a/drivers/staging/media/tegra-vde/vde.c
> > > +++ b/drivers/staging/media/tegra-vde/vde.c
> > > @@ -777,7 +777,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
> > >  
> > >  	ret = pm_runtime_get_sync(dev);
> > >  	if (ret < 0)
> > > -		goto unlock;
> > > +		goto put_runtime_pm;
> > >  
> > >  	/*
> > >  	 * We rely on the VDE registers reset value, otherwise VDE
> > > 
> > 
> > Hello Dinghao,
> > 
> > Thank you for the patch. I sent out a similar patch a week ago [1].
> > 
> > [1]
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/20200514210847.9269-2-digetx@gmail.com/
> > 
> > The pm_runtime_put_noidle() should have the same effect as yours
> > variant, although my variant won't change the last_busy RPM time, which
> > I think is a bit more appropriate behavior.
> 
> I don't think either patch is correct.  The right thing to do is to fix
> __pm_runtime_resume() so it doesn't leak a reference count on error.
> 
> The problem is that a lot of functions don't check the return so
> possibly we are relying on that behavior.  We may need to introduce a
> new function which cleans up properly instead of leaking reference
> counts?
> 
> Also it's not documented that pm_runtime_get_sync() returns 1 sometimes
> on success so it leads to a few bugs.
> 
> drivers/gpu/drm/stm/ltdc.c:             ret = pm_runtime_get_sync(ddev->dev);
> drivers/gpu/drm/stm/ltdc.c-             if (ret) {
> --
> drivers/gpu/drm/stm/ltdc.c:             ret = pm_runtime_get_sync(ddev->dev);
> drivers/gpu/drm/stm/ltdc.c-             if (ret) {
> 
> drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c:  ret = pm_runtime_get_sync(pm->dev);
> drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c-  if (ret)
> 
> drivers/media/platform/ti-vpe/cal.c:    ret = pm_runtime_get_sync(&pdev->dev);
> drivers/media/platform/ti-vpe/cal.c-    if (ret)
> 
> drivers/mfd/arizona-core.c:                     ret = pm_runtime_get_sync(arizona->dev);
> drivers/mfd/arizona-core.c-                     if (ret != 0)
> 
> drivers/remoteproc/qcom_q6v5_adsp.c:    ret = pm_runtime_get_sync(adsp->dev);
> drivers/remoteproc/qcom_q6v5_adsp.c-    if (ret)
> 
> drivers/spi/spi-img-spfi.c:     ret = pm_runtime_get_sync(dev);
> drivers/spi/spi-img-spfi.c-     if (ret)
> 
> drivers/usb/dwc3/dwc3-pci.c:    ret = pm_runtime_get_sync(&dwc3->dev);
> drivers/usb/dwc3/dwc3-pci.c-    if (ret)
> 
> drivers/watchdog/rti_wdt.c:     ret = pm_runtime_get_sync(dev);
> drivers/watchdog/rti_wdt.c-     if (ret) {
> 
> regards,
> dan carpenter
> 
> diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
> index 99c7da112c95..e280991a977d 100644
> --- a/drivers/base/power/runtime.c
> +++ b/drivers/base/power/runtime.c
> @@ -1082,6 +1082,9 @@ int __pm_runtime_resume(struct device *dev, int rpmflags)
>  	retval = rpm_resume(dev, rpmflags);
>  	spin_unlock_irqrestore(&dev->power.lock, flags);
>  
> +	if (retval < 0 && rpmflags & RPM_GET_PUT)
> +		atomic_dec(&dev->power.usage_count);
> +
>  	return retval;
>  }
>  EXPORT_SYMBOL_GPL(__pm_runtime_resume);
Dan Carpenter May 21, 2020, 9:15 a.m. UTC | #5
On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> Hi, Dan,
> 
> I agree the best solution is to fix __pm_runtime_resume(). But there are also 
> many cases that assume pm_runtime_get_sync() will change PM usage 
> counter on error. According to my static analysis results, the number of these 
> "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce 
> more new bugs. Therefore I think we should resolve the "bug" cases individually.
> 

That's why I was saying that we may need to introduce a new replacement
function for pm_runtime_get_sync() that works as expected.

There is no reason why we have to live with the old behavior.

regards,
dan carpenter
Rafael J. Wysocki May 21, 2020, 3:22 p.m. UTC | #6
On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > Hi, Dan,
> >
> > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > many cases that assume pm_runtime_get_sync() will change PM usage
> > counter on error. According to my static analysis results, the number of these
> > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> >
>
> That's why I was saying that we may need to introduce a new replacement
> function for pm_runtime_get_sync() that works as expected.
>
> There is no reason why we have to live with the old behavior.

What exactly do you mean by "the old behavior"?
Rafael J. Wysocki May 21, 2020, 5:02 p.m. UTC | #7
On Wednesday, May 20, 2020 5:02:30 PM CEST Dan Carpenter wrote:
> On Wed, May 20, 2020 at 01:15:44PM +0300, Dmitry Osipenko wrote:
> > 20.05.2020 12:51, Dinghao Liu пишет:
> > > pm_runtime_get_sync() increments the runtime PM usage counter even
> > > it returns an error code. Thus a pairing decrement is needed on
> > > the error handling path to keep the counter balanced.
> > > 
> > > Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
> > > ---
> > >  drivers/staging/media/tegra-vde/vde.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/staging/media/tegra-vde/vde.c b/drivers/staging/media/tegra-vde/vde.c
> > > index d3e63512a765..dd134a3a15c7 100644
> > > --- a/drivers/staging/media/tegra-vde/vde.c
> > > +++ b/drivers/staging/media/tegra-vde/vde.c
> > > @@ -777,7 +777,7 @@ static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
> > >  
> > >  	ret = pm_runtime_get_sync(dev);
> > >  	if (ret < 0)
> > > -		goto unlock;
> > > +		goto put_runtime_pm;
> > >  
> > >  	/*
> > >  	 * We rely on the VDE registers reset value, otherwise VDE
> > > 
> > 
> > Hello Dinghao,
> > 
> > Thank you for the patch. I sent out a similar patch a week ago [1].
> > 
> > [1]
> > https://patchwork.ozlabs.org/project/linux-tegra/patch/20200514210847.9269-2-digetx@gmail.com/
> > 
> > The pm_runtime_put_noidle() should have the same effect as yours
> > variant, although my variant won't change the last_busy RPM time, which
> > I think is a bit more appropriate behavior.
> 
> I don't think either patch is correct.  The right thing to do is to fix
> __pm_runtime_resume() so it doesn't leak a reference count on error.
> 
> The problem is that a lot of functions don't check the return so
> possibly we are relying on that behavior.

Actually, the function was written with this case in mind.

In retrospect, that has been a mistake and there should be a void variant
to cover this case, but it's been like that for several years and the
documentation doesn't really say that the reference counter will be
decremented on errors.

> We may need to introduce a
> new function which cleans up properly instead of leaking reference
> counts?

Well, even with that, all of the broken callers of pm_runtime_get_sync()
would need to be changed to use the new function instead?

Is that what you mean?
Dan Carpenter May 21, 2020, 5:39 p.m. UTC | #8
On Thu, May 21, 2020 at 05:22:05PM +0200, Rafael J. Wysocki wrote:
> On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> >
> > On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > > Hi, Dan,
> > >
> > > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > > many cases that assume pm_runtime_get_sync() will change PM usage
> > > counter on error. According to my static analysis results, the number of these
> > > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> > >
> >
> > That's why I was saying that we may need to introduce a new replacement
> > function for pm_runtime_get_sync() that works as expected.
> >
> > There is no reason why we have to live with the old behavior.
> 
> What exactly do you mean by "the old behavior"?

I'm suggesting we leave pm_runtime_get_sync() alone but we add a new
function which called pm_runtime_get_sync_resume() which does something
like this:

static inline int pm_runtime_get_sync_resume(struct device *dev)
{
	int ret;

	ret = __pm_runtime_resume(dev, RPM_GET_PUT);
	if (ret < 0) {
		pm_runtime_put(dev);
		return ret;
	}
	return 0;
}

I'm not sure if pm_runtime_put() is the correct thing to do?  The other
thing is that this always returns zero on success.  I don't know that
drivers ever care to differentiate between one and zero returns.

Then if any of the caller expect that behavior we update them to use the
new function.

regards,
dan carpenter
Thierry Reding May 22, 2020, 1:10 p.m. UTC | #9
On Thu, May 21, 2020 at 08:39:02PM +0300, Dan Carpenter wrote:
> On Thu, May 21, 2020 at 05:22:05PM +0200, Rafael J. Wysocki wrote:
> > On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > >
> > > On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > > > Hi, Dan,
> > > >
> > > > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > > > many cases that assume pm_runtime_get_sync() will change PM usage
> > > > counter on error. According to my static analysis results, the number of these
> > > > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > > > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> > > >
> > >
> > > That's why I was saying that we may need to introduce a new replacement
> > > function for pm_runtime_get_sync() that works as expected.
> > >
> > > There is no reason why we have to live with the old behavior.
> > 
> > What exactly do you mean by "the old behavior"?
> 
> I'm suggesting we leave pm_runtime_get_sync() alone but we add a new
> function which called pm_runtime_get_sync_resume() which does something
> like this:
> 
> static inline int pm_runtime_get_sync_resume(struct device *dev)
> {
> 	int ret;
> 
> 	ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> 	if (ret < 0) {
> 		pm_runtime_put(dev);
> 		return ret;
> 	}
> 	return 0;
> }
> 
> I'm not sure if pm_runtime_put() is the correct thing to do?  The other
> thing is that this always returns zero on success.  I don't know that
> drivers ever care to differentiate between one and zero returns.
> 
> Then if any of the caller expect that behavior we update them to use the
> new function.

Does that really have many benefits, though? I understand that this
would perhaps be easier to use because it is more in line with how other
functions operate. On the other hand, in some cases you may want to call
a different version of pm_runtime_put() on failure, as discussed in
other threads.

Even ignoring that issue, any existing callsites that are leaking the
reference would have to be updated to call the new function, which would
be pretty much the same amount of work as updating the callsites to fix
the leak, right?

So if instead we just fix up the leaks, we might have a case of an API
that doesn't work as some of us (myself included) expected it, but at
least it would be consistent. If we add another variant things become
fragmented and therefore even more complicated to use and review.

Thierry
Dan Carpenter May 22, 2020, 1:23 p.m. UTC | #10
On Fri, May 22, 2020 at 03:10:31PM +0200, Thierry Reding wrote:
> On Thu, May 21, 2020 at 08:39:02PM +0300, Dan Carpenter wrote:
> > On Thu, May 21, 2020 at 05:22:05PM +0200, Rafael J. Wysocki wrote:
> > > On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > > >
> > > > On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > > > > Hi, Dan,
> > > > >
> > > > > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > > > > many cases that assume pm_runtime_get_sync() will change PM usage
> > > > > counter on error. According to my static analysis results, the number of these
> > > > > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > > > > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> > > > >
> > > >
> > > > That's why I was saying that we may need to introduce a new replacement
> > > > function for pm_runtime_get_sync() that works as expected.
> > > >
> > > > There is no reason why we have to live with the old behavior.
> > > 
> > > What exactly do you mean by "the old behavior"?
> > 
> > I'm suggesting we leave pm_runtime_get_sync() alone but we add a new
> > function which called pm_runtime_get_sync_resume() which does something
> > like this:
> > 
> > static inline int pm_runtime_get_sync_resume(struct device *dev)
> > {
> > 	int ret;
> > 
> > 	ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> > 	if (ret < 0) {
> > 		pm_runtime_put(dev);
> > 		return ret;
> > 	}
> > 	return 0;
> > }
> > 
> > I'm not sure if pm_runtime_put() is the correct thing to do?  The other
> > thing is that this always returns zero on success.  I don't know that
> > drivers ever care to differentiate between one and zero returns.
> > 
> > Then if any of the caller expect that behavior we update them to use the
> > new function.
> 
> Does that really have many benefits, though? I understand that this
> would perhaps be easier to use because it is more in line with how other
> functions operate. On the other hand, in some cases you may want to call
> a different version of pm_runtime_put() on failure, as discussed in
> other threads.

I wasn't CC'd on the other threads so I don't know.  :/  I have always
assumed it was something like this but I don't know the details and
there is no documentation.

http://sweng.the-davies.net/Home/rustys-api-design-manifesto
You're essentially arguing that it's a #1 on Rusty's scale but ideally
we would want to be at #7.

> 
> Even ignoring that issue, any existing callsites that are leaking the
> reference would have to be updated to call the new function, which would
> be pretty much the same amount of work as updating the callsites to fix
> the leak, right?

With the current API we're constantly adding bugs.  I imagine that once
we add a straight forward default and some documentation then we will
solve this.

> 
> So if instead we just fix up the leaks, we might have a case of an API
> that doesn't work as some of us (myself included) expected it, but at
> least it would be consistent. If we add another variant things become
> fragmented and therefore even more complicated to use and review.

That's the approach that we've been trying and it's clearly not working.

regards,
dan carpenter
Thierry Reding May 22, 2020, 2:43 p.m. UTC | #11
On Fri, May 22, 2020 at 04:23:18PM +0300, Dan Carpenter wrote:
> On Fri, May 22, 2020 at 03:10:31PM +0200, Thierry Reding wrote:
> > On Thu, May 21, 2020 at 08:39:02PM +0300, Dan Carpenter wrote:
> > > On Thu, May 21, 2020 at 05:22:05PM +0200, Rafael J. Wysocki wrote:
> > > > On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > > > >
> > > > > On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > > > > > Hi, Dan,
> > > > > >
> > > > > > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > > > > > many cases that assume pm_runtime_get_sync() will change PM usage
> > > > > > counter on error. According to my static analysis results, the number of these
> > > > > > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > > > > > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> > > > > >
> > > > >
> > > > > That's why I was saying that we may need to introduce a new replacement
> > > > > function for pm_runtime_get_sync() that works as expected.
> > > > >
> > > > > There is no reason why we have to live with the old behavior.
> > > > 
> > > > What exactly do you mean by "the old behavior"?
> > > 
> > > I'm suggesting we leave pm_runtime_get_sync() alone but we add a new
> > > function which called pm_runtime_get_sync_resume() which does something
> > > like this:
> > > 
> > > static inline int pm_runtime_get_sync_resume(struct device *dev)
> > > {
> > > 	int ret;
> > > 
> > > 	ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> > > 	if (ret < 0) {
> > > 		pm_runtime_put(dev);
> > > 		return ret;
> > > 	}
> > > 	return 0;
> > > }
> > > 
> > > I'm not sure if pm_runtime_put() is the correct thing to do?  The other
> > > thing is that this always returns zero on success.  I don't know that
> > > drivers ever care to differentiate between one and zero returns.
> > > 
> > > Then if any of the caller expect that behavior we update them to use the
> > > new function.
> > 
> > Does that really have many benefits, though? I understand that this
> > would perhaps be easier to use because it is more in line with how other
> > functions operate. On the other hand, in some cases you may want to call
> > a different version of pm_runtime_put() on failure, as discussed in
> > other threads.
> 
> I wasn't CC'd on the other threads so I don't know.  :/

It was actually earlier in this thread, see here for example:

	http://patchwork.ozlabs.org/project/linux-tegra/patch/20200520095148.10995-1-dinghao.liu@zju.edu.cn/#2438776

> I have always assumed it was something like this but I don't know the
> details and there is no documentation.

Now, I don't know more than you do, but it sounds to me like there are
multiple valid ways that we can use to drop the runtime PM reference and
whatever we choose to do in this new function may not always be the
right thing.

> http://sweng.the-davies.net/Home/rustys-api-design-manifesto
> You're essentially arguing that it's a #1 on Rusty's scale but ideally
> we would want to be at #7.

I think we could probably get it to at least a 3 or a 4 on that list if
we add a bit of documentation and fix all existing users.

Yes, 7 would be better than that, but I think we have to weigh the cost
of the added fragmentation versus the benefits that it gives us.

> > Even ignoring that issue, any existing callsites that are leaking the
> > reference would have to be updated to call the new function, which would
> > be pretty much the same amount of work as updating the callsites to fix
> > the leak, right?
> 
> With the current API we're constantly adding bugs.  I imagine that once
> we add a straight forward default and some documentation then we will
> solve this.

In my experience this stuff is often copy/pasted, so once we fix up all
of the bugs (and perhaps even add a coccinelle script) we shoudl be
seeing less bugs added all the time.

That said, I'm not opposed to adding a new function if we can make it
actually result in an overall improvement. What I'd hate to do is add a
new API that we all think is superior but then ends up not being usable
in half of the cases.

> > So if instead we just fix up the leaks, we might have a case of an API
> > that doesn't work as some of us (myself included) expected it, but at
> > least it would be consistent. If we add another variant things become
> > fragmented and therefore even more complicated to use and review.
> 
> That's the approach that we've been trying and it's clearly not working.

I think this is something we can likely solve through education and
documentation. Runtime PM is still a fairly new topic that not a lot of
people have experience with (at least if I extrapolate from the many
issues I've run into lately related to runtime PM), so I think it just
takes time for everyone to catch up. This looks similar to me to how we
used to have every allocation failure print out an error, even though
the allocator already complains pretty loudly when things go wrong. Now
we've removed most (if not all) of the redundant error messages and it's
become common knowledge among most maintainers, so new instances
typically get caught during review.

But again, if you can come up with a good alternative that works for the
majority of cases I think that would also be fine. Getting things right
without actually knowing any of the background is obviously better than
having to actually educate people. =)

Thierry
Dan Carpenter May 28, 2020, 12:08 p.m. UTC | #12
On Fri, May 22, 2020 at 04:43:12PM +0200, Thierry Reding wrote:
> On Fri, May 22, 2020 at 04:23:18PM +0300, Dan Carpenter wrote:
> > On Fri, May 22, 2020 at 03:10:31PM +0200, Thierry Reding wrote:
> > > On Thu, May 21, 2020 at 08:39:02PM +0300, Dan Carpenter wrote:
> > > > On Thu, May 21, 2020 at 05:22:05PM +0200, Rafael J. Wysocki wrote:
> > > > > On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > > > > >
> > > > > > On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > > > > > > Hi, Dan,
> > > > > > >
> > > > > > > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > > > > > > many cases that assume pm_runtime_get_sync() will change PM usage
> > > > > > > counter on error. According to my static analysis results, the number of these
> > > > > > > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > > > > > > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> > > > > > >
> > > > > >
> > > > > > That's why I was saying that we may need to introduce a new replacement
> > > > > > function for pm_runtime_get_sync() that works as expected.
> > > > > >
> > > > > > There is no reason why we have to live with the old behavior.
> > > > > 
> > > > > What exactly do you mean by "the old behavior"?
> > > > 
> > > > I'm suggesting we leave pm_runtime_get_sync() alone but we add a new
> > > > function which called pm_runtime_get_sync_resume() which does something
> > > > like this:
> > > > 
> > > > static inline int pm_runtime_get_sync_resume(struct device *dev)
> > > > {
> > > > 	int ret;
> > > > 
> > > > 	ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> > > > 	if (ret < 0) {
> > > > 		pm_runtime_put(dev);
> > > > 		return ret;
> > > > 	}
> > > > 	return 0;
> > > > }
> > > > 
> > > > I'm not sure if pm_runtime_put() is the correct thing to do?  The other
> > > > thing is that this always returns zero on success.  I don't know that
> > > > drivers ever care to differentiate between one and zero returns.
> > > > 
> > > > Then if any of the caller expect that behavior we update them to use the
> > > > new function.
> > > 
> > > Does that really have many benefits, though? I understand that this
> > > would perhaps be easier to use because it is more in line with how other
> > > functions operate. On the other hand, in some cases you may want to call
> > > a different version of pm_runtime_put() on failure, as discussed in
> > > other threads.
> > 
> > I wasn't CC'd on the other threads so I don't know.  :/
> 
> It was actually earlier in this thread, see here for example:
> 
> 	http://patchwork.ozlabs.org/project/linux-tegra/patch/20200520095148.10995-1-dinghao.liu@zju.edu.cn/#2438776

I'm not seeing what you're talking about.

The only thing I see in this thread is that we don't want to call
pm_runtime_mark_last_busy(dev) which updates the last_busy time that is
used for autosuspend.

The other thing that was discussed was pm_runtime_put_noidle() vs
pm_runtime_put_autosuspend().  "The pm_runtime_put_noidle() should have
the same effect as yours variant".  So apparently they are equivalent
in this situation.  How should we choose one vs the other?

I'm not trying to be obtuse.  I understand that probably if I worked in
PM then I wouldn't need documentation...  :/

regards,
dan carpenter
Rafael J. Wysocki May 28, 2020, 12:31 p.m. UTC | #13
On Thu, May 28, 2020 at 2:08 PM Dan Carpenter <dan.carpenter@oracle.com> wrote:
>
> On Fri, May 22, 2020 at 04:43:12PM +0200, Thierry Reding wrote:
> > On Fri, May 22, 2020 at 04:23:18PM +0300, Dan Carpenter wrote:
> > > On Fri, May 22, 2020 at 03:10:31PM +0200, Thierry Reding wrote:
> > > > On Thu, May 21, 2020 at 08:39:02PM +0300, Dan Carpenter wrote:
> > > > > On Thu, May 21, 2020 at 05:22:05PM +0200, Rafael J. Wysocki wrote:
> > > > > > On Thu, May 21, 2020 at 11:15 AM Dan Carpenter <dan.carpenter@oracle.com> wrote:
> > > > > > >
> > > > > > > On Thu, May 21, 2020 at 11:42:55AM +0800, dinghao.liu@zju.edu.cn wrote:
> > > > > > > > Hi, Dan,
> > > > > > > >
> > > > > > > > I agree the best solution is to fix __pm_runtime_resume(). But there are also
> > > > > > > > many cases that assume pm_runtime_get_sync() will change PM usage
> > > > > > > > counter on error. According to my static analysis results, the number of these
> > > > > > > > "right" cases are larger. Adjusting __pm_runtime_resume() directly will introduce
> > > > > > > > more new bugs. Therefore I think we should resolve the "bug" cases individually.
> > > > > > > >
> > > > > > >
> > > > > > > That's why I was saying that we may need to introduce a new replacement
> > > > > > > function for pm_runtime_get_sync() that works as expected.
> > > > > > >
> > > > > > > There is no reason why we have to live with the old behavior.
> > > > > >
> > > > > > What exactly do you mean by "the old behavior"?
> > > > >
> > > > > I'm suggesting we leave pm_runtime_get_sync() alone but we add a new
> > > > > function which called pm_runtime_get_sync_resume() which does something
> > > > > like this:
> > > > >
> > > > > static inline int pm_runtime_get_sync_resume(struct device *dev)
> > > > > {
> > > > >         int ret;
> > > > >
> > > > >         ret = __pm_runtime_resume(dev, RPM_GET_PUT);
> > > > >         if (ret < 0) {
> > > > >                 pm_runtime_put(dev);
> > > > >                 return ret;
> > > > >         }
> > > > >         return 0;
> > > > > }
> > > > >
> > > > > I'm not sure if pm_runtime_put() is the correct thing to do?  The other
> > > > > thing is that this always returns zero on success.  I don't know that
> > > > > drivers ever care to differentiate between one and zero returns.
> > > > >
> > > > > Then if any of the caller expect that behavior we update them to use the
> > > > > new function.
> > > >
> > > > Does that really have many benefits, though? I understand that this
> > > > would perhaps be easier to use because it is more in line with how other
> > > > functions operate. On the other hand, in some cases you may want to call
> > > > a different version of pm_runtime_put() on failure, as discussed in
> > > > other threads.
> > >
> > > I wasn't CC'd on the other threads so I don't know.  :/
> >
> > It was actually earlier in this thread, see here for example:
> >
> >       http://patchwork.ozlabs.org/project/linux-tegra/patch/20200520095148.10995-1-dinghao.liu@zju.edu.cn/#2438776
>
> I'm not seeing what you're talking about.
>
> The only thing I see in this thread is that we don't want to call
> pm_runtime_mark_last_busy(dev) which updates the last_busy time that is
> used for autosuspend.

That shouldn't be a problem, though, because if pm_runtime_get_sync()
returns an error, PM-runtime is not going to work for this device
until it is explicitly disabled for it and fixed up.

> The other thing that was discussed was pm_runtime_put_noidle() vs
> pm_runtime_put_autosuspend().  "The pm_runtime_put_noidle() should have
> the same effect as yours variant".  So apparently they are equivalent
> in this situation.  How should we choose one vs the other?

The point is that pm_runtime_put_noidle() is *sufficient* to drop the
reference and nothing more is needed in the error path.

So you can always do something like this:

ret = pm_runtime_get_sync(dev);
if (ret < 0) {
        pm_runtime_put_noidle(dev);
        return ret;
}

However, it would not be a bug to do something like this:

        ret = pm_runtime_get_sync(dev);
        if (ret < 0)
                goto rpm_put;

        ...

rpm_put:
        pm_runtime_put_autosuspend(dev);

> I'm not trying to be obtuse.  I understand that probably if I worked in
> PM then I wouldn't need documentation...  :/

So Documentation/power/runtime_pm.rst says this:

  `int pm_runtime_get_sync(struct device *dev);`
    - increment the device's usage counter, run pm_runtime_resume(dev) and
      return its result

In particular, it doesn't say "decrement the device's usage counter on
errors returned by pm_runtime_resume(dev)", so I'm not sure where that
expectation comes from.
diff mbox series

Patch

diff --git a/drivers/staging/media/tegra-vde/vde.c b/drivers/staging/media/tegra-vde/vde.c
index d3e63512a765..dd134a3a15c7 100644
--- a/drivers/staging/media/tegra-vde/vde.c
+++ b/drivers/staging/media/tegra-vde/vde.c
@@ -777,7 +777,7 @@  static int tegra_vde_ioctl_decode_h264(struct tegra_vde *vde,
 
 	ret = pm_runtime_get_sync(dev);
 	if (ret < 0)
-		goto unlock;
+		goto put_runtime_pm;
 
 	/*
 	 * We rely on the VDE registers reset value, otherwise VDE