mbox series

[v6,00/52] Introduce memory interconnect for NVIDIA Tegra SoCs

Message ID 20201025221735.3062-1-digetx@gmail.com
Headers show
Series Introduce memory interconnect for NVIDIA Tegra SoCs | expand

Message

Dmitry Osipenko Oct. 25, 2020, 10:16 p.m. UTC
Hello,

This series brings initial support for memory interconnect to Tegra20,
Tegra30 and Tegra124 SoCs.

For the starter only display controllers and devfreq devices are getting
interconnect API support, others could be supported later on. The display
controllers have the biggest demand for interconnect API right now because
dynamic memory frequency scaling can't be done safely without taking into
account bandwidth requirement from the displays. In particular this series
fixes distorted display output on T30 Ouya and T124 TK1 devices.

Changelog:

v6: - This series was massively reworked in comparison to v5, most of the
      patches that previously got r-b need a new round of a review (!).

    - Added missed clk-rounding to the set() callback of EMC ICC providers.
      Now clk_set_min_rate() doesn't error out on rate overflow.

    - Now peak bandwidth is properly taken into account by the set() callback
      of EMC ICC providers.

    - EMC runs at 2x of the DRAM bus only on Tegra20, this now taken in account
      properly by the EMC ICC set() callbacks.

    - ICC drivers use new icc_sync_state() and xlate_extended().

    - ICC drivers support new TEGRA_MC_ICC_TAG_ISO for ICC paths, which
      conveys to ICC driver that memory path uses isochronous transfers.

    - Added support for memory latency scaling to Tegra30 ICC provider.
      It's required for fixing display FIFO underflows on T30.

    - Added basic interconnect support to Tegra124 drivers.

    - Tegra20/30/124 EMC drivers now support voltage scaling using generic
      OPP API.

    - The display bandwidth management is reworked and improved. It now
      supports both bandwidth and latency allocations. The nv-display is
      now also taken into account properly, i.e. it's kept untouched.
      The extra bandwidth reservation required for ISO clients is moved
      from DC driver to the ICC drivers.

    - Dropped patch that tuned T20 display controller memory client because
      turned out that it kills ~30% of memory bandwidth. It should be possible
      to support client tuning, but it's too complicated for now.

    - Corrected display's cursor and winb-vfilter ICC clients.
      The winb-vfilter was erroneously used in place of cursor's client
      in device-trees.

    - Added devm_tegra_get_memory_controller() and switched drivers to
      use it.

    - Device-tree OPP tables are now supported by memory and devfreq
      drivers.

    - Tegra20-devfeq driver is reworked and now uses EMC-stats instead
      of IMC-stats (which are nearly identical modules) because previously
      I failed to understand how EMC-stats work and succeeded this time,
      thinking that it simply doesn't work. This removes a bit icky dependency
      on using both EMC and MC drivers simultaneously by the devfreq driver.

    - Tegra20-devfeq driver now is a sub-device of the EMC, it also now uses
      interconnect API for driving memory bandwidth.

    - Tegra30-devfreq got interconnect support.

    - Devfreq patches now use dev_err_probe(), which was suggested by
      Chanwoo Choi.

    - Added acks from Chanwoo Choi and Rob Herring to the reviewed and
      unchanged patches.

    - Added tested-by from Peter Geis and Nicolas Chauvet, who tested this
      series on Ouya and TK1 devices, reporting that it fixes display
      corruption on these devices which happened due to insufficient memory
      bandwidth.

    - Added patches to fix T20 EMC registers size.

    - Fixed missing LA entry for PTC in the Tegra MC drivers.

    - New and updated patches in v6:

        dt-bindings: memory: tegra20: emc: Correct registers range in example
        dt-bindings: memory: tegra20: emc: Document nvidia,memory-controller property
        dt-bindings: memory: tegra20: emc: Document OPP table and voltage regulator
        dt-bindings: memory: tegra20: emc: Document mfd-simple compatible and statistics sub-device
        dt-bindings: memory: tegra30: emc: Document OPP table and voltage regulator
        dt-bindings: memory: tegra124: mc: Document new interconnect property
        dt-bindings: memory: tegra124: emc: Document new interconnect property
        dt-bindings: memory: tegra124: emc: Document OPP table and voltage regulator
        dt-bindings: tegra30-actmon: Document OPP and interconnect properties
        dt-bindings: memory: tegra124: Add memory client IDs
        ARM: tegra: Correct EMC registers size in Tegra20 device-tree
        ARM: tegra: Add interconnect properties to Tegra124 device-tree
        ARM: tegra: Add nvidia,memory-controller phandle to Tegra20 EMC device-tree
        ARM: tegra: Add DVFS properties to Tegra20 EMC device-tree node
        ARM: tegra: Add DVFS properties to Tegra30 EMC and ACTMON device-tree nodes
        ARM: tegra: Add DVFS properties to Tegra124 EMC and ACTMON device-tree nodes
        memory: tegra: Add and use devm_tegra_get_memory_controller()
        memory: tegra-mc: Add interconnect framework
        memory: tegra20: Support interconnect framework
        memory: tegra20-emc: Skip parsing of emc-stats DT sub-node
        memory: tegra: Add missing latency allowness entry for Page Table Cache
        memory: tegra: Add FIFO sizes to Tegra30 memory clients
        memory: tegra30: Support interconnect framework
        memory: tegra124-emc: Make driver modular
        memory: tegra124: Support interconnect framework
        memory: tegra: Remove superfluous error messages around platform_get_irq()
        drm/tegra: dc: Support memory bandwidth management
        drm/tegra: dc: Extend debug stats with total number of events
        PM / devfreq: tegra20: Convert to EMC_STAT driver, support interconnect and device-tree
        PM / devfreq: tegra30: Support interconnect and OPPs from device-tree
        PM / devfreq: tegra30: Separate configurations per-SoC generation
        opp: Put interconnect paths outside of opp_table_lock

v5: - The devfreq drivers now won't probe if memory timings aren't specified
      in a device-tree, like was suggested by Chanwoo Choi in a review comment
      to v4. Initially I wanted to always probe the driver even with a single
      fixed memory freq, but after a closer look turned out it can't be done
      easily for Tegra20 driver.

    - The "interconnect: Relax requirement in of_icc_get_from_provider()"
      patch was already applied, hence one less patch in comparison to v4.

    - Renamed display interconnect paths in accordance to the names that
      were used by Thierry Reding in one of his recent patches that supposed
      to update the Host1x's DT binding.

    - Added acks from Chanwoo Choi.

    - Added clarifying comment to tegra_mc_icc_set() about why it's a dummy
      function, this is done in a response to the review comment made by
      Georgi Djakov to v4.

v4: - All drivers that use interconnect API now select it in the Kconfig in
      order to properly express the build dependency.

    - The IS_ENABLED(CONFIG_INTERCONNECT) is dropped now from all patches.

    - Added MODULE_AUTHOR() to the modularized drivers, for completeness.

    - Added missed TEGRA_MC Kconfig dependency for the Tegra20 EMC driver.

    - Added more acks from Rob Herring that I accidentally missed to add in v3.

v3: - Added acks from Rob Herring that were given to some of the v2 patches.

    - Specified name of the TRM documentation chapter in the patch
      "dt-bindings: host1x: Document new interconnect properties", which was
      suggested by Rob Herring in the review comment to v2.

    - Added patches that allow EMC drivers to be compiled as a loadable kernel
      modules. This came up during of the v2 review when Georgi Djakov pointed
      out that interconnect-core could be compiled as a kernel module. Please
      note that the Tegra124 EMC driver is compile-tested only, I don't have
      Tegra124 HW.

    - In the review comment to [1] Stephen Boyd suggested that it will be
      better not to make changes to clk API, which was needed in order to
      avoid clashing of the interconnect driver with the devfreq in regards
      to memory clk-rate rounding.

      [1] https://patchwork.ozlabs.org/project/linux-tegra/patch/20200330231617.17079-3-digetx@gmail.com/

      Stephen Boyd suggested that instead we should provide OPP table via DT.
      I tried to investigate whether this could be done and turned out
      it's a bit complicated. Technically it should be doable, but:

        1. For now we don't fully support voltage scaling of the CORE regulator
           and so OPP table in the DT isn't really needed today. We can
           generate table from the memory timings, which is what Tegra devfreq
           drivers already do.

        2. The OPP table should be defined in the DT for the Memory Controller
           node and then its usage somehow should be shared by both interconnect
           and devfreq drivers. It's not obvious what's the best way to do it.

      So, it will be much better to postpone the DT OPP table addition
      until these questions are resolved. We can infer OPPs from the
      memory timings and we could get the memory rates from the memory
      driver directly, avoiding the problems induced by the clk API usage.
      This idea is implemented in v3, see these patches:

        PM / devfreq: tegra20: Use MC timings for building OPP table
        PM / devfreq: tegra30: Use MC timings for building OPP table

v2: - Instead of a single dma-mem interconnect path, the paths are now
      defined per memory client.

    - The EMC provider now uses #interconnect-cells=<0>.

    - Dropped Tegra124 because there is no enough information about how to
      properly calculate required EMC clock rate for it and I don't have
      hardware for testing. Somebody else will have to work on it.

    - Moved interconnect providers code into drivers/memory/tegra/*.

    - Added "Create tegra20-devfreq device" patch because interconnect
      is not very usable without the devfreq memory auto-scaling since
      memory freq will be fixed to the display's requirement.

Dmitry Osipenko (52):
  clk: tegra: Export Tegra20 EMC kernel symbols
  soc/tegra: fuse: Export tegra_read_ram_code()
  dt-bindings: memory: tegra20: emc: Correct registers range in example
  dt-bindings: memory: tegra20: emc: Document nvidia,memory-controller
    property
  dt-bindings: memory: tegra20: mc: Document new interconnect property
  dt-bindings: memory: tegra20: emc: Document new interconnect property
  dt-bindings: memory: tegra20: emc: Document OPP table and voltage
    regulator
  dt-bindings: memory: tegra20: emc: Document mfd-simple compatible and
    statistics sub-device
  dt-bindings: memory: tegra30: mc: Document new interconnect property
  dt-bindings: memory: tegra30: emc: Document new interconnect property
  dt-bindings: memory: tegra30: emc: Document OPP table and voltage
    regulator
  dt-bindings: memory: tegra124: mc: Document new interconnect property
  dt-bindings: memory: tegra124: emc: Document new interconnect property
  dt-bindings: memory: tegra124: emc: Document OPP table and voltage
    regulator
  dt-bindings: tegra30-actmon: Document OPP and interconnect properties
  dt-bindings: host1x: Document new interconnect properties
  dt-bindings: memory: tegra20: Add memory client IDs
  dt-bindings: memory: tegra30: Add memory client IDs
  dt-bindings: memory: tegra124: Add memory client IDs
  ARM: tegra: Correct EMC registers size in Tegra20 device-tree
  ARM: tegra: Add interconnect properties to Tegra20 device-tree
  ARM: tegra: Add interconnect properties to Tegra30 device-tree
  ARM: tegra: Add interconnect properties to Tegra124 device-tree
  ARM: tegra: Add nvidia,memory-controller phandle to Tegra20 EMC
    device-tree
  ARM: tegra: Add DVFS properties to Tegra20 EMC device-tree node
  ARM: tegra: Add DVFS properties to Tegra30 EMC and ACTMON device-tree
    nodes
  ARM: tegra: Add DVFS properties to Tegra124 EMC and ACTMON device-tree
    nodes
  memory: tegra: Add and use devm_tegra_get_memory_controller()
  memory: tegra-mc: Add interconnect framework
  memory: tegra20-emc: Make driver modular
  memory: tegra20-emc: Use devm_platform_ioremap_resource()
  memory: tegra20-emc: Continue probing if timings are missing in
    device-tree
  memory: tegra20: Support interconnect framework
  memory: tegra20-emc: Don't parse emc-stats node
  memory: tegra: Add missing latency allowness entry for Page Table
    Cache
  memory: tegra: Add FIFO sizes to Tegra30 memory clients
  memory: tegra30-emc: Make driver modular
  memory: tegra30-emc: Continue probing if timings are missing in
    device-tree
  memory: tegra30: Support interconnect framework
  memory: tegra124-emc: Make driver modular
  memory: tegra124-emc: Use devm_platform_ioremap_resource()
  memory: tegra124: Support interconnect framework
  memory: tegra: Remove superfluous error messages around
    platform_get_irq()
  drm/tegra: dc: Support memory bandwidth management
  drm/tegra: dc: Extend debug stats with total number of events
  opp: Put interconnect paths outside of opp_table_lock
  PM / devfreq: tegra20: Silence deferred probe error
  PM / devfreq: tegra20: Relax Kconfig dependency
  PM / devfreq: tegra20: Convert to EMC_STAT driver, support
    interconnect and device-tree
  PM / devfreq: tegra30: Silence deferred probe error
  PM / devfreq: tegra30: Support interconnect and OPPs from device-tree
  PM / devfreq: tegra30: Separate configurations per-SoC generation

 .../arm/tegra/nvidia,tegra30-actmon.txt       |  25 ++
 .../display/tegra/nvidia,tegra20-host1x.txt   |  68 +++
 .../nvidia,tegra124-emc.yaml                  |  18 +
 .../nvidia,tegra124-mc.yaml                   |   5 +
 .../memory-controllers/nvidia,tegra20-emc.txt |  63 ++-
 .../memory-controllers/nvidia,tegra20-mc.txt  |   3 +
 .../nvidia,tegra30-emc.yaml                   |  18 +
 .../memory-controllers/nvidia,tegra30-mc.yaml |   5 +
 arch/arm/boot/dts/tegra124-apalis-emc.dtsi    |   8 +
 .../arm/boot/dts/tegra124-jetson-tk1-emc.dtsi |   8 +
 arch/arm/boot/dts/tegra124-nyan-big-emc.dtsi  |  10 +
 .../boot/dts/tegra124-peripherals-opp.dtsi    | 419 ++++++++++++++++++
 arch/arm/boot/dts/tegra124.dtsi               |  31 ++
 .../boot/dts/tegra20-acer-a500-picasso.dts    |  12 +
 arch/arm/boot/dts/tegra20-colibri.dtsi        |   8 +
 arch/arm/boot/dts/tegra20-paz00.dts           |  10 +
 .../arm/boot/dts/tegra20-peripherals-opp.dtsi | 181 ++++++++
 arch/arm/boot/dts/tegra20.dtsi                |  42 +-
 .../tegra30-asus-nexus7-grouper-common.dtsi   |  16 +
 .../arm/boot/dts/tegra30-peripherals-opp.dtsi | 383 ++++++++++++++++
 arch/arm/boot/dts/tegra30.dtsi                |  33 +-
 drivers/clk/tegra/Makefile                    |   2 +-
 drivers/clk/tegra/clk-tegra124-emc.c          |  41 +-
 drivers/clk/tegra/clk-tegra124.c              |  27 +-
 drivers/clk/tegra/clk-tegra20-emc.c           |   3 +
 drivers/clk/tegra/clk.h                       |  16 +-
 drivers/devfreq/Kconfig                       |   3 +-
 drivers/devfreq/tegra20-devfreq.c             | 186 ++++----
 drivers/devfreq/tegra30-devfreq.c             | 166 ++++---
 drivers/gpu/drm/tegra/Kconfig                 |   1 +
 drivers/gpu/drm/tegra/dc.c                    | 340 ++++++++++++++
 drivers/gpu/drm/tegra/dc.h                    |  11 +
 drivers/gpu/drm/tegra/drm.c                   |  14 +
 drivers/gpu/drm/tegra/hub.c                   |   3 +
 drivers/gpu/drm/tegra/plane.c                 | 122 +++++
 drivers/gpu/drm/tegra/plane.h                 |  15 +
 drivers/memory/tegra/Kconfig                  |  12 +-
 drivers/memory/tegra/mc.c                     | 184 +++++++-
 drivers/memory/tegra/mc.h                     |  20 +
 drivers/memory/tegra/tegra114.c               |   6 +
 drivers/memory/tegra/tegra124-emc.c           | 235 ++++++++--
 drivers/memory/tegra/tegra124.c               |  37 ++
 drivers/memory/tegra/tegra20-emc.c            | 244 ++++++++--
 drivers/memory/tegra/tegra20.c                |  34 ++
 drivers/memory/tegra/tegra210-emc-core.c      |  39 +-
 drivers/memory/tegra/tegra30-emc.c            | 245 ++++++++--
 drivers/memory/tegra/tegra30.c                | 191 ++++++++
 drivers/opp/core.c                            |  21 +-
 drivers/soc/tegra/fuse/tegra-apbmisc.c        |   2 +
 include/dt-bindings/memory/tegra124-mc.h      |  68 +++
 include/dt-bindings/memory/tegra20-mc.h       |  53 +++
 include/dt-bindings/memory/tegra30-mc.h       |  67 +++
 include/linux/clk/tegra.h                     |   9 +
 include/soc/tegra/emc.h                       |  16 -
 include/soc/tegra/mc.h                        |  26 ++
 55 files changed, 3477 insertions(+), 348 deletions(-)
 create mode 100644 arch/arm/boot/dts/tegra124-peripherals-opp.dtsi
 create mode 100644 arch/arm/boot/dts/tegra20-peripherals-opp.dtsi
 create mode 100644 arch/arm/boot/dts/tegra30-peripherals-opp.dtsi
 delete mode 100644 include/soc/tegra/emc.h

Comments

Chanwoo Choi Oct. 26, 2020, 3:16 a.m. UTC | #1
On 10/26/20 7:17 AM, Dmitry Osipenko wrote:
> Tegra EMC driver was turned into a regular kernel driver, meaning that it
> could be compiled as a loadable kernel module now. Hence EMC clock isn't
> guaranteed to be available and clk_get("emc") may return -EPROBE_DEFER.
> Let's silence the deferred probe error.
> 
> Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/tegra20-devfreq.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/devfreq/tegra20-devfreq.c b/drivers/devfreq/tegra20-devfreq.c
> index ff82bac9ee4e..fd801534771d 100644
> --- a/drivers/devfreq/tegra20-devfreq.c
> +++ b/drivers/devfreq/tegra20-devfreq.c
> @@ -141,11 +141,9 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>  
>  	/* EMC is a system-critical clock that is always enabled */
>  	tegra->emc_clock = devm_clk_get(&pdev->dev, "emc");
> -	if (IS_ERR(tegra->emc_clock)) {
> -		err = PTR_ERR(tegra->emc_clock);
> -		dev_err(&pdev->dev, "failed to get emc clock: %d\n", err);
> -		return err;
> -	}
> +	if (IS_ERR(tegra->emc_clock))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(tegra->emc_clock),
> +				     "failed to get emc clock\n");
>  
>  	tegra->regs = mc->regs;
>  
> 

Applied it. Thanks.
Chanwoo Choi Oct. 26, 2020, 3:17 a.m. UTC | #2
On 10/26/20 7:17 AM, Dmitry Osipenko wrote:
> Tegra EMC driver was turned into a regular kernel driver, meaning that it
> could be compiled as a loadable kernel module now. Hence EMC clock isn't
> guaranteed to be available and clk_get("emc") may return -EPROBE_DEFER.
> Let's silence the deferred probe error.
> 
> Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/tegra30-devfreq.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> index f5e74c2ede85..3f732ab53573 100644
> --- a/drivers/devfreq/tegra30-devfreq.c
> +++ b/drivers/devfreq/tegra30-devfreq.c
> @@ -801,10 +801,9 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>  	}
>  
>  	tegra->emc_clock = devm_clk_get(&pdev->dev, "emc");
> -	if (IS_ERR(tegra->emc_clock)) {
> -		dev_err(&pdev->dev, "Failed to get emc clock\n");
> -		return PTR_ERR(tegra->emc_clock);
> -	}
> +	if (IS_ERR(tegra->emc_clock))
> +		return dev_err_probe(&pdev->dev, PTR_ERR(tegra->emc_clock),
> +				     "Failed to get emc clock\n");
>  
>  	err = platform_get_irq(pdev, 0);
>  	if (err < 0)
> 

Applied it. Thanks.
Chanwoo Choi Oct. 26, 2020, 3:18 a.m. UTC | #3
On 10/26/20 7:17 AM, Dmitry Osipenko wrote:
> The Tegra EMC driver now could be compiled as a loadable kernel module.
> Currently devfreq driver depends on the EMC/MC drivers in Kconfig, and
> thus, devfreq is forced to be a kernel module if EMC is compiled as a
> module. This build dependency could be relaxed since devfreq driver
> checks MC/EMC presence on probe, allowing kernel configuration where
> devfreq is a built-in driver and EMC driver is a loadable module.
> This change puts Tegra20 devfreq Kconfig entry on a par with the Tegra30
> devfreq entry.
> 
> Acked-by: Chanwoo Choi <cw00.choi@samsung.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 37dc40d1fcfb..0ee36ae2fa79 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -123,7 +123,7 @@ config ARM_TEGRA_DEVFREQ
>  
>  config ARM_TEGRA20_DEVFREQ
>  	tristate "NVIDIA Tegra20 DEVFREQ Driver"
> -	depends on (TEGRA_MC && TEGRA20_EMC) || COMPILE_TEST
> +	depends on ARCH_TEGRA_2x_SOC || COMPILE_TEST
>  	depends on COMMON_CLK
>  	select DEVFREQ_GOV_SIMPLE_ONDEMAND
>  	help
> 

Applied it.
Krzysztof Kozlowski Oct. 26, 2020, 3:08 p.m. UTC | #4
On Mon, Oct 26, 2020 at 01:16:43AM +0300, Dmitry Osipenko wrote:
> Hello,
> 
> This series brings initial support for memory interconnect to Tegra20,
> Tegra30 and Tegra124 SoCs.
> 
> For the starter only display controllers and devfreq devices are getting
> interconnect API support, others could be supported later on. The display
> controllers have the biggest demand for interconnect API right now because
> dynamic memory frequency scaling can't be done safely without taking into
> account bandwidth requirement from the displays. In particular this series
> fixes distorted display output on T30 Ouya and T124 TK1 devices.

Hi,

You introduced in v6 multiple new patches. Could you describe the
dependencies, if any?

Best regards,
Krzysztof

> Changelog:
> 
> v6: - This series was massively reworked in comparison to v5, most of the
>       patches that previously got r-b need a new round of a review (!).
> 
>     - Added missed clk-rounding to the set() callback of EMC ICC providers.
>       Now clk_set_min_rate() doesn't error out on rate overflow.
> 
>     - Now peak bandwidth is properly taken into account by the set() callback
>       of EMC ICC providers.
> 
>     - EMC runs at 2x of the DRAM bus only on Tegra20, this now taken in account
>       properly by the EMC ICC set() callbacks.
> 
>     - ICC drivers use new icc_sync_state() and xlate_extended().
> 
>     - ICC drivers support new TEGRA_MC_ICC_TAG_ISO for ICC paths, which
>       conveys to ICC driver that memory path uses isochronous transfers.
> 
>     - Added support for memory latency scaling to Tegra30 ICC provider.
>       It's required for fixing display FIFO underflows on T30.
> 
>     - Added basic interconnect support to Tegra124 drivers.
> 
>     - Tegra20/30/124 EMC drivers now support voltage scaling using generic
>       OPP API.
> 
>     - The display bandwidth management is reworked and improved. It now
>       supports both bandwidth and latency allocations. The nv-display is
>       now also taken into account properly, i.e. it's kept untouched.
>       The extra bandwidth reservation required for ISO clients is moved
>       from DC driver to the ICC drivers.
> 
>     - Dropped patch that tuned T20 display controller memory client because
>       turned out that it kills ~30% of memory bandwidth. It should be possible
>       to support client tuning, but it's too complicated for now.
> 
>     - Corrected display's cursor and winb-vfilter ICC clients.
>       The winb-vfilter was erroneously used in place of cursor's client
>       in device-trees.
> 
>     - Added devm_tegra_get_memory_controller() and switched drivers to
>       use it.
> 
>     - Device-tree OPP tables are now supported by memory and devfreq
>       drivers.
> 
>     - Tegra20-devfeq driver is reworked and now uses EMC-stats instead
>       of IMC-stats (which are nearly identical modules) because previously
>       I failed to understand how EMC-stats work and succeeded this time,
>       thinking that it simply doesn't work. This removes a bit icky dependency
>       on using both EMC and MC drivers simultaneously by the devfreq driver.
> 
>     - Tegra20-devfeq driver now is a sub-device of the EMC, it also now uses
>       interconnect API for driving memory bandwidth.
> 
>     - Tegra30-devfreq got interconnect support.
> 
>     - Devfreq patches now use dev_err_probe(), which was suggested by
>       Chanwoo Choi.
> 
>     - Added acks from Chanwoo Choi and Rob Herring to the reviewed and
>       unchanged patches.
> 
>     - Added tested-by from Peter Geis and Nicolas Chauvet, who tested this
>       series on Ouya and TK1 devices, reporting that it fixes display
>       corruption on these devices which happened due to insufficient memory
>       bandwidth.
> 
>     - Added patches to fix T20 EMC registers size.
> 
>     - Fixed missing LA entry for PTC in the Tegra MC drivers.
> 
>     - New and updated patches in v6:
> 
>         dt-bindings: memory: tegra20: emc: Correct registers range in example
>         dt-bindings: memory: tegra20: emc: Document nvidia,memory-controller property
>         dt-bindings: memory: tegra20: emc: Document OPP table and voltage regulator
>         dt-bindings: memory: tegra20: emc: Document mfd-simple compatible and statistics sub-device
>         dt-bindings: memory: tegra30: emc: Document OPP table and voltage regulator
>         dt-bindings: memory: tegra124: mc: Document new interconnect property
>         dt-bindings: memory: tegra124: emc: Document new interconnect property
>         dt-bindings: memory: tegra124: emc: Document OPP table and voltage regulator
>         dt-bindings: tegra30-actmon: Document OPP and interconnect properties
>         dt-bindings: memory: tegra124: Add memory client IDs
>         ARM: tegra: Correct EMC registers size in Tegra20 device-tree
>         ARM: tegra: Add interconnect properties to Tegra124 device-tree
>         ARM: tegra: Add nvidia,memory-controller phandle to Tegra20 EMC device-tree
>         ARM: tegra: Add DVFS properties to Tegra20 EMC device-tree node
>         ARM: tegra: Add DVFS properties to Tegra30 EMC and ACTMON device-tree nodes
>         ARM: tegra: Add DVFS properties to Tegra124 EMC and ACTMON device-tree nodes
>         memory: tegra: Add and use devm_tegra_get_memory_controller()
>         memory: tegra-mc: Add interconnect framework
>         memory: tegra20: Support interconnect framework
>         memory: tegra20-emc: Skip parsing of emc-stats DT sub-node
>         memory: tegra: Add missing latency allowness entry for Page Table Cache
>         memory: tegra: Add FIFO sizes to Tegra30 memory clients
>         memory: tegra30: Support interconnect framework
>         memory: tegra124-emc: Make driver modular
>         memory: tegra124: Support interconnect framework
>         memory: tegra: Remove superfluous error messages around platform_get_irq()
>         drm/tegra: dc: Support memory bandwidth management
>         drm/tegra: dc: Extend debug stats with total number of events
>         PM / devfreq: tegra20: Convert to EMC_STAT driver, support interconnect and device-tree
>         PM / devfreq: tegra30: Support interconnect and OPPs from device-tree
>         PM / devfreq: tegra30: Separate configurations per-SoC generation
>         opp: Put interconnect paths outside of opp_table_lock
Dmitry Osipenko Oct. 26, 2020, 7:14 p.m. UTC | #5
26.10.2020 18:08, Krzysztof Kozlowski пишет:
> On Mon, Oct 26, 2020 at 01:16:43AM +0300, Dmitry Osipenko wrote:
>> Hello,
>>
>> This series brings initial support for memory interconnect to Tegra20,
>> Tegra30 and Tegra124 SoCs.
>>
>> For the starter only display controllers and devfreq devices are getting
>> interconnect API support, others could be supported later on. The display
>> controllers have the biggest demand for interconnect API right now because
>> dynamic memory frequency scaling can't be done safely without taking into
>> account bandwidth requirement from the displays. In particular this series
>> fixes distorted display output on T30 Ouya and T124 TK1 devices.
> 
> Hi,
> 
> You introduced in v6 multiple new patches. Could you describe the
> dependencies, if any?

Hello,

The v6 dropped some older patches and replaced them with the new
patches. Previously I wanted to postpone the more complex changes for
later times, like supporting OPP tables and DVFS, but then the review
started to take more time than was expected and there was enough time to
complete those features.

There are five basic sets of patches:

	1. DT bindings
	2. DT changes
	3. SoC, clk and memory
	4. devfreq
	5. DRM

Each set could be applied separately.

Memory patches have a build dependency on the SoC and clk patches.

The "tegra-mc: Add interconnect framework" and "Add and use
devm_tegra_get_memory_controller()" are the root build dependencies for
all memory/ patches. Other patches are grouped per SoC generation
(Tegra20/30/124), patches within a SoC-gen group are interdependent.

I suppose the best variant would be to merge the whole series via
tegra-tree in order to preserve logical order of the patches. Although,
merging each set of patches separately also should be okay to do.
Krzysztof Kozlowski Oct. 27, 2020, 8:52 a.m. UTC | #6
On Mon, Oct 26, 2020 at 10:14:10PM +0300, Dmitry Osipenko wrote:
> 26.10.2020 18:08, Krzysztof Kozlowski пишет:
> > On Mon, Oct 26, 2020 at 01:16:43AM +0300, Dmitry Osipenko wrote:
> >> Hello,
> >>
> >> This series brings initial support for memory interconnect to Tegra20,
> >> Tegra30 and Tegra124 SoCs.
> >>
> >> For the starter only display controllers and devfreq devices are getting
> >> interconnect API support, others could be supported later on. The display
> >> controllers have the biggest demand for interconnect API right now because
> >> dynamic memory frequency scaling can't be done safely without taking into
> >> account bandwidth requirement from the displays. In particular this series
> >> fixes distorted display output on T30 Ouya and T124 TK1 devices.
> > 
> > Hi,
> > 
> > You introduced in v6 multiple new patches. Could you describe the
> > dependencies, if any?
> 
> Hello,
> 
> The v6 dropped some older patches and replaced them with the new
> patches. Previously I wanted to postpone the more complex changes for
> later times, like supporting OPP tables and DVFS, but then the review
> started to take more time than was expected and there was enough time to
> complete those features.
> 
> There are five basic sets of patches:
> 
> 	1. DT bindings
> 	2. DT changes
> 	3. SoC, clk and memory
> 	4. devfreq
> 	5. DRM
> 
> Each set could be applied separately.
> 
> Memory patches have a build dependency on the SoC and clk patches.
> 
> The "tegra-mc: Add interconnect framework" and "Add and use
> devm_tegra_get_memory_controller()" are the root build dependencies for
> all memory/ patches. Other patches are grouped per SoC generation
> (Tegra20/30/124), patches within a SoC-gen group are interdependent.
> 
> I suppose the best variant would be to merge the whole series via
> tegra-tree in order to preserve logical order of the patches. Although,
> merging each set of patches separately also should be okay to do.

Thanks for explanation. I already have three patches for Tegra MC (and
probably two more will be respun) so this might be conflict-prone. The
easiest in such case is to provide me soc and clk patches on the branch,
so I could keep all Tegra MC together.

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 9:10 a.m. UTC | #7
On Mon, Oct 26, 2020 at 01:17:03AM +0300, Dmitry Osipenko wrote:
> The Tegra20 EMC registers size should be twice bigger. This patch fixes
> the size.

Don't use "This patch" (this appears here). Better to use:
"Fix the size of ..." or just "The size should be twice bigger" as it is
obvious that you fix it.

https://elixir.bootlin.com/linux/latest/source/Documentation/process/submitting-patches.rst#L151

Acked-by: Krzysztof Kozlowski <krzk@kernel.org>

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 9:12 a.m. UTC | #8
On Mon, Oct 26, 2020 at 01:17:04AM +0300, Dmitry Osipenko wrote:
> Add interconnect properties to the Memory Controller, External Memory
> Controller and the Display Controller nodes in order to describe hardware
> interconnection.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  arch/arm/boot/dts/tegra20.dtsi | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> index 9347f7789245..2e1304493f7d 100644
> --- a/arch/arm/boot/dts/tegra20.dtsi
> +++ b/arch/arm/boot/dts/tegra20.dtsi
> @@ -111,6 +111,17 @@ dc@54200000 {
>  
>  			nvidia,head = <0>;
>  
> +			interconnects = <&mc TEGRA20_MC_DISPLAY0A &emc>,

I think you just added the defines and did not include them here, so
this should not even build. Did you test it?

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 9:15 a.m. UTC | #9
On Mon, Oct 26, 2020 at 01:17:05AM +0300, Dmitry Osipenko wrote:
> Add interconnect properties to the Memory Controller, External Memory
> Controller and the Display Controller nodes in order to describe hardware
> interconnection.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  arch/arm/boot/dts/tegra30.dtsi | 27 ++++++++++++++++++++++++++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/tegra30.dtsi b/arch/arm/boot/dts/tegra30.dtsi
> index aeae8c092d41..2caf6cc6f4b1 100644
> --- a/arch/arm/boot/dts/tegra30.dtsi
> +++ b/arch/arm/boot/dts/tegra30.dtsi
> @@ -210,6 +210,17 @@ dc@54200000 {
>  
>  			nvidia,head = <0>;
>  
> +			interconnects = <&mc TEGRA30_MC_DISPLAY0A &emc>,

Does not compile.

> +					<&mc TEGRA30_MC_DISPLAY0B &emc>,
> +					<&mc TEGRA30_MC_DISPLAY1B &emc>,
> +					<&mc TEGRA30_MC_DISPLAY0C &emc>,
> +					<&mc TEGRA30_MC_DISPLAYHC &emc>;
> +			interconnect-names = "wina",
> +					     "winb",
> +					     "winb-vfilter",
> +					     "winc",
> +					     "cursor";
> +
>  			rgb {
>  				status = "disabled";
>  			};
> @@ -229,6 +240,17 @@ dc@54240000 {
>  
>  			nvidia,head = <1>;
>  
> +			interconnects = <&mc TEGRA30_MC_DISPLAY0AB &emc>,
> +					<&mc TEGRA30_MC_DISPLAY0BB &emc>,
> +					<&mc TEGRA30_MC_DISPLAY1BB &emc>,
> +					<&mc TEGRA30_MC_DISPLAY0CB &emc>,
> +					<&mc TEGRA30_MC_DISPLAYHCB &emc>;
> +			interconnect-names = "wina",
> +					     "winb",
> +					     "winb-vfilter",
> +					     "winc",
> +					     "cursor";
> +
>  			rgb {
>  				status = "disabled";
>  			};
> @@ -748,15 +770,18 @@ mc: memory-controller@7000f000 {
>  
>  		#iommu-cells = <1>;
>  		#reset-cells = <1>;
> +		#interconnect-cells = <1>;
>  	};
>  
> -	memory-controller@7000f400 {
> +	emc: memory-controller@7000f400 {
>  		compatible = "nvidia,tegra30-emc";
>  		reg = <0x7000f400 0x400>;
>  		interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>;
>  		clocks = <&tegra_car TEGRA30_CLK_EMC>;
>  
>  		nvidia,memory-controller = <&mc>;
> +

No need for blank line.

Best regards,
Krzysztof

> +		#interconnect-cells = <0>;
>  	};
>  
>  	fuse@7000f800 {
> -- 
> 2.27.0
>
Krzysztof Kozlowski Oct. 27, 2020, 9:16 a.m. UTC | #10
On Mon, Oct 26, 2020 at 01:17:06AM +0300, Dmitry Osipenko wrote:
> Add interconnect properties to the Memory Controller, External Memory
> Controller and the Display Controller nodes in order to describe hardware
> interconnection.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  arch/arm/boot/dts/tegra124.dtsi | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
> 
> diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi
> index 64f488ba1e72..1801e30b1d3a 100644
> --- a/arch/arm/boot/dts/tegra124.dtsi
> +++ b/arch/arm/boot/dts/tegra124.dtsi
> @@ -113,6 +113,19 @@ dc@54200000 {
>  			iommus = <&mc TEGRA_SWGROUP_DC>;
>  
>  			nvidia,head = <0>;
> +
> +			interconnects = <&mc TEGRA124_MC_DISPLAY0A &emc>,

This does not compile.

> +					<&mc TEGRA124_MC_DISPLAY0B &emc>,
> +					<&mc TEGRA124_MC_DISPLAY0C &emc>,
> +					<&mc TEGRA124_MC_DISPLAYHC &emc>,
> +					<&mc TEGRA124_MC_DISPLAYD &emc>,
> +					<&mc TEGRA124_MC_DISPLAYT &emc>;
> +			interconnect-names = "wina",
> +					     "winb",
> +					     "winc",
> +					     "cursor",
> +					     "wind",
> +					     "wint";
>  		};
>  
>  		dc@54240000 {
> @@ -127,6 +140,15 @@ dc@54240000 {
>  			iommus = <&mc TEGRA_SWGROUP_DCB>;
>  
>  			nvidia,head = <1>;
> +
> +			interconnects = <&mc TEGRA124_MC_DISPLAY0AB &emc>,
> +					<&mc TEGRA124_MC_DISPLAY0BB &emc>,
> +					<&mc TEGRA124_MC_DISPLAY0CB &emc>,
> +					<&mc TEGRA124_MC_DISPLAYHCB &emc>;
> +			interconnect-names = "wina",
> +					     "winb",
> +					     "winc",
> +					     "cursor";
>  		};
>  
>  		hdmi: hdmi@54280000 {
> @@ -628,6 +650,7 @@ mc: memory-controller@70019000 {
>  
>  		#iommu-cells = <1>;
>  		#reset-cells = <1>;
> +		#interconnect-cells = <1>;
>  	};
>  
>  	emc: external-memory-controller@7001b000 {
> @@ -637,6 +660,8 @@ emc: external-memory-controller@7001b000 {
>  		clock-names = "emc";
>  
>  		nvidia,memory-controller = <&mc>;
> +

No need for blank line.

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 9:18 a.m. UTC | #11
On Mon, Oct 26, 2020 at 01:17:08AM +0300, Dmitry Osipenko wrote:
> Add EMC OPP DVFS/DFS tables and emc-stats subdev that will be used for
> dynamic memory bandwidth scaling, while EMC itself will perform voltage
> scaling. Update board device-trees with optional EMC core supply and
> remove unsupported OPPs.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  .../boot/dts/tegra20-acer-a500-picasso.dts    |  12 ++
>  arch/arm/boot/dts/tegra20-colibri.dtsi        |   8 +
>  arch/arm/boot/dts/tegra20-paz00.dts           |  10 +
>  .../arm/boot/dts/tegra20-peripherals-opp.dtsi | 181 ++++++++++++++++++
>  arch/arm/boot/dts/tegra20.dtsi                |  12 +-
>  5 files changed, 222 insertions(+), 1 deletion(-)
>  create mode 100644 arch/arm/boot/dts/tegra20-peripherals-opp.dtsi
> 
> diff --git a/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts b/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
> index a0b829738e8f..f5c1591c8ea8 100644
> --- a/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
> +++ b/arch/arm/boot/dts/tegra20-acer-a500-picasso.dts
> @@ -1058,9 +1058,21 @@ map0 {
>  		};
>  	};
>  
> +	emc_opp_table0 {

All node names with hyphens -. Not underscores.

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 9:42 a.m. UTC | #12
On Mon, Oct 26, 2020 at 01:17:11AM +0300, Dmitry Osipenko wrote:
> Multiple Tegra drivers need to retrieve Memory Controller and there is
> duplication of the retrieval code among the drivers. This patch removes
> the duplication and fixes put_device() which was missed in the duplicated
> code.
> 
> EMC drivers now use new common devm_tegra_get_memory_controller() helper
> instead of opencoding the MC retrieval.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/mc.c                | 48 ++++++++++++++++++++++++
>  drivers/memory/tegra/tegra124-emc.c      | 18 ++-------
>  drivers/memory/tegra/tegra210-emc-core.c | 39 +++++--------------
>  drivers/memory/tegra/tegra30-emc.c       | 18 ++-------
>  include/soc/tegra/mc.h                   | 10 +++++
>  5 files changed, 74 insertions(+), 59 deletions(-)
> 
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index ec8403557ed4..12ea2c79205a 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -42,6 +42,54 @@ static const struct of_device_id tegra_mc_of_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
>  
> +static void tegra_mc_devm_action_put_device(void *data)
> +{
> +	struct tegra_mc *mc = data;
> +
> +	put_device(mc->dev);
> +}
> +
> +/**
> + * devm_tegra_get_memory_controller() - get Tegra Memory Controller handle
> + * @dev: device pointer for the consumer device
> + *
> + * This function will search for the Memory Controller node in a device-tree
> + * and retrieve the Memory Controller handle.
> + *
> + * Return: ERR_PTR() on error or a valid pointer to a struct tegra_mc.
> + */
> +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)

Usually 'get' is a suffix (for example in clk, gpiod, iio, led), so:
devm_tegra_memory_controller_get()

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 10:09 a.m. UTC | #13
On Mon, Oct 26, 2020 at 01:17:16AM +0300, Dmitry Osipenko wrote:
> Now Internal and External Memory Controllers are memory interconnection
> providers. This allows us to use interconnect API for tuning of memory
> configuration. EMC driver now supports OPPs and DVFS.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/Kconfig       |   3 +-
>  drivers/memory/tegra/mc.h          |  12 ++
>  drivers/memory/tegra/tegra20-emc.c | 176 +++++++++++++++++++++++++++++
>  drivers/memory/tegra/tegra20.c     |  34 ++++++
>  4 files changed, 224 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig
> index ff426747cd7d..ac3dfe155505 100644
> --- a/drivers/memory/tegra/Kconfig
> +++ b/drivers/memory/tegra/Kconfig
> @@ -11,7 +11,8 @@ config TEGRA_MC
>  config TEGRA20_EMC
>  	tristate "NVIDIA Tegra20 External Memory Controller driver"
>  	default y
> -	depends on ARCH_TEGRA_2x_SOC
> +	depends on TEGRA_MC && ARCH_TEGRA_2x_SOC
> +	select PM_OPP
>  	help
>  	  This driver is for the External Memory Controller (EMC) found on
>  	  Tegra20 chips. The EMC controls the external DRAM on the board.
> diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
> index abeb6a2cc36a..531fb4fb7b17 100644
> --- a/drivers/memory/tegra/mc.h
> +++ b/drivers/memory/tegra/mc.h
> @@ -78,6 +78,18 @@
>  
>  #define MC_TIMING_UPDATE				BIT(0)
>  
> +static inline u32 tegra_mc_scale_percents(u64 val, unsigned int percents)
> +{
> +	val = val * percents;
> +	do_div(val, 100);
> +
> +	/*
> +	 * High freq + high boosting percent + large polling interval are
> +	 * resulting in integer overflow when watermarks are calculated.
> +	 */
> +	return min_t(u64, val, U32_MAX);
> +}
> +
>  static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
>  {
>  	return readl_relaxed(mc->regs + offset);
> diff --git a/drivers/memory/tegra/tegra20-emc.c b/drivers/memory/tegra/tegra20-emc.c
> index 34085e26dced..69ccb3fe5b0b 100644
> --- a/drivers/memory/tegra/tegra20-emc.c
> +++ b/drivers/memory/tegra/tegra20-emc.c
> @@ -9,6 +9,7 @@
>  #include <linux/clk/tegra.h>
>  #include <linux/debugfs.h>
>  #include <linux/err.h>
> +#include <linux/interconnect-provider.h>
>  #include <linux/interrupt.h>
>  #include <linux/io.h>
>  #include <linux/iopoll.h>
> @@ -16,11 +17,15 @@
>  #include <linux/module.h>
>  #include <linux/of.h>
>  #include <linux/platform_device.h>
> +#include <linux/pm_opp.h>
> +#include <linux/slab.h>
>  #include <linux/sort.h>
>  #include <linux/types.h>
>  
>  #include <soc/tegra/fuse.h>
>  
> +#include "mc.h"
> +
>  #define EMC_INTSTATUS				0x000
>  #define EMC_INTMASK				0x004
>  #define EMC_DBG					0x008
> @@ -144,6 +149,9 @@ struct emc_timing {
>  
>  struct tegra_emc {
>  	struct device *dev;
> +	struct tegra_mc *mc;
> +	struct opp_table *opp_table;
> +	struct icc_provider provider;
>  	struct notifier_block clk_nb;
>  	struct clk *clk;
>  	void __iomem *regs;
> @@ -658,6 +666,166 @@ static void tegra_emc_debugfs_init(struct tegra_emc *emc)
>  			    emc, &tegra_emc_debug_max_rate_fops);
>  }
>  
> +static inline struct tegra_emc *
> +to_tegra_emc_provider(struct icc_provider *provider)
> +{
> +	return container_of(provider, struct tegra_emc, provider);
> +}
> +
> +static struct icc_node_data *
> +emc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
> +{
> +	struct icc_provider *provider = data;
> +	struct icc_node_data *ndata;
> +	struct icc_node *node;
> +
> +	/* External Memory is the only possible ICC route */
> +	list_for_each_entry(node, &provider->nodes, node_list) {
> +		if (node->id != TEGRA_ICC_EMEM)
> +			continue;
> +
> +		ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
> +		if (!ndata)
> +			return ERR_PTR(-ENOMEM);
> +
> +		/*
> +		 * SRC and DST nodes should have matching TAG in order to have
> +		 * it set by default for a requested path.
> +		 */
> +		ndata->tag = TEGRA_MC_ICC_TAG_ISO;
> +		ndata->node = node;
> +
> +		return ndata;
> +	}
> +
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static int emc_icc_set(struct icc_node *src, struct icc_node *dst)
> +{
> +	struct tegra_emc *emc = to_tegra_emc_provider(dst->provider);
> +	unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw);
> +	unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw);
> +	unsigned long long rate = max(avg_bw, peak_bw);
> +	unsigned int dram_data_bus_width_bytes = 4;
> +	long rounded_rate;
> +	int err;
> +
> +	/*
> +	 * Tegra20 EMC runs on x2 clock rate of SDRAM bus because DDR data
> +	 * is sampled on both clock edges. This means that EMC clock rate
> +	 * equals to the peak data rate.
> +	 */
> +	do_div(rate, dram_data_bus_width_bytes);
> +	rate = min_t(u64, rate, U32_MAX);
> +
> +	rounded_rate = emc_round_rate(rate, 0, U32_MAX, emc);
> +	if (rounded_rate < 0)
> +		return rounded_rate;
> +
> +	err = dev_pm_opp_set_rate(emc->dev, rounded_rate);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +static int tegra_emc_interconnect_init(struct tegra_emc *emc)
> +{
> +	const struct tegra_mc_soc *soc;
> +	struct icc_node *node;
> +	int err;
> +
> +	emc->mc = devm_tegra_get_memory_controller(emc->dev);
> +	if (IS_ERR(emc->mc))
> +		return PTR_ERR(emc->mc);
> +
> +	soc = emc->mc->soc;
> +
> +	emc->provider.dev = emc->dev;
> +	emc->provider.set = emc_icc_set;
> +	emc->provider.data = &emc->provider;
> +	emc->provider.aggregate = soc->icc_ops->aggregate;
> +	emc->provider.xlate_extended = emc_of_icc_xlate_extended;
> +
> +	err = icc_provider_add(&emc->provider);
> +	if (err)
> +		goto err_msg;
> +
> +	/* create External Memory Controller node */
> +	node = icc_node_create(TEGRA_ICC_EMC);
> +	err = PTR_ERR_OR_ZERO(node);
> +	if (err)
> +		goto del_provider;
> +
> +	node->name = "External Memory Controller";
> +	icc_node_add(node, &emc->provider);
> +
> +	/* link External Memory Controller to External Memory (DRAM) */
> +	err = icc_link_create(node, TEGRA_ICC_EMEM);
> +	if (err)
> +		goto remove_nodes;
> +
> +	/* create External Memory node */
> +	node = icc_node_create(TEGRA_ICC_EMEM);
> +	err = PTR_ERR_OR_ZERO(node);
> +	if (err)
> +		goto remove_nodes;
> +
> +	node->name = "External Memory (DRAM)";
> +	icc_node_add(node, &emc->provider);
> +
> +	return 0;
> +
> +remove_nodes:
> +	icc_nodes_remove(&emc->provider);
> +del_provider:
> +	icc_provider_del(&emc->provider);
> +err_msg:
> +	dev_err(emc->dev, "failed to initialize ICC: %d\n", err);

You will print such errors on all existing DTBs. Since it is not a
failure of probe (it is actually quite expected, normal situation when
booting with older DTB), let's change it to warning (here and in all
other places and drivers).

> +
> +	return err;
> +}
> +
> +static int tegra_emc_opp_table_init(struct tegra_emc *emc)
> +{
> +	const char *rname = "core";
> +	int err;
> +
> +	/*
> +	 * Legacy device-trees don't have OPP table and EMC driver isn't
> +	 * useful in this case.
> +	 */
> +	if (!device_property_present(emc->dev, "operating-points-v2")) {
> +		dev_err(emc->dev, "OPP table not found\n");
> +		dev_err(emc->dev, "please update your device tree\n");
> +		return -ENODEV;
> +	}
> +
> +	/* voltage scaling is optional */
> +	if (device_property_present(emc->dev, "core-supply"))
> +		emc->opp_table = dev_pm_opp_set_regulators(emc->dev, &rname, 1);
> +	else
> +		emc->opp_table = dev_pm_opp_get_opp_table(emc->dev);
> +
> +	if (IS_ERR(emc->opp_table))
> +		return dev_err_probe(emc->dev, PTR_ERR(emc->opp_table),
> +				     "failed to prepare OPP table\n");
> +
> +	err = dev_pm_opp_of_add_table(emc->dev);
> +	if (err) {
> +		dev_err(emc->dev, "failed to add OPP table: %d\n", err);
> +		goto put_table;
> +	}
> +
> +	return 0;
> +
> +put_table:
> +	dev_pm_opp_put_opp_table(emc->opp_table);
> +
> +	return err;
> +}
> +
>  static int tegra_emc_probe(struct platform_device *pdev)
>  {
>  	struct device_node *np;
> @@ -717,8 +885,13 @@ static int tegra_emc_probe(struct platform_device *pdev)
>  		goto unset_cb;
>  	}
>  
> +	err = tegra_emc_opp_table_init(emc);
> +	if (err)
> +		goto unreg_notifier;

This looks like the ABI break I mentioned around DT bindings. Are the
bindings marked as unstable?

Best regards,
Krzysztof
Krzysztof Kozlowski Oct. 27, 2020, 10:27 a.m. UTC | #14
On Mon, Oct 26, 2020 at 01:17:24AM +0300, Dmitry Osipenko wrote:
> Use devm_platform_ioremap_resource() helper which makes code a bit
> cleaner.

Such cleanups (and few other in this patchset) should be at beginning of
patchset or even as part of a separate one.  I think there is not much
stopping anyone from applying these... except that you put them in the
middle of big dependency.

Best regards,
Krzysztof
Thierry Reding Oct. 27, 2020, 1:04 p.m. UTC | #15
On Mon, Oct 26, 2020 at 01:16:44AM +0300, Dmitry Osipenko wrote:
> We're going to modularize Tegra EMC drivers and some of the EMC clk driver
> symbols need to be exported, let's export them.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/clk/tegra/clk-tegra20-emc.c | 3 +++
>  1 file changed, 3 insertions(+)

Acked-by: Thierry Reding <treding@nvidia.com>
Thierry Reding Oct. 27, 2020, 1:17 p.m. UTC | #16
On Mon, Oct 26, 2020 at 01:16:45AM +0300, Dmitry Osipenko wrote:
> The tegra_read_ram_code() is used by EMC drivers and we're going to make
> these driver modular, hence this function needs to be exported.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/soc/tegra/fuse/tegra-apbmisc.c | 2 ++
>  1 file changed, 2 insertions(+)

I'm not a big fan of exporting yet another of these tiny helpers, but I
don't have any better ideas, so:

Acked-by: Thierry Reding <treding@nvidia.com>
Thierry Reding Oct. 27, 2020, 1:30 p.m. UTC | #17
On Tue, Oct 27, 2020 at 10:12:47AM +0100, Krzysztof Kozlowski wrote:
> On Mon, Oct 26, 2020 at 01:17:04AM +0300, Dmitry Osipenko wrote:
> > Add interconnect properties to the Memory Controller, External Memory
> > Controller and the Display Controller nodes in order to describe hardware
> > interconnection.
> > 
> > Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> > ---
> >  arch/arm/boot/dts/tegra20.dtsi | 26 +++++++++++++++++++++++++-
> >  1 file changed, 25 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> > index 9347f7789245..2e1304493f7d 100644
> > --- a/arch/arm/boot/dts/tegra20.dtsi
> > +++ b/arch/arm/boot/dts/tegra20.dtsi
> > @@ -111,6 +111,17 @@ dc@54200000 {
> >  
> >  			nvidia,head = <0>;
> >  
> > +			interconnects = <&mc TEGRA20_MC_DISPLAY0A &emc>,
> 
> I think you just added the defines and did not include them here, so
> this should not even build. Did you test it?

The dt-bindings/memory/tegra20-mc.h header is already included in
existing DTS files for MC hot flush resets, so this should be fine.

Thierry
Thierry Reding Oct. 27, 2020, 1:35 p.m. UTC | #18
On Mon, Oct 26, 2020 at 01:17:11AM +0300, Dmitry Osipenko wrote:
> Multiple Tegra drivers need to retrieve Memory Controller and there is
> duplication of the retrieval code among the drivers. This patch removes
> the duplication and fixes put_device() which was missed in the duplicated
> code.
> 
> EMC drivers now use new common devm_tegra_get_memory_controller() helper
> instead of opencoding the MC retrieval.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/mc.c                | 48 ++++++++++++++++++++++++
>  drivers/memory/tegra/tegra124-emc.c      | 18 ++-------
>  drivers/memory/tegra/tegra210-emc-core.c | 39 +++++--------------
>  drivers/memory/tegra/tegra30-emc.c       | 18 ++-------
>  include/soc/tegra/mc.h                   | 10 +++++
>  5 files changed, 74 insertions(+), 59 deletions(-)

Acked-by: Thierry Reding <treding@nvidia.com>
Thierry Reding Oct. 27, 2020, 1:48 p.m. UTC | #19
On Mon, Oct 26, 2020 at 01:17:12AM +0300, Dmitry Osipenko wrote:
> Now Memory Controller is a memory interconnection provider. This allows
> us to use interconnect API for tuning of memory configuration. This patch
> adds common ICC core and adds hooks which should be implemented by the SoC
> drivers.
> 
> Tested-by: Peter Geis <pgwipeout@gmail.com>
> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/Kconfig |   1 +
>  drivers/memory/tegra/mc.c    | 129 +++++++++++++++++++++++++++++++++++
>  drivers/memory/tegra/mc.h    |   8 +++
>  include/soc/tegra/mc.h       |  16 +++++
>  4 files changed, 154 insertions(+)
> 
> diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig
> index 9f0a96bf9ccc..b38e5255effe 100644
> --- a/drivers/memory/tegra/Kconfig
> +++ b/drivers/memory/tegra/Kconfig
> @@ -3,6 +3,7 @@ config TEGRA_MC
>  	bool "NVIDIA Tegra Memory Controller support"
>  	default y
>  	depends on ARCH_TEGRA
> +	select INTERCONNECT
>  	help
>  	  This driver supports the Memory Controller (MC) hardware found on
>  	  NVIDIA Tegra SoCs.
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index 12ea2c79205a..53d61b05ebf8 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -639,6 +639,133 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, void *data)
>  	return IRQ_HANDLED;
>  }
>  
> +static struct icc_node_data *
> +tegra_mc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
> +{
> +	struct icc_provider *provider = data;
> +	unsigned int idx = spec->args[0];
> +	struct icc_node_data *ndata;
> +	struct icc_node *node;
> +
> +	list_for_each_entry(node, &provider->nodes, node_list) {
> +		if (node->id != idx)
> +			continue;
> +
> +		ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
> +		if (!ndata)
> +			return ERR_PTR(-ENOMEM);
> +
> +		ndata->node = node;
> +
> +		/* these clients are isochronous by default on all SoCs */
> +		if (strstarts(node->name, "display") ||
> +		    strstarts(node->name, "ptc") ||
> +		    strstarts(node->name, "vi"))
> +			ndata->tag = TEGRA_MC_ICC_TAG_ISO;

This looks like something that might be better left to the drivers to
decide. Doing this here seems okay for now, but I suspect that this will
get fairly complicated to keep accurate as we add more clients later on.

> +
> +		return ndata;
> +	}
> +
> +	pr_err("%s: invalid client index %u\n", __func__, idx);

Perhaps use "dev_err(provider->dev, ...);"?

> +
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +/*
> + * Memory Controller (MC) has few Memory Clients that are issuing memory
> + * bandwidth allocation requests to the MC interconnect provider. The MC
> + * provider aggregates the requests and then sends the aggregated request
> + * up to the External Memory Controller (EMC) interconnect provider which
> + * re-configures hardware interface to External Memory (EMEM) in accordance
> + * to the required bandwidth. Each MC interconnect node represents an
> + * individual Memory Client.
> + *
> + * Memory interconnect topology:
> + *
> + *               +----+
> + * +--------+    |    |
> + * | TEXSRD +--->+    |
> + * +--------+    |    |
> + *               |    |    +-----+    +------+
> + *    ...        | MC +--->+ EMC +--->+ EMEM |
> + *               |    |    +-----+    +------+
> + * +--------+    |    |
> + * | DISP.. +--->+    |
> + * +--------+    |    |
> + *               +----+
> + */
> +static int tegra_mc_interconnect_setup(struct tegra_mc *mc)
> +{
> +	struct icc_node *node;
> +	unsigned int i;
> +	int err;
> +
> +	/* older device-trees don't have interconnect properties */
> +	if (!of_find_property(mc->dev->of_node, "#interconnect-cells", NULL) ||
> +	    !mc->soc->icc_ops)
> +		return 0;

This indicates that this property is indeed optional, so the bindings
should reflect that.

> +
> +	mc->provider.dev = mc->dev;
> +	mc->provider.data = &mc->provider;
> +	mc->provider.set = mc->soc->icc_ops->set;
> +	mc->provider.aggregate = mc->soc->icc_ops->aggregate;
> +	mc->provider.xlate_extended = tegra_mc_of_icc_xlate_extended;
> +
> +	err = icc_provider_add(&mc->provider);
> +	if (err)
> +		goto err_msg;
> +
> +	/* create Memory Controller node */
> +	node = icc_node_create(TEGRA_ICC_MC);
> +	err = PTR_ERR_OR_ZERO(node);
> +	if (err)
> +		goto del_provider;
> +
> +	node->name = "Memory Controller";
> +	icc_node_add(node, &mc->provider);
> +
> +	/* link Memory Controller to External Memory Controller */
> +	err = icc_link_create(node, TEGRA_ICC_EMC);
> +	if (err)
> +		goto remove_nodes;
> +
> +	for (i = 0; i < mc->soc->num_clients; i++) {
> +		/* create MC client node */
> +		node = icc_node_create(mc->soc->clients[i].id);
> +		err = PTR_ERR_OR_ZERO(node);
> +		if (err)
> +			goto remove_nodes;
> +
> +		node->name = mc->soc->clients[i].name;
> +		icc_node_add(node, &mc->provider);

I'm not fully familiar with how these nodes are set up, but would it be
possible to set the isochronous tag here already? I'd still prefer this
to be up to the drivers because I think that nicely localizes the
device-specific information in the driver, but if that's not an option,
then doing it here, based on lookup data from the MC clients table
sounds like the next best thing.

> +		/* link Memory Client to Memory Controller */
> +		err = icc_link_create(node, TEGRA_ICC_MC);
> +		if (err)
> +			goto remove_nodes;
> +	}
> +
> +	/*
> +	 * MC driver is registered too early, so early that generic driver
> +	 * syncing doesn't work for the MC. But it doesn't really matter
> +	 * since syncing works for the EMC drivers, hence the we can sync
> +	 * the MC driver by ourselves and then EMC will complete syncing of
> +	 * the whole ICC state.
> +	 */
> +	icc_sync_state(mc->dev);
> +
> +	return 0;
> +
> +remove_nodes:
> +	icc_nodes_remove(&mc->provider);
> +del_provider:
> +	icc_provider_del(&mc->provider);
> +err_msg:
> +	dev_err(mc->dev, "failed to initialize ICC: %d\n", err);
> +
> +	return err;
> +}
> +
>  static int tegra_mc_probe(struct platform_device *pdev)
>  {
>  	struct resource *res;
> @@ -747,6 +874,8 @@ static int tegra_mc_probe(struct platform_device *pdev)
>  		}
>  	}
>  
> +	tegra_mc_interconnect_setup(mc);

Do you want to check the return value here for errors? If not, might as
well make the function return void.

> +
>  	return 0;
>  }
>  
> diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
> index afa3ba45c9e6..abeb6a2cc36a 100644
> --- a/drivers/memory/tegra/mc.h
> +++ b/drivers/memory/tegra/mc.h
> @@ -115,4 +115,12 @@ extern const struct tegra_mc_soc tegra132_mc_soc;
>  extern const struct tegra_mc_soc tegra210_mc_soc;
>  #endif
>  
> +/*
> + * These IDs are for internal use of Tegra's ICC, the values are chosen
> + * such that they don't conflict with the device-tree ICC node IDs.
> + */
> +#define TEGRA_ICC_EMC		1000
> +#define TEGRA_ICC_EMEM		2000
> +#define TEGRA_ICC_MC		3000

Sounds to me like these could equally well be 1000, 1001 and 1002. Why
leave these large holes in the number space?

Thierry
Thierry Reding Oct. 27, 2020, 1:49 p.m. UTC | #20
On Mon, Oct 26, 2020 at 01:17:13AM +0300, Dmitry Osipenko wrote:
> This patch adds modularization support to the Tegra20 EMC driver. Driver
> now can be compiled as a loadable kernel module.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/Kconfig       |  2 +-
>  drivers/memory/tegra/tegra20-emc.c | 17 ++++++++++++-----
>  2 files changed, 13 insertions(+), 6 deletions(-)

Acked-by: Thierry Reding <treding@nvidia.com>
Thierry Reding Oct. 27, 2020, 1:50 p.m. UTC | #21
On Mon, Oct 26, 2020 at 01:17:14AM +0300, Dmitry Osipenko wrote:
> Use devm_platform_ioremap_resource() helper which makes code a bit
> cleaner.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/tegra20-emc.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)

I'm not a fan of this helper, to be honest, because I think all the
churn that we've seen with the conversions isn't really worth the 1 or 2
lines that it saves, but hey, looks like this is pretty broadly
accepted, so if Krzysztof likes it:

Acked-by: Thierry Reding <treding@nvidia.com>
Thierry Reding Oct. 27, 2020, 1:52 p.m. UTC | #22
On Mon, Oct 26, 2020 at 01:17:15AM +0300, Dmitry Osipenko wrote:
> EMC driver will become mandatory after turning it into interconnect
> provider because interconnect users, like display controller driver, will
> fail to probe using newer device-trees that have interconnect properties.
> Thus make EMC driver to probe even if timings are missing in device-tree.

Does it really have to be mandatory? Sounds like that's going to make it
unnecessarily complicated to merge all of this. Is it complicated to
make interconnect support optional in consumer drivers?

Thierry
Krzysztof Kozlowski Oct. 27, 2020, 1:57 p.m. UTC | #23
On Tue, 27 Oct 2020 at 14:50, Thierry Reding <thierry.reding@gmail.com> wrote:
>
> On Mon, Oct 26, 2020 at 01:17:14AM +0300, Dmitry Osipenko wrote:
> > Use devm_platform_ioremap_resource() helper which makes code a bit
> > cleaner.
> >
> > Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> > ---
> >  drivers/memory/tegra/tegra20-emc.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
>
> I'm not a fan of this helper, to be honest, because I think all the
> churn that we've seen with the conversions isn't really worth the 1 or 2
> lines that it saves, but hey, looks like this is pretty broadly
> accepted, so if Krzysztof likes it:
>
> Acked-by: Thierry Reding <treding@nvidia.com>

Such changes indeed do not bring much but still less local variables
and -1 line. I am fine with them. They also save one error msg from
devm_ioremap_resource() in case of platform_get_resource() failure.

Best regards,
Krzysztof
Thierry Reding Oct. 27, 2020, 2:11 p.m. UTC | #24
On Mon, Oct 26, 2020 at 01:17:16AM +0300, Dmitry Osipenko wrote:
> Now Internal and External Memory Controllers are memory interconnection
> providers. This allows us to use interconnect API for tuning of memory
> configuration. EMC driver now supports OPPs and DVFS.
> 
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/memory/tegra/Kconfig       |   3 +-
>  drivers/memory/tegra/mc.h          |  12 ++
>  drivers/memory/tegra/tegra20-emc.c | 176 +++++++++++++++++++++++++++++
>  drivers/memory/tegra/tegra20.c     |  34 ++++++
>  4 files changed, 224 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/memory/tegra/Kconfig b/drivers/memory/tegra/Kconfig
> index ff426747cd7d..ac3dfe155505 100644
> --- a/drivers/memory/tegra/Kconfig
> +++ b/drivers/memory/tegra/Kconfig
> @@ -11,7 +11,8 @@ config TEGRA_MC
>  config TEGRA20_EMC
>  	tristate "NVIDIA Tegra20 External Memory Controller driver"
>  	default y
> -	depends on ARCH_TEGRA_2x_SOC
> +	depends on TEGRA_MC && ARCH_TEGRA_2x_SOC
> +	select PM_OPP
>  	help
>  	  This driver is for the External Memory Controller (EMC) found on
>  	  Tegra20 chips. The EMC controls the external DRAM on the board.
> diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
> index abeb6a2cc36a..531fb4fb7b17 100644
> --- a/drivers/memory/tegra/mc.h
> +++ b/drivers/memory/tegra/mc.h
> @@ -78,6 +78,18 @@
>  
>  #define MC_TIMING_UPDATE				BIT(0)
>  
> +static inline u32 tegra_mc_scale_percents(u64 val, unsigned int percents)
> +{
> +	val = val * percents;
> +	do_div(val, 100);
> +
> +	/*
> +	 * High freq + high boosting percent + large polling interval are
> +	 * resulting in integer overflow when watermarks are calculated.
> +	 */
> +	return min_t(u64, val, U32_MAX);
> +}
> +
>  static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
>  {
>  	return readl_relaxed(mc->regs + offset);
> diff --git a/drivers/memory/tegra/tegra20-emc.c b/drivers/memory/tegra/tegra20-emc.c
> index 34085e26dced..69ccb3fe5b0b 100644
> --- a/drivers/memory/tegra/tegra20-emc.c
> +++ b/drivers/memory/tegra/tegra20-emc.c
> @@ -9,6 +9,7 @@
>  #include <linux/clk/tegra.h>
>  #include <linux/debugfs.h>
>  #include <linux/err.h>
> +#include <linux/interconnect-provider.h>
>  #include <linux/interrupt.h>
>  #include <linux/io.h>
>  #include <linux/iopoll.h>
> @@ -16,11 +17,15 @@
>  #include <linux/module.h>
>  #include <linux/of.h>
>  #include <linux/platform_device.h>
> +#include <linux/pm_opp.h>
> +#include <linux/slab.h>
>  #include <linux/sort.h>
>  #include <linux/types.h>
>  
>  #include <soc/tegra/fuse.h>
>  
> +#include "mc.h"
> +
>  #define EMC_INTSTATUS				0x000
>  #define EMC_INTMASK				0x004
>  #define EMC_DBG					0x008
> @@ -144,6 +149,9 @@ struct emc_timing {
>  
>  struct tegra_emc {
>  	struct device *dev;
> +	struct tegra_mc *mc;
> +	struct opp_table *opp_table;
> +	struct icc_provider provider;
>  	struct notifier_block clk_nb;
>  	struct clk *clk;
>  	void __iomem *regs;
> @@ -658,6 +666,166 @@ static void tegra_emc_debugfs_init(struct tegra_emc *emc)
>  			    emc, &tegra_emc_debug_max_rate_fops);
>  }
>  
> +static inline struct tegra_emc *
> +to_tegra_emc_provider(struct icc_provider *provider)
> +{
> +	return container_of(provider, struct tegra_emc, provider);
> +}
> +
> +static struct icc_node_data *
> +emc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
> +{
> +	struct icc_provider *provider = data;
> +	struct icc_node_data *ndata;
> +	struct icc_node *node;
> +
> +	/* External Memory is the only possible ICC route */
> +	list_for_each_entry(node, &provider->nodes, node_list) {
> +		if (node->id != TEGRA_ICC_EMEM)
> +			continue;
> +
> +		ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
> +		if (!ndata)
> +			return ERR_PTR(-ENOMEM);
> +
> +		/*
> +		 * SRC and DST nodes should have matching TAG in order to have
> +		 * it set by default for a requested path.
> +		 */
> +		ndata->tag = TEGRA_MC_ICC_TAG_ISO;
> +		ndata->node = node;
> +
> +		return ndata;
> +	}
> +
> +	return ERR_PTR(-EINVAL);
> +}
> +
> +static int emc_icc_set(struct icc_node *src, struct icc_node *dst)
> +{
> +	struct tegra_emc *emc = to_tegra_emc_provider(dst->provider);
> +	unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw);
> +	unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw);
> +	unsigned long long rate = max(avg_bw, peak_bw);
> +	unsigned int dram_data_bus_width_bytes = 4;

Perhaps use something shorter for this variable (like dram_bus_width)? Also,
since it's never modified, perhaps make it const? Or a #define?

> +	long rounded_rate;
> +	int err;
> +
> +	/*
> +	 * Tegra20 EMC runs on x2 clock rate of SDRAM bus because DDR data
> +	 * is sampled on both clock edges. This means that EMC clock rate
> +	 * equals to the peak data rate.
> +	 */
> +	do_div(rate, dram_data_bus_width_bytes);
> +	rate = min_t(u64, rate, U32_MAX);
> +
> +	rounded_rate = emc_round_rate(rate, 0, U32_MAX, emc);
> +	if (rounded_rate < 0)
> +		return rounded_rate;
> +
> +	err = dev_pm_opp_set_rate(emc->dev, rounded_rate);
> +	if (err)
> +		return err;
> +
> +	return 0;
> +}
> +
> +static int tegra_emc_interconnect_init(struct tegra_emc *emc)
> +{
> +	const struct tegra_mc_soc *soc;
> +	struct icc_node *node;
> +	int err;
> +
> +	emc->mc = devm_tegra_get_memory_controller(emc->dev);
> +	if (IS_ERR(emc->mc))
> +		return PTR_ERR(emc->mc);
> +
> +	soc = emc->mc->soc;
> +
> +	emc->provider.dev = emc->dev;
> +	emc->provider.set = emc_icc_set;
> +	emc->provider.data = &emc->provider;
> +	emc->provider.aggregate = soc->icc_ops->aggregate;
> +	emc->provider.xlate_extended = emc_of_icc_xlate_extended;
> +
> +	err = icc_provider_add(&emc->provider);
> +	if (err)
> +		goto err_msg;
> +
> +	/* create External Memory Controller node */
> +	node = icc_node_create(TEGRA_ICC_EMC);
> +	err = PTR_ERR_OR_ZERO(node);
> +	if (err)
> +		goto del_provider;

As far as I can tell, icc_node_create() always returns either a valid
pointer or an ERR_PTR-encoded negative error-code. So I think the more
idiomatic way to write this would be:

	node = icc_node_create(TEGRA_ICC_EMC);
	if (IS_ERR(node)) {
		err = PTR_ERR(node);
		goto del_provider;
	}

> +
> +	node->name = "External Memory Controller";
> +	icc_node_add(node, &emc->provider);
> +
> +	/* link External Memory Controller to External Memory (DRAM) */
> +	err = icc_link_create(node, TEGRA_ICC_EMEM);
> +	if (err)
> +		goto remove_nodes;
> +
> +	/* create External Memory node */
> +	node = icc_node_create(TEGRA_ICC_EMEM);
> +	err = PTR_ERR_OR_ZERO(node);
> +	if (err)
> +		goto remove_nodes;

Same here.

> +
> +	node->name = "External Memory (DRAM)";
> +	icc_node_add(node, &emc->provider);
> +
> +	return 0;
> +
> +remove_nodes:
> +	icc_nodes_remove(&emc->provider);
> +del_provider:
> +	icc_provider_del(&emc->provider);
> +err_msg:
> +	dev_err(emc->dev, "failed to initialize ICC: %d\n", err);

It might be worth duplicating this error message to the failure
locations so that the exact failure can be identified.

> +
> +	return err;
> +}
> +
> +static int tegra_emc_opp_table_init(struct tegra_emc *emc)
> +{
> +	const char *rname = "core";
> +	int err;
> +
> +	/*
> +	 * Legacy device-trees don't have OPP table and EMC driver isn't
> +	 * useful in this case.
> +	 */
> +	if (!device_property_present(emc->dev, "operating-points-v2")) {
> +		dev_err(emc->dev, "OPP table not found\n");
> +		dev_err(emc->dev, "please update your device tree\n");

This should be a single error message. These messages end up in kmsg
records and having this split into two dev_err() calls makes them into
two separate records and that in turn makes it more difficult to
determine whether they belong together or not.

> +		return -ENODEV;
> +	}
> +
> +	/* voltage scaling is optional */
> +	if (device_property_present(emc->dev, "core-supply"))
> +		emc->opp_table = dev_pm_opp_set_regulators(emc->dev, &rname, 1);
> +	else
> +		emc->opp_table = dev_pm_opp_get_opp_table(emc->dev);
> +
> +	if (IS_ERR(emc->opp_table))
> +		return dev_err_probe(emc->dev, PTR_ERR(emc->opp_table),
> +				     "failed to prepare OPP table\n");
> +
> +	err = dev_pm_opp_of_add_table(emc->dev);
> +	if (err) {
> +		dev_err(emc->dev, "failed to add OPP table: %d\n", err);
> +		goto put_table;
> +	}
> +
> +	return 0;
> +
> +put_table:
> +	dev_pm_opp_put_opp_table(emc->opp_table);
> +
> +	return err;
> +}
> +
>  static int tegra_emc_probe(struct platform_device *pdev)
>  {
>  	struct device_node *np;
> @@ -717,8 +885,13 @@ static int tegra_emc_probe(struct platform_device *pdev)
>  		goto unset_cb;
>  	}
>  
> +	err = tegra_emc_opp_table_init(emc);
> +	if (err)
> +		goto unreg_notifier;
> +
>  	platform_set_drvdata(pdev, emc);
>  	tegra_emc_debugfs_init(emc);
> +	tegra_emc_interconnect_init(emc);
>  
>  	/*
>  	 * Don't allow the kernel module to be unloaded. Unloading adds some
> @@ -729,6 +902,8 @@ static int tegra_emc_probe(struct platform_device *pdev)
>  
>  	return 0;
>  
> +unreg_notifier:
> +	clk_notifier_unregister(emc->clk, &emc->clk_nb);
>  unset_cb:
>  	tegra20_clk_set_emc_round_callback(NULL, NULL);
>  
> @@ -747,6 +922,7 @@ static struct platform_driver tegra_emc_driver = {
>  		.name = "tegra20-emc",
>  		.of_match_table = tegra_emc_of_match,
>  		.suppress_bind_attrs = true,
> +		.sync_state = icc_sync_state,
>  	},
>  };
>  module_platform_driver(tegra_emc_driver);
> diff --git a/drivers/memory/tegra/tegra20.c b/drivers/memory/tegra/tegra20.c
> index a8098bff91d9..5127e8e8250f 100644
> --- a/drivers/memory/tegra/tegra20.c
> +++ b/drivers/memory/tegra/tegra20.c
> @@ -280,6 +280,39 @@ static const struct tegra_mc_reset_ops tegra20_mc_reset_ops = {
>  	.reset_status = tegra20_mc_reset_status,
>  };
>  
> +static int tegra20_mc_icc_set(struct icc_node *src, struct icc_node *dst)
> +{
> +	/*
> +	 * Technically, it should be possible to tune arbitration knobs here,
> +	 * but the default values are known to work well on all devices.
> +	 * Hence nothing to do here so far.
> +	 */
> +	return 0;
> +}
> +
> +static int tegra20_mc_icc_aggreate(struct icc_node *node, u32 tag, u32 avg_bw,
> +				   u32 peak_bw, u32 *agg_avg, u32 *agg_peak)
> +{
> +	/*
> +	 * ISO clients need to reserve extra bandwidth up-front because
> +	 * there could high bandwidth pressure during initial fulling-up

"filling of the client's FIFO buffers"

> +	 * of the client's FIFO buffers. Secondly, we need to take into
> +	 * account impurities of the memory subsystem.
> +	 */
> +	if (tag == TEGRA_MC_ICC_TAG_ISO)
> +		peak_bw = tegra_mc_scale_percents(peak_bw, 300);

300% sounds a bit excessive. Do we really need that much?

> +
> +	*agg_avg += avg_bw;
> +	*agg_peak = max(*agg_peak, peak_bw);

I'm not very familiar with ICC, but shouldn't the aggregated peak value
be the sum of the current aggregated peak and the new peak bandwidth?
Currently you're selecting the maximum peak bandwidth across all
clients, so isn't that going to be too small if for whatever reason
multiple clients need peak bandwidth at the same time?

Thierry
Dmitry Osipenko Oct. 27, 2020, 7:23 p.m. UTC | #25
27.10.2020 12:15, Krzysztof Kozlowski пишет:
...>> @@ -748,15 +770,18 @@ mc: memory-controller@7000f000 {
>>  
>>  		#iommu-cells = <1>;
>>  		#reset-cells = <1>;
>> +		#interconnect-cells = <1>;
>>  	};
>>  
>> -	memory-controller@7000f400 {
>> +	emc: memory-controller@7000f400 {
>>  		compatible = "nvidia,tegra30-emc";
>>  		reg = <0x7000f400 0x400>;
>>  		interrupts = <GIC_SPI 78 IRQ_TYPE_LEVEL_HIGH>;
>>  		clocks = <&tegra_car TEGRA30_CLK_EMC>;
>>  
>>  		nvidia,memory-controller = <&mc>;
>> +
> 
> No need for blank line.

It's needed to make MC and EMC nodes look consistent. See the MC node
above which has the blank line.

> 
>> +		#interconnect-cells = <0>;
>>  	};
>>  
>>  	fuse@7000f800 {
Dmitry Osipenko Oct. 27, 2020, 7:24 p.m. UTC | #26
27.10.2020 12:42, Krzysztof Kozlowski пишет:
> On Mon, Oct 26, 2020 at 01:17:11AM +0300, Dmitry Osipenko wrote:
>> Multiple Tegra drivers need to retrieve Memory Controller and there is
>> duplication of the retrieval code among the drivers. This patch removes
>> the duplication and fixes put_device() which was missed in the duplicated
>> code.
>>
>> EMC drivers now use new common devm_tegra_get_memory_controller() helper
>> instead of opencoding the MC retrieval.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/memory/tegra/mc.c                | 48 ++++++++++++++++++++++++
>>  drivers/memory/tegra/tegra124-emc.c      | 18 ++-------
>>  drivers/memory/tegra/tegra210-emc-core.c | 39 +++++--------------
>>  drivers/memory/tegra/tegra30-emc.c       | 18 ++-------
>>  include/soc/tegra/mc.h                   | 10 +++++
>>  5 files changed, 74 insertions(+), 59 deletions(-)
>>
>> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
>> index ec8403557ed4..12ea2c79205a 100644
>> --- a/drivers/memory/tegra/mc.c
>> +++ b/drivers/memory/tegra/mc.c
>> @@ -42,6 +42,54 @@ static const struct of_device_id tegra_mc_of_match[] = {
>>  };
>>  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
>>  
>> +static void tegra_mc_devm_action_put_device(void *data)
>> +{
>> +	struct tegra_mc *mc = data;
>> +
>> +	put_device(mc->dev);
>> +}
>> +
>> +/**
>> + * devm_tegra_get_memory_controller() - get Tegra Memory Controller handle
>> + * @dev: device pointer for the consumer device
>> + *
>> + * This function will search for the Memory Controller node in a device-tree
>> + * and retrieve the Memory Controller handle.
>> + *
>> + * Return: ERR_PTR() on error or a valid pointer to a struct tegra_mc.
>> + */
>> +struct tegra_mc *devm_tegra_get_memory_controller(struct device *dev)
> 
> Usually 'get' is a suffix (for example in clk, gpiod, iio, led), so:
> devm_tegra_memory_controller_get()

Alright, I'll rename it in v7.
Dmitry Osipenko Oct. 27, 2020, 7:30 p.m. UTC | #27
27.10.2020 16:48, Thierry Reding пишет:
...
>> +static struct icc_node_data *
>> +tegra_mc_of_icc_xlate_extended(struct of_phandle_args *spec, void *data)
>> +{
>> +	struct icc_provider *provider = data;
>> +	unsigned int idx = spec->args[0];
>> +	struct icc_node_data *ndata;
>> +	struct icc_node *node;
>> +
>> +	list_for_each_entry(node, &provider->nodes, node_list) {
>> +		if (node->id != idx)
>> +			continue;
>> +
>> +		ndata = kzalloc(sizeof(*ndata), GFP_KERNEL);
>> +		if (!ndata)
>> +			return ERR_PTR(-ENOMEM);
>> +
>> +		ndata->node = node;
>> +
>> +		/* these clients are isochronous by default on all SoCs */
>> +		if (strstarts(node->name, "display") ||
>> +		    strstarts(node->name, "ptc") ||
>> +		    strstarts(node->name, "vi"))
>> +			ndata->tag = TEGRA_MC_ICC_TAG_ISO;
> 
> This looks like something that might be better left to the drivers to
> decide. Doing this here seems okay for now, but I suspect that this will
> get fairly complicated to keep accurate as we add more clients later on.

It's not a problem to add a driver-specific hook for the
xlate_extended(), like it's done for the aggregate() and set() hooks below.

...
>> +static int tegra_mc_interconnect_setup(struct tegra_mc *mc)
>> +{
>> +	struct icc_node *node;
>> +	unsigned int i;
>> +	int err;
>> +
>> +	/* older device-trees don't have interconnect properties */
>> +	if (!of_find_property(mc->dev->of_node, "#interconnect-cells", NULL) ||
>> +	    !mc->soc->icc_ops)
>> +		return 0;
> 
> This indicates that this property is indeed optional, so the bindings
> should reflect that.

Yes, but the property isn't optional for the newer binding. Does it
really need to be documented as optional?

>> +	mc->provider.dev = mc->dev;
>> +	mc->provider.data = &mc->provider;
>> +	mc->provider.set = mc->soc->icc_ops->set;
>> +	mc->provider.aggregate = mc->soc->icc_ops->aggregate;
>> +	mc->provider.xlate_extended = tegra_mc_of_icc_xlate_extended;
>> +
>> +	err = icc_provider_add(&mc->provider);
>> +	if (err)
>> +		goto err_msg;
>> +
>> +	/* create Memory Controller node */
>> +	node = icc_node_create(TEGRA_ICC_MC);
>> +	err = PTR_ERR_OR_ZERO(node);
>> +	if (err)
>> +		goto del_provider;
>> +
>> +	node->name = "Memory Controller";
>> +	icc_node_add(node, &mc->provider);
>> +
>> +	/* link Memory Controller to External Memory Controller */
>> +	err = icc_link_create(node, TEGRA_ICC_EMC);
>> +	if (err)
>> +		goto remove_nodes;
>> +
>> +	for (i = 0; i < mc->soc->num_clients; i++) {
>> +		/* create MC client node */
>> +		node = icc_node_create(mc->soc->clients[i].id);
>> +		err = PTR_ERR_OR_ZERO(node);
>> +		if (err)
>> +			goto remove_nodes;
>> +
>> +		node->name = mc->soc->clients[i].name;
>> +		icc_node_add(node, &mc->provider);
> 
> I'm not fully familiar with how these nodes are set up, but would it be
> possible to set the isochronous tag here already? I'd still prefer this
> to be up to the drivers because I think that nicely localizes the
> device-specific information in the driver, but if that's not an option,
> then doing it here, based on lookup data from the MC clients table
> sounds like the next best thing.

The tag needs to be set by xlate_extended(), otherwise it won't be
applied by default.

https://elixir.bootlin.com/linux/v5.10-rc1/source/drivers/interconnect/core.c#L501

...
>>  static int tegra_mc_probe(struct platform_device *pdev)
>>  {
>>  	struct resource *res;
>> @@ -747,6 +874,8 @@ static int tegra_mc_probe(struct platform_device *pdev)
>>  		}
>>  	}
>>  
>> +	tegra_mc_interconnect_setup(mc);
> 
> Do you want to check the return value here for errors? If not, might as
> well make the function return void.

The error won't be fatal and shouldn't block the rest functionality of
the MC driver.

It's possible to return void, but it's not necessary because compiler
will take care of optimizing the code and to me it's more consistent to
have error code returned by the function.

Perhaps should be better to just add a comment telling that error
skipping is intentional?

...
>> diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
>> index afa3ba45c9e6..abeb6a2cc36a 100644
>> --- a/drivers/memory/tegra/mc.h
>> +++ b/drivers/memory/tegra/mc.h
>> @@ -115,4 +115,12 @@ extern const struct tegra_mc_soc tegra132_mc_soc;
>>  extern const struct tegra_mc_soc tegra210_mc_soc;
>>  #endif
>>  
>> +/*
>> + * These IDs are for internal use of Tegra's ICC, the values are chosen
>> + * such that they don't conflict with the device-tree ICC node IDs.
>> + */
>> +#define TEGRA_ICC_EMC		1000
>> +#define TEGRA_ICC_EMEM		2000
>> +#define TEGRA_ICC_MC		3000
> 
> Sounds to me like these could equally well be 1000, 1001 and 1002. Why
> leave these large holes in the number space?

There is no specific reason, I can change the numbers if you want.
Dmitry Osipenko Oct. 27, 2020, 7:38 p.m. UTC | #28
27.10.2020 16:52, Thierry Reding пишет:
> On Mon, Oct 26, 2020 at 01:17:15AM +0300, Dmitry Osipenko wrote:
>> EMC driver will become mandatory after turning it into interconnect
>> provider because interconnect users, like display controller driver, will
>> fail to probe using newer device-trees that have interconnect properties.
>> Thus make EMC driver to probe even if timings are missing in device-tree.
> 
> Does it really have to be mandatory? Sounds like that's going to make it
> unnecessarily complicated to merge all of this. Is it complicated to
> make interconnect support optional in consumer drivers?

Interconnect provider can't be optional if interconnect properties
present in a device-tree because drivers that use ICC path will get
-EPROBE_DEFER until ICC provider is registered.

Older device-trees don't have ICC properties, and thus, the ICC
users/consumers will get a dummy NULL ICC path in this case. I.e. ICC
core handles this for us.
Dmitry Osipenko Oct. 27, 2020, 8:22 p.m. UTC | #29
27.10.2020 17:11, Thierry Reding пишет:
...
>> +static int emc_icc_set(struct icc_node *src, struct icc_node *dst)
>> +{
>> +	struct tegra_emc *emc = to_tegra_emc_provider(dst->provider);
>> +	unsigned long long peak_bw = icc_units_to_bps(dst->peak_bw);
>> +	unsigned long long avg_bw = icc_units_to_bps(dst->avg_bw);
>> +	unsigned long long rate = max(avg_bw, peak_bw);
>> +	unsigned int dram_data_bus_width_bytes = 4;
> 
> Perhaps use something shorter for this variable (like dram_bus_width)? Also,
> since it's never modified, perhaps make it const? Or a #define?

It actually could be 2, depending on a board configuration, but I don't
know whether a 16bit bus was ever used in a wild. AFAIK, nv-tegra
kernels assumes 32bit bus for all devices.

...
>> +err_msg:
>> +	dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
> 
> It might be worth duplicating this error message to the failure
> locations so that the exact failure can be identified.

I think it should be better to extend error messages on by as-needed
basis. It's very unlikely that we will ever see this error in practice.
Okay?

...
>> +	 * of the client's FIFO buffers. Secondly, we need to take into
>> +	 * account impurities of the memory subsystem.
>> +	 */
>> +	if (tag == TEGRA_MC_ICC_TAG_ISO)
>> +		peak_bw = tegra_mc_scale_percents(peak_bw, 300);
> 
> 300% sounds a bit excessive. Do we really need that much?

It should be possible to drop it to 150% by tuning priority timers and
hysteresis of the clients, but some of those configurations are placed
within device registers range and we will need a more complicated
bandwidth manager.

The 300% is an overestimation, but it's better to overestimate for the
starter than have an unusable devices. This is what nv-tegra kernel does
as well, btw.

>> +
>> +	*agg_avg += avg_bw;
>> +	*agg_peak = max(*agg_peak, peak_bw);
> 
> I'm not very familiar with ICC, but shouldn't the aggregated peak value
> be the sum of the current aggregated peak and the new peak bandwidth?
> Currently you're selecting the maximum peak bandwidth across all
> clients, so isn't that going to be too small if for whatever reason
> multiple clients need peak bandwidth at the same time?

It's up to the platform drivers to decide how to interpret and use the
avg and peak values.

Please see the above emc_icc_set() which selects max of (avg, peak)
values, but maybe it also should be good to move it out from ICC set()
to the ICC aggregate() callback:

*agg_peak = max(*agg_peak, *agg_avg);

I'll need to take a closer look.
Dmitry Osipenko Oct. 27, 2020, 8:25 p.m. UTC | #30
27.10.2020 13:09, Krzysztof Kozlowski пишет:
...
>> +err_msg:
>> +	dev_err(emc->dev, "failed to initialize ICC: %d\n", err);
> 
> You will print such errors on all existing DTBs. Since it is not a
> failure of probe (it is actually quite expected, normal situation when
> booting with older DTB), let's change it to warning (here and in all
> other places and drivers).

The existing DTBs will be stopped on the error message below.

>> +
>> +	return err;
>> +}
>> +
>> +static int tegra_emc_opp_table_init(struct tegra_emc *emc)
>> +{
>> +	const char *rname = "core";
>> +	int err;
>> +
>> +	/*
>> +	 * Legacy device-trees don't have OPP table and EMC driver isn't
>> +	 * useful in this case.
>> +	 */
>> +	if (!device_property_present(emc->dev, "operating-points-v2")) {
>> +		dev_err(emc->dev, "OPP table not found\n");
>> +		dev_err(emc->dev, "please update your device tree\n");
>> +		return -ENODEV;
>> +	}

The existing DTBs are stopped here.

...
>> +	err = tegra_emc_opp_table_init(emc);
>> +	if (err)
>> +		goto unreg_notifier;
> 
> This looks like the ABI break I mentioned around DT bindings. Are the
> bindings marked as unstable?

This T20 EMC driver wasn't ever used so far at all and this series makes
it useful. Hence I think it should be fine to assume that the T20 EMC
ABI is unstable.
Dmitry Osipenko Oct. 27, 2020, 8:26 p.m. UTC | #31
27.10.2020 08:10, Viresh Kumar пишет:
> On 26-10-20, 01:17, Dmitry Osipenko wrote:
>> This patch fixes lockup which happens when OPP table is released if
>> interconnect provider uses OPP in the icc_provider->set() callback
>> and bandwidth of the ICC path is set to 0 by the ICC core when path
>> is released. The icc_put() doesn't need the opp_table_lock protection,
>> hence let's move it outside of the lock in order to resolve the problem.
>>
>> In particular this fixes tegra-devfreq driver lockup on trying to unload
>> the driver module. The devfreq driver uses OPP-bandwidth API and its ICC
>> provider also uses OPP for DVFS, hence they both take same opp_table_lock
>> when OPP table of the devfreq is released.
>>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
...
> 
> Never make such _fixes_ part of such a big patchset. Always send them
> separately.

Perhaps it's not obvious from the commit description that this patch
doesn't fix any known problems of the current mainline kernel and it's
needed only for the new patches.

> Having said that, I already have a patch with me which shall fix it for you as
> well:

I see that yours fix is already applied, thanks!
Dmitry Osipenko Oct. 27, 2020, 8:30 p.m. UTC | #32
27.10.2020 13:27, Krzysztof Kozlowski пишет:
> On Mon, Oct 26, 2020 at 01:17:24AM +0300, Dmitry Osipenko wrote:
>> Use devm_platform_ioremap_resource() helper which makes code a bit
>> cleaner.
> 
> Such cleanups (and few other in this patchset) should be at beginning of
> patchset or even as part of a separate one.  I think there is not much
> stopping anyone from applying these... except that you put them in the
> middle of big dependency.

Some of these cleanup patches can't be applied separately without a need
to make a rebase. I think it should be more preferred to have all the
patches within a single series.

I'll try to reorder the patches in v7 if this will ease the review, thanks.
Dmitry Osipenko Oct. 27, 2020, 8:31 p.m. UTC | #33
27.10.2020 11:52, Krzysztof Kozlowski пишет:
> On Mon, Oct 26, 2020 at 10:14:10PM +0300, Dmitry Osipenko wrote:
>> 26.10.2020 18:08, Krzysztof Kozlowski пишет:
>>> On Mon, Oct 26, 2020 at 01:16:43AM +0300, Dmitry Osipenko wrote:
>>>> Hello,
>>>>
>>>> This series brings initial support for memory interconnect to Tegra20,
>>>> Tegra30 and Tegra124 SoCs.
>>>>
>>>> For the starter only display controllers and devfreq devices are getting
>>>> interconnect API support, others could be supported later on. The display
>>>> controllers have the biggest demand for interconnect API right now because
>>>> dynamic memory frequency scaling can't be done safely without taking into
>>>> account bandwidth requirement from the displays. In particular this series
>>>> fixes distorted display output on T30 Ouya and T124 TK1 devices.
>>>
>>> Hi,
>>>
>>> You introduced in v6 multiple new patches. Could you describe the
>>> dependencies, if any?
>>
>> Hello,
>>
>> The v6 dropped some older patches and replaced them with the new
>> patches. Previously I wanted to postpone the more complex changes for
>> later times, like supporting OPP tables and DVFS, but then the review
>> started to take more time than was expected and there was enough time to
>> complete those features.
>>
>> There are five basic sets of patches:
>>
>> 	1. DT bindings
>> 	2. DT changes
>> 	3. SoC, clk and memory
>> 	4. devfreq
>> 	5. DRM
>>
>> Each set could be applied separately.
>>
>> Memory patches have a build dependency on the SoC and clk patches.
>>
>> The "tegra-mc: Add interconnect framework" and "Add and use
>> devm_tegra_get_memory_controller()" are the root build dependencies for
>> all memory/ patches. Other patches are grouped per SoC generation
>> (Tegra20/30/124), patches within a SoC-gen group are interdependent.
>>
>> I suppose the best variant would be to merge the whole series via
>> tegra-tree in order to preserve logical order of the patches. Although,
>> merging each set of patches separately also should be okay to do.
> 
> Thanks for explanation. I already have three patches for Tegra MC (and
> probably two more will be respun) so this might be conflict-prone. The
> easiest in such case is to provide me soc and clk patches on the branch,
> so I could keep all Tegra MC together.

Okay, but those T210 patches don't touch the same code, neither same
source files. For now there should be no merge conflicts.
Dmitry Osipenko Oct. 27, 2020, 8:43 p.m. UTC | #34
27.10.2020 12:10, Krzysztof Kozlowski пишет:
> On Mon, Oct 26, 2020 at 01:17:03AM +0300, Dmitry Osipenko wrote:
>> The Tegra20 EMC registers size should be twice bigger. This patch fixes
>> the size.
> 
> Don't use "This patch" (this appears here). Better to use:
> "Fix the size of ..." or just "The size should be twice bigger" as it is
> obvious that you fix it.
> 
> https://elixir.bootlin.com/linux/latest/source/Documentation/process/submitting-patches.rst#L151
> 
> Acked-by: Krzysztof Kozlowski <krzk@kernel.org>

Thanks, I wasn't aware that it's a preferred wording style now.
Dmitry Osipenko Oct. 27, 2020, 9:12 p.m. UTC | #35
27.10.2020 23:22, Dmitry Osipenko пишет:
...
>>> +
>>> +	*agg_avg += avg_bw;
>>> +	*agg_peak = max(*agg_peak, peak_bw);
>>
>> I'm not very familiar with ICC, but shouldn't the aggregated peak value
>> be the sum of the current aggregated peak and the new peak bandwidth?
>> Currently you're selecting the maximum peak bandwidth across all
>> clients, so isn't that going to be too small if for whatever reason
>> multiple clients need peak bandwidth at the same time?

The current variant with max-peak selection should be okay since it
takes into account the competing ISO bandwidths of other devices by
overestimating the bandwidth.

For now we have only display ISO clients and it won't be a problem to
tune the algorithm later on if it won't work well for other ISO clients.
Viresh Kumar Oct. 28, 2020, 4:03 a.m. UTC | #36
On 27-10-20, 23:26, Dmitry Osipenko wrote:
> 27.10.2020 08:10, Viresh Kumar пишет:
> > On 26-10-20, 01:17, Dmitry Osipenko wrote:
> >> This patch fixes lockup which happens when OPP table is released if
> >> interconnect provider uses OPP in the icc_provider->set() callback
> >> and bandwidth of the ICC path is set to 0 by the ICC core when path
> >> is released. The icc_put() doesn't need the opp_table_lock protection,
> >> hence let's move it outside of the lock in order to resolve the problem.
> >>
> >> In particular this fixes tegra-devfreq driver lockup on trying to unload
> >> the driver module. The devfreq driver uses OPP-bandwidth API and its ICC
> >> provider also uses OPP for DVFS, hence they both take same opp_table_lock
> >> when OPP table of the devfreq is released.
> >>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> ...
> > 
> > Never make such _fixes_ part of such a big patchset. Always send them
> > separately.
> 
> Perhaps it's not obvious from the commit description that this patch
> doesn't fix any known problems of the current mainline kernel and it's
> needed only for the new patches.

No, I understood that we started getting the warning now only after
some other patches of yours. Nevertheless, it should be considered as
a fix only as that generated lockdep because of locking placement. And
so sending such stuff separately is better as that allows people to
apply it fast.

> > Having said that, I already have a patch with me which shall fix it for you as
> > well:
> 
> I see that yours fix is already applied, thanks!

I hope it worked for you. Thanks.
Krzysztof Kozlowski Oct. 28, 2020, 7:28 p.m. UTC | #37
On Tue, Oct 27, 2020 at 11:30:31PM +0300, Dmitry Osipenko wrote:
> 27.10.2020 13:27, Krzysztof Kozlowski пишет:
> > On Mon, Oct 26, 2020 at 01:17:24AM +0300, Dmitry Osipenko wrote:
> >> Use devm_platform_ioremap_resource() helper which makes code a bit
> >> cleaner.
> > 
> > Such cleanups (and few other in this patchset) should be at beginning of
> > patchset or even as part of a separate one.  I think there is not much
> > stopping anyone from applying these... except that you put them in the
> > middle of big dependency.
> 
> Some of these cleanup patches can't be applied separately without a need
> to make a rebase. I think it should be more preferred to have all the
> patches within a single series.
> 
> I'll try to reorder the patches in v7 if this will ease the review, thanks.

If feasible, that would be good. Thanks.

Best regards,
Krzysztof
Chanwoo Choi Nov. 1, 2020, 1:31 p.m. UTC | #38
Hi Dmitry,

This patch contains the three features as following:
1. Use interconnect interface for controlling the clock instead of
controlling it direclty
2. Use EMC_STAT instead of IMC_STAT
3. Change polling_interval and upthreshold for more fast responsiveness

I think you need to make the separate patches for each role.
But, if it is difficult or not proper to split out 1,2 roles, you can
make two patches for 1,2 and 3 roles.

Also, if you want to get more responsiveness, you could use delayed timer
instead of deferrable timer by editing the devfreq_dev_profile structure.

Regards,
Chanwoo Choi



On Mon, Oct 26, 2020 at 7:21 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> External (EMC) and Internal Memory Controllers (IMC) have a nearly
> identical statistics gathering module. This patch switches driver to use
> EMC_STAT instead of IMC_STAT and adds device-tree support which brings ICC
> support and makes driver to use bandwidth OPPs defined in device-tree.
>
> The previous tegra20-devfreq variant was depending on presence of both
> EMC and IMC drivers simultaneously because it wasn't apparent how to use
> EMC_STAT properly back in the day. Dependency on the IMC driver is gone
> after this patch.
>
> The older variant of the devfreq driver also isn't suitable anymore
> because EMC got support for interconnect framework and DVFS, hence
> tegra20-devfreq shouldn't drive the EMC clock directly, but use OPP
> API for issuing memory bandwidth requests.
>
> The polling interval is changed from 500ms to 30ms in order to improve
> responsiveness of the system in general and because EMC clock is now
> allowed to go lower than before since display driver supports ICC now
> as well.
>
> The parent EMC device is an MFD device now and tegra20-devfreq its
> sub-device. Devfreq driver uses SYSCON API for retrieving regmap of the
> EMC registers from the parent device.
>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/Kconfig           |   1 +
>  drivers/devfreq/tegra20-devfreq.c | 174 +++++++++++++-----------------
>  2 files changed, 75 insertions(+), 100 deletions(-)
>
> diff --git a/drivers/devfreq/Kconfig b/drivers/devfreq/Kconfig
> index 0ee36ae2fa79..1bd225e571df 100644
> --- a/drivers/devfreq/Kconfig
> +++ b/drivers/devfreq/Kconfig
> @@ -126,6 +126,7 @@ config ARM_TEGRA20_DEVFREQ
>         depends on ARCH_TEGRA_2x_SOC || COMPILE_TEST
>         depends on COMMON_CLK
>         select DEVFREQ_GOV_SIMPLE_ONDEMAND
> +       select MFD_SYSCON
>         help
>           This adds the DEVFREQ driver for the Tegra20 family of SoCs.
>           It reads Memory Controller counters and adjusts the operating
> diff --git a/drivers/devfreq/tegra20-devfreq.c b/drivers/devfreq/tegra20-devfreq.c
> index fd801534771d..0a36b085d32a 100644
> --- a/drivers/devfreq/tegra20-devfreq.c
> +++ b/drivers/devfreq/tegra20-devfreq.c
> @@ -7,180 +7,148 @@
>
>  #include <linux/clk.h>
>  #include <linux/devfreq.h>
> -#include <linux/io.h>
>  #include <linux/kernel.h>
>  #include <linux/module.h>
> +#include <linux/mfd/syscon.h>
>  #include <linux/of_device.h>
>  #include <linux/platform_device.h>
>  #include <linux/pm_opp.h>
> +#include <linux/regmap.h>
>  #include <linux/slab.h>
>
> -#include <soc/tegra/mc.h>
> -
>  #include "governor.h"
>
> -#define MC_STAT_CONTROL                                0x90
> -#define MC_STAT_EMC_CLOCK_LIMIT                        0xa0
> -#define MC_STAT_EMC_CLOCKS                     0xa4
> -#define MC_STAT_EMC_CONTROL                    0xa8
> -#define MC_STAT_EMC_COUNT                      0xb8
> +#define EMC_STAT_CONTROL                       0x160
> +#define EMC_STAT_LLMC_CONTROL                  0x178
> +#define EMC_STAT_PWR_CLOCK_LIMIT               0x198
> +#define EMC_STAT_PWR_CLOCKS                    0x19c
> +#define EMC_STAT_PWR_COUNT                     0x1a0
>
> -#define EMC_GATHER_CLEAR                       (1 << 8)
> -#define EMC_GATHER_ENABLE                      (3 << 8)
> +#define EMC_PWR_GATHER_CLEAR                   (1 << 8)
> +#define EMC_PWR_GATHER_DISABLE                 (2 << 8)
> +#define EMC_PWR_GATHER_ENABLE                  (3 << 8)
>
>  struct tegra_devfreq {
> +       struct devfreq_simple_ondemand_data ondemand_data;
> +       struct opp_table *opp_table;
>         struct devfreq *devfreq;
>         struct clk *emc_clock;
> -       void __iomem *regs;
> +       struct regmap *rmap;
>  };
>
>  static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>                                 u32 flags)
>  {
> -       struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> -       struct devfreq *devfreq = tegra->devfreq;
>         struct dev_pm_opp *opp;
> -       unsigned long rate;
> -       int err;
> +       int ret;
>
>         opp = devfreq_recommended_opp(dev, freq, flags);
> -       if (IS_ERR(opp))
> +       if (IS_ERR(opp)) {
> +               dev_err(dev, "failed to find opp for %lu Hz\n", *freq);
>                 return PTR_ERR(opp);
> +       }
>
> -       rate = dev_pm_opp_get_freq(opp);
> +       ret = dev_pm_opp_set_bw(dev, opp);
>         dev_pm_opp_put(opp);
>
> -       err = clk_set_min_rate(tegra->emc_clock, rate);
> -       if (err)
> -               return err;
> -
> -       err = clk_set_rate(tegra->emc_clock, 0);
> -       if (err)
> -               goto restore_min_rate;
> -
> -       return 0;
> -
> -restore_min_rate:
> -       clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
> -
> -       return err;
> +       return ret;
>  }
>
>  static int tegra_devfreq_get_dev_status(struct device *dev,
>                                         struct devfreq_dev_status *stat)
>  {
>         struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> +       u32 count, clocks;
>
> -       /*
> -        * EMC_COUNT returns number of memory events, that number is lower
> -        * than the number of clocks. Conversion ratio of 1/8 results in a
> -        * bit higher bandwidth than actually needed, it is good enough for
> -        * the time being because drivers don't support requesting minimum
> -        * needed memory bandwidth yet.
> -        *
> -        * TODO: adjust the ratio value once relevant drivers will support
> -        * memory bandwidth management.
> -        */
> -       stat->busy_time = readl_relaxed(tegra->regs + MC_STAT_EMC_COUNT);
> -       stat->total_time = readl_relaxed(tegra->regs + MC_STAT_EMC_CLOCKS) / 8;
> -       stat->current_frequency = clk_get_rate(tegra->emc_clock);
> +       /* freeze counters */
> +       regmap_write(tegra->rmap, EMC_STAT_CONTROL, EMC_PWR_GATHER_DISABLE);
> +
> +       /* number of clocks when EMC request was accepted */
> +       regmap_read(tegra->rmap, EMC_STAT_PWR_COUNT, &count);
> +       /* total number of clocks while PWR_GATHER control was set to ENABLE */
> +       regmap_read(tegra->rmap, EMC_STAT_PWR_CLOCKS, &clocks);
>
> -       writel_relaxed(EMC_GATHER_CLEAR, tegra->regs + MC_STAT_CONTROL);
> -       writel_relaxed(EMC_GATHER_ENABLE, tegra->regs + MC_STAT_CONTROL);
> +       /* clear counters and restart */
> +       regmap_write(tegra->rmap, EMC_STAT_CONTROL, EMC_PWR_GATHER_CLEAR);
> +       regmap_write(tegra->rmap, EMC_STAT_CONTROL, EMC_PWR_GATHER_ENABLE);
> +
> +       stat->busy_time = count;
> +       stat->total_time = clocks;
> +       stat->current_frequency = clk_get_rate(tegra->emc_clock);
>
>         return 0;
>  }
>
>  static struct devfreq_dev_profile tegra_devfreq_profile = {
> -       .polling_ms     = 500,
> +       .polling_ms     = 30,
>         .target         = tegra_devfreq_target,
>         .get_dev_status = tegra_devfreq_get_dev_status,
>  };
>
> -static struct tegra_mc *tegra_get_memory_controller(void)
> -{
> -       struct platform_device *pdev;
> -       struct device_node *np;
> -       struct tegra_mc *mc;
> -
> -       np = of_find_compatible_node(NULL, NULL, "nvidia,tegra20-mc-gart");
> -       if (!np)
> -               return ERR_PTR(-ENOENT);
> -
> -       pdev = of_find_device_by_node(np);
> -       of_node_put(np);
> -       if (!pdev)
> -               return ERR_PTR(-ENODEV);
> -
> -       mc = platform_get_drvdata(pdev);
> -       if (!mc)
> -               return ERR_PTR(-EPROBE_DEFER);
> -
> -       return mc;
> -}
> -
>  static int tegra_devfreq_probe(struct platform_device *pdev)
>  {
> +       struct device_node *emc_np = pdev->dev.parent->of_node;
>         struct tegra_devfreq *tegra;
> -       struct tegra_mc *mc;
> -       unsigned long max_rate;
> -       unsigned long rate;
>         int err;
>
> -       mc = tegra_get_memory_controller();
> -       if (IS_ERR(mc)) {
> -               err = PTR_ERR(mc);
> -               dev_err(&pdev->dev, "failed to get memory controller: %d\n",
> -                       err);
> -               return err;
> -       }
> -
>         tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
>         if (!tegra)
>                 return -ENOMEM;
>
>         /* EMC is a system-critical clock that is always enabled */
> -       tegra->emc_clock = devm_clk_get(&pdev->dev, "emc");
> +       tegra->emc_clock = devm_get_clk_from_child(&pdev->dev, emc_np, NULL);
>         if (IS_ERR(tegra->emc_clock))
>                 return dev_err_probe(&pdev->dev, PTR_ERR(tegra->emc_clock),
>                                      "failed to get emc clock\n");
>
> -       tegra->regs = mc->regs;
> -
> -       max_rate = clk_round_rate(tegra->emc_clock, ULONG_MAX);
> +       tegra->rmap = device_node_to_regmap(emc_np);
> +       if (IS_ERR(tegra->rmap))
> +               return dev_err_probe(&pdev->dev, PTR_ERR(tegra->rmap),
> +                                    "failed to get emc regmap\n");
>
> -       for (rate = 0; rate <= max_rate; rate++) {
> -               rate = clk_round_rate(tegra->emc_clock, rate);
> +       tegra->opp_table = dev_pm_opp_get_opp_table(&pdev->dev);
> +       if (IS_ERR(tegra->opp_table))
> +               return dev_err_probe(&pdev->dev, PTR_ERR(tegra->opp_table),
> +                                    "failed to prepare opp table\n");
>
> -               err = dev_pm_opp_add(&pdev->dev, rate, 0);
> -               if (err) {
> -                       dev_err(&pdev->dev, "failed to add opp: %d\n", err);
> -                       goto remove_opps;
> -               }
> +       err = dev_pm_opp_of_add_table(&pdev->dev);
> +       if (err) {
> +               dev_err(&pdev->dev, "failed to add opp table: %d\n", err);
> +               goto put_table;
>         }
>
> +       /*
> +        * PWR_COUNT is 1/2 of PWR_CLOCKS at max, and thus, the up-threshold
> +        * should be less than 50.  Secondly, multiple active memory clients
> +        * may cause over 20 of lost clock cycles due to stalls caused by
> +        * competing memory accesses.  This means that threshold should be
> +        * set to a less than 30 in order to have a properly working governor.
> +        */
> +       tegra->ondemand_data.upthreshold = 20;
> +
>         /*
>          * Reset statistic gathers state, select global bandwidth for the
>          * statistics collection mode and set clocks counter saturation
>          * limit to maximum.
>          */
> -       writel_relaxed(0x00000000, tegra->regs + MC_STAT_CONTROL);
> -       writel_relaxed(0x00000000, tegra->regs + MC_STAT_EMC_CONTROL);
> -       writel_relaxed(0xffffffff, tegra->regs + MC_STAT_EMC_CLOCK_LIMIT);
> +       regmap_write(tegra->rmap, EMC_STAT_CONTROL, 0x00000000);
> +       regmap_write(tegra->rmap, EMC_STAT_LLMC_CONTROL, 0x00000000);
> +       regmap_write(tegra->rmap, EMC_STAT_PWR_CLOCK_LIMIT, 0xffffffff);
>
>         platform_set_drvdata(pdev, tegra);
>
>         tegra->devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile,
> -                                           DEVFREQ_GOV_SIMPLE_ONDEMAND, NULL);
> +                                           DEVFREQ_GOV_SIMPLE_ONDEMAND,
> +                                           &tegra->ondemand_data);
>         if (IS_ERR(tegra->devfreq)) {
>                 err = PTR_ERR(tegra->devfreq);
> -               goto remove_opps;
> +               goto put_table;
>         }
>
>         return 0;
>
> -remove_opps:
> -       dev_pm_opp_remove_all_dynamic(&pdev->dev);
> +put_table:
> +       dev_pm_opp_put_opp_table(tegra->opp_table);
>
>         return err;
>  }
> @@ -190,21 +158,27 @@ static int tegra_devfreq_remove(struct platform_device *pdev)
>         struct tegra_devfreq *tegra = platform_get_drvdata(pdev);
>
>         devfreq_remove_device(tegra->devfreq);
> -       dev_pm_opp_remove_all_dynamic(&pdev->dev);
> +       dev_pm_opp_of_remove_table(&pdev->dev);
> +       dev_pm_opp_put_opp_table(tegra->opp_table);
>
>         return 0;
>  }
>
> +static const struct of_device_id tegra_devfreq_of_match[] = {
> +       { .compatible = "nvidia,tegra20-emc-statistics" },
> +       { },
> +};
> +
>  static struct platform_driver tegra_devfreq_driver = {
>         .probe          = tegra_devfreq_probe,
>         .remove         = tegra_devfreq_remove,
>         .driver         = {
>                 .name   = "tegra20-devfreq",
> +               .of_match_table = tegra_devfreq_of_match,
>         },
>  };
>  module_platform_driver(tegra_devfreq_driver);
>
> -MODULE_ALIAS("platform:tegra20-devfreq");
>  MODULE_AUTHOR("Dmitry Osipenko <digetx@gmail.com>");
>  MODULE_DESCRIPTION("NVIDIA Tegra20 devfreq driver");
>  MODULE_LICENSE("GPL v2");
> --
> 2.27.0
>
Dmitry Osipenko Nov. 1, 2020, 2:12 p.m. UTC | #39
01.11.2020 16:31, Chanwoo Choi пишет:
> Hi Dmitry,
> 
> This patch contains the three features as following:
> 1. Use interconnect interface for controlling the clock instead of
> controlling it direclty
> 2. Use EMC_STAT instead of IMC_STAT
> 3. Change polling_interval and upthreshold for more fast responsiveness
> 
> I think you need to make the separate patches for each role.
> But, if it is difficult or not proper to split out 1,2 roles, you can
> make two patches for 1,2 and 3 roles.

Hello Chanwoo,

We will probably move the Tegra20 EMC_STAT devfreq driver into the
memory driver and remove the older IMC_STAT driver in v7, like it was
suggested by Thierry Reding. This will be a much less invasive code change.

> Also, if you want to get more responsiveness, you could use delayed timer
> instead of deferrable timer by editing the devfreq_dev_profile structure.

Thanks, I'll try the deferrable timer.
Chanwoo Choi Nov. 1, 2020, 2:39 p.m. UTC | #40
Hi Dmitry,

On Mon, Oct 26, 2020 at 7:22 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> This patch moves ACTMON driver away from generating OPP table by itself,
> transitioning it to use the table which comes from device-tree. This
> change breaks compatibility with older device-trees in order to bring
> support for the interconnect framework to the driver. This is a mandatory
> change which needs to be done in order to implement interconnect-based
> memory DVFS. Users of legacy device-trees will get a message telling that
> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
>
> Tested-by: Peter Geis <pgwipeout@gmail.com>
> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/tegra30-devfreq.c | 91 ++++++++++++++++---------------
>  1 file changed, 48 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> index 3f732ab53573..1b0b91a71886 100644
> --- a/drivers/devfreq/tegra30-devfreq.c
> +++ b/drivers/devfreq/tegra30-devfreq.c
> @@ -19,6 +19,8 @@
>  #include <linux/reset.h>
>  #include <linux/workqueue.h>
>
> +#include <soc/tegra/fuse.h>
> +
>  #include "governor.h"
>
>  #define ACTMON_GLB_STATUS                                      0x0
> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
>
>  struct tegra_devfreq {
>         struct devfreq          *devfreq;
> +       struct opp_table        *opp_table;
>
>         struct reset_control    *reset;
>         struct clk              *clock;
> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
>  static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>                                 u32 flags)
>  {
> -       struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> -       struct devfreq *devfreq = tegra->devfreq;
>         struct dev_pm_opp *opp;
> -       unsigned long rate;
> -       int err;
> +       int ret;
>
>         opp = devfreq_recommended_opp(dev, freq, flags);
>         if (IS_ERR(opp)) {
> -               dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
> +               dev_err(dev, "failed to find opp for %lu Hz\n", *freq);
>                 return PTR_ERR(opp);
>         }
> -       rate = dev_pm_opp_get_freq(opp);
> -       dev_pm_opp_put(opp);
> -
> -       err = clk_set_min_rate(tegra->emc_clock, rate * KHZ);
> -       if (err)
> -               return err;
> -
> -       err = clk_set_rate(tegra->emc_clock, 0);
> -       if (err)
> -               goto restore_min_rate;
>
> -       return 0;
> -
> -restore_min_rate:
> -       clk_set_min_rate(tegra->emc_clock, devfreq->previous_freq);
> +       ret = dev_pm_opp_set_bw(dev, opp);
> +       dev_pm_opp_put(opp);
>
> -       return err;
> +       return ret;
>  }
>
>  static int tegra_devfreq_get_dev_status(struct device *dev,
> @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
>         stat->private_data = tegra;
>
>         /* The below are to be used by the other governors */
> -       stat->current_frequency = cur_freq;
> +       stat->current_frequency = cur_freq * KHZ;

I can't find any change related to the frequency unit on this patch.
Do you fix the previous bug of the frequency unit?

>
>         actmon_dev = &tegra->devices[MCALL];
>
> @@ -705,7 +693,7 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
>                 target_freq = max(target_freq, dev->target_freq);
>         }
>
> -       *freq = target_freq;
> +       *freq = target_freq * KHZ;

ditto.

>
>         return 0;
>  }
> @@ -773,13 +761,22 @@ static struct devfreq_governor tegra_devfreq_governor = {
>
>  static int tegra_devfreq_probe(struct platform_device *pdev)
>  {
> +       u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
>         struct tegra_devfreq_device *dev;
>         struct tegra_devfreq *tegra;
> +       struct opp_table *opp_table;
>         struct devfreq *devfreq;
>         unsigned int i;
>         long rate;
>         int err;
>
> +       /* legacy device-trees don't have OPP table and must be updated */
> +       if (!device_property_present(&pdev->dev, "operating-points-v2")) {
> +               dev_err(&pdev->dev, "OPP table not found, cannot continue\n");
> +               dev_err(&pdev->dev, "please update your device tree\n");
> +               return -ENODEV;
> +       }

As you mentioned, it breaks the old dtb. I have no objection to improving
the driver. Instead, you need confirmation from the Devicetree maintainer.

And,
I recommend that you use dev_pm_opp_of_get_opp_desc_node(&pdev->dev)
to check whether a device contains opp-table or not.

> +
>         tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
>         if (!tegra)
>                 return -ENOMEM;
> @@ -821,11 +818,29 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>                 return err;
>         }
>
> +       tegra->opp_table = dev_pm_opp_get_opp_table(&pdev->dev);
> +       if (IS_ERR(tegra->opp_table))
> +               return dev_err_probe(&pdev->dev, PTR_ERR(tegra->opp_table),
> +                                    "Failed to prepare OPP table\n");

As I knew, each device can contain the opp_table on devicetree.
It means that opp_table has not depended to another device driver.
Did you see this exception case with EPROBE_DEFER error?

> +
> +       opp_table = dev_pm_opp_set_supported_hw(&pdev->dev, &hw_version, 1);
> +       err = PTR_ERR_OR_ZERO(opp_table);
> +       if (err) {
> +               dev_err(&pdev->dev, "Failed to set supported HW: %d\n", err);
> +               goto put_table;
> +       }
> +
> +       err = dev_pm_opp_of_add_table(&pdev->dev);
> +       if (err) {
> +               dev_err(&pdev->dev, "Failed to add OPP table: %d\n", err);
> +               goto put_hw;
> +       }
> +
>         err = clk_prepare_enable(tegra->clock);
>         if (err) {
>                 dev_err(&pdev->dev,
>                         "Failed to prepare and enable ACTMON clock\n");
> -               return err;
> +               goto remove_table;
>         }
>
>         err = reset_control_reset(tegra->reset);
> @@ -849,23 +864,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>                 dev->regs = tegra->regs + dev->config->offset;
>         }
>
> -       for (rate = 0; rate <= tegra->max_freq * KHZ; rate++) {
> -               rate = clk_round_rate(tegra->emc_clock, rate);
> -
> -               if (rate < 0) {
> -                       dev_err(&pdev->dev,
> -                               "Failed to round clock rate: %ld\n", rate);
> -                       err = rate;
> -                       goto remove_opps;
> -               }
> -
> -               err = dev_pm_opp_add(&pdev->dev, rate / KHZ, 0);
> -               if (err) {
> -                       dev_err(&pdev->dev, "Failed to add OPP: %d\n", err);
> -                       goto remove_opps;
> -               }
> -       }
> -
>         platform_set_drvdata(pdev, tegra);
>
>         tegra->clk_rate_change_nb.notifier_call = tegra_actmon_clk_notify_cb;
> @@ -881,7 +879,6 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>         }
>
>         tegra_devfreq_profile.initial_freq = clk_get_rate(tegra->emc_clock);
> -       tegra_devfreq_profile.initial_freq /= KHZ;
>
>         devfreq = devfreq_add_device(&pdev->dev, &tegra_devfreq_profile,
>                                      "tegra_actmon", NULL);
> @@ -901,6 +898,12 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>         reset_control_reset(tegra->reset);
>  disable_clk:
>         clk_disable_unprepare(tegra->clock);
> +remove_table:
> +       dev_pm_opp_of_remove_table(&pdev->dev);
> +put_hw:
> +       dev_pm_opp_put_supported_hw(tegra->opp_table);
> +put_table:
> +       dev_pm_opp_put_opp_table(tegra->opp_table);
>
>         return err;
>  }
> @@ -912,11 +915,13 @@ static int tegra_devfreq_remove(struct platform_device *pdev)
>         devfreq_remove_device(tegra->devfreq);
>         devfreq_remove_governor(&tegra_devfreq_governor);
>
> -       dev_pm_opp_remove_all_dynamic(&pdev->dev);
> -
>         reset_control_reset(tegra->reset);
>         clk_disable_unprepare(tegra->clock);
>
> +       dev_pm_opp_of_remove_table(&pdev->dev);
> +       dev_pm_opp_put_supported_hw(tegra->opp_table);
> +       dev_pm_opp_put_opp_table(tegra->opp_table);
> +
>         return 0;
>  }
>
> --
> 2.27.0
>
Chanwoo Choi Nov. 1, 2020, 2:44 p.m. UTC | #41
Hi Dmitry,

On Mon, Oct 26, 2020 at 7:22 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> This patch moves ACTMON driver away from generating OPP table by itself,
> transitioning it to use the table which comes from device-tree. This
> change breaks compatibility with older device-trees in order to bring
> support for the interconnect framework to the driver. This is a mandatory
> change which needs to be done in order to implement interconnect-based
> memory DVFS. Users of legacy device-trees will get a message telling that
> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
>
> Tested-by: Peter Geis <pgwipeout@gmail.com>
> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/tegra30-devfreq.c | 91 ++++++++++++++++---------------
>  1 file changed, 48 insertions(+), 43 deletions(-)
>
> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> index 3f732ab53573..1b0b91a71886 100644
> --- a/drivers/devfreq/tegra30-devfreq.c
> +++ b/drivers/devfreq/tegra30-devfreq.c
> @@ -19,6 +19,8 @@
>  #include <linux/reset.h>
>  #include <linux/workqueue.h>
>
> +#include <soc/tegra/fuse.h>
> +

This patch touches the OPP. Is it related to this change?

>  #include "governor.h"
>
>  #define ACTMON_GLB_STATUS                                      0x0
> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
>
>  struct tegra_devfreq {
>         struct devfreq          *devfreq;
> +       struct opp_table        *opp_table;
>
>         struct reset_control    *reset;
>         struct clk              *clock;
> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
>  static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>                                 u32 flags)
>  {
> -       struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> -       struct devfreq *devfreq = tegra->devfreq;
>         struct dev_pm_opp *opp;
> -       unsigned long rate;
> -       int err;
> +       int ret;
>
>         opp = devfreq_recommended_opp(dev, freq, flags);
>         if (IS_ERR(opp)) {
> -               dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
> +               dev_err(dev, "failed to find opp for %lu Hz\n", *freq);

You used the 'Failed to' format in almost every error case.
Don't need to change it.
(snip)

Best Regards,
Chanwoo Choi
Chanwoo Choi Nov. 1, 2020, 3:20 p.m. UTC | #42
Hi Dmitry,

I added the review about 'ARRAY_SIZE(tegra124_device_configs)'.
Except for this, it looks good to me.

On Mon, Oct 26, 2020 at 7:21 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> Previously we were using count-weight of the T124 for T30 in order to
> get EMC clock rate that was reasonable for T30. In fact the count-weight
> should be x2 times smaller on T30, but then devfreq was producing a bit
> too low EMC clock rate for ISO memory clients, like display controller
> for example.
>
> Now both Tegra ACTMON and Tegra DRM display drivers support interconnect
> framework and display driver tells to ICC what a minimum memory bandwidth
> is needed, preventing FIFO underflows. Thus, now we can use a proper
> count-weight value for Tegra30 and MC_ALL device config needs a bit more
> aggressive boosting.
>
> This patch adds a separate ACTMON driver configuration that is specific
> to Tegra30.
>
> Tested-by: Peter Geis <pgwipeout@gmail.com>
> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---
>  drivers/devfreq/tegra30-devfreq.c | 68 ++++++++++++++++++++++++-------
>  1 file changed, 54 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> index 1b0b91a71886..95aba89eae88 100644
> --- a/drivers/devfreq/tegra30-devfreq.c
> +++ b/drivers/devfreq/tegra30-devfreq.c
> @@ -57,13 +57,6 @@
>  #define ACTMON_BELOW_WMARK_WINDOW                              3
>  #define ACTMON_BOOST_FREQ_STEP                                 16000
>
> -/*
> - * Activity counter is incremented every 256 memory transactions, and each
> - * transaction takes 4 EMC clocks for Tegra124; So the COUNT_WEIGHT is
> - * 4 * 256 = 1024.
> - */
> -#define ACTMON_COUNT_WEIGHT                                    0x400
> -
>  /*
>   * ACTMON_AVERAGE_WINDOW_LOG2: default value for @DEV_CTRL_K_VAL, which
>   * translates to 2 ^ (K_VAL + 1). ex: 2 ^ (6 + 1) = 128
> @@ -111,7 +104,7 @@ enum tegra_actmon_device {
>         MCCPU,
>  };
>
> -static const struct tegra_devfreq_device_config actmon_device_configs[] = {
> +static const struct tegra_devfreq_device_config tegra124_device_configs[] = {
>         {
>                 /* MCALL: All memory accesses (including from the CPUs) */
>                 .offset = 0x1c0,
> @@ -133,6 +126,28 @@ static const struct tegra_devfreq_device_config actmon_device_configs[] = {
>         },
>  };
>
> +static const struct tegra_devfreq_device_config tegra30_device_configs[] = {
> +       {
> +               /* MCALL: All memory accesses (including from the CPUs) */
> +               .offset = 0x1c0,
> +               .irq_mask = 1 << 26,
> +               .boost_up_coeff = 200,
> +               .boost_down_coeff = 50,
> +               .boost_up_threshold = 20,
> +               .boost_down_threshold = 10,
> +       },
> +       {
> +               /* MCCPU: memory accesses from the CPUs */
> +               .offset = 0x200,
> +               .irq_mask = 1 << 25,
> +               .boost_up_coeff = 800,
> +               .boost_down_coeff = 40,
> +               .boost_up_threshold = 27,
> +               .boost_down_threshold = 10,
> +               .avg_dependency_threshold = 16000, /* 16MHz in kHz units */
> +       },
> +};
> +
>  /**
>   * struct tegra_devfreq_device - state specific to an ACTMON device
>   *
> @@ -155,6 +170,12 @@ struct tegra_devfreq_device {
>         unsigned long target_freq;
>  };
>
> +struct tegra_devfreq_soc_data {
> +       const struct tegra_devfreq_device_config *configs;
> +       /* Weight value for count measurements */
> +       unsigned int count_weight;
> +};
> +
>  struct tegra_devfreq {
>         struct devfreq          *devfreq;
>         struct opp_table        *opp_table;
> @@ -171,11 +192,13 @@ struct tegra_devfreq {
>         struct delayed_work     cpufreq_update_work;
>         struct notifier_block   cpu_rate_change_nb;
>
> -       struct tegra_devfreq_device devices[ARRAY_SIZE(actmon_device_configs)];
> +       struct tegra_devfreq_device devices[ARRAY_SIZE(tegra124_device_configs)];

When there is one tegra_devfreq_device_config[] array, this style is not wrong.
But, after adding the specific config array for each device,
you better specify the correct array size for each case.

Even if there is no runtime error on tegra30 soc, it is not proper to use
ARRAY_SIZE(tegra124_device_configs). You can allocate the array
of tegra_devfreq_device[] or use fixed the array size (2).

>
>         unsigned int            irq;
>
>         bool                    started;
> +
> +       const struct tegra_devfreq_soc_data *soc;
>  };
>
>  struct tegra_actmon_emc_ratio {
> @@ -488,7 +511,7 @@ static void tegra_actmon_configure_device(struct tegra_devfreq *tegra,
>         tegra_devfreq_update_avg_wmark(tegra, dev);
>         tegra_devfreq_update_wmark(tegra, dev);
>
> -       device_writel(dev, ACTMON_COUNT_WEIGHT, ACTMON_DEV_COUNT_WEIGHT);
> +       device_writel(dev, tegra->soc->count_weight, ACTMON_DEV_COUNT_WEIGHT);
>         device_writel(dev, ACTMON_INTR_STATUS_CLEAR, ACTMON_DEV_INTR_STATUS);
>
>         val |= ACTMON_DEV_CTRL_ENB_PERIODIC;
> @@ -781,6 +804,8 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>         if (!tegra)
>                 return -ENOMEM;
>
> +       tegra->soc = of_device_get_match_data(&pdev->dev);
> +
>         tegra->regs = devm_platform_ioremap_resource(pdev, 0);
>         if (IS_ERR(tegra->regs))
>                 return PTR_ERR(tegra->regs);
> @@ -858,9 +883,9 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>
>         tegra->max_freq = rate / KHZ;
>
> -       for (i = 0; i < ARRAY_SIZE(actmon_device_configs); i++) {
> +       for (i = 0; i < ARRAY_SIZE(tegra124_device_configs); i++) {

ditto. Use ARRARY_SIZE(soc->configs) instead of
ARRAY_SIZE(tegra124_device_configs).

>                 dev = tegra->devices + i;
> -               dev->config = actmon_device_configs + i;
> +               dev->config = tegra->soc->configs + i;
>                 dev->regs = tegra->regs + dev->config->offset;
>         }
>
> @@ -925,9 +950,24 @@ static int tegra_devfreq_remove(struct platform_device *pdev)
>         return 0;
>  }
>
> +static const struct tegra_devfreq_soc_data tegra124_soc = {
> +       .configs = tegra124_device_configs,
> +
> +       /*
> +        * Activity counter is incremented every 256 memory transactions,
> +        * and each transaction takes 4 EMC clocks.
> +        */
> +       .count_weight = 4 * 256,
> +};
> +
> +static const struct tegra_devfreq_soc_data tegra30_soc = {
> +       .configs = tegra30_device_configs,
> +       .count_weight = 2 * 256,
> +};
> +
>  static const struct of_device_id tegra_devfreq_of_match[] = {
> -       { .compatible = "nvidia,tegra30-actmon" },
> -       { .compatible = "nvidia,tegra124-actmon" },
> +       { .compatible = "nvidia,tegra30-actmon",  .data = &tegra30_soc, },
> +       { .compatible = "nvidia,tegra124-actmon", .data = &tegra124_soc, },
>         { },
>  };
>
> --
> 2.27.0
>
Dmitry Osipenko Nov. 1, 2020, 3:23 p.m. UTC | #43
01.11.2020 17:39, Chanwoo Choi пишет:
> Hi Dmitry,
> 
> On Mon, Oct 26, 2020 at 7:22 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> This patch moves ACTMON driver away from generating OPP table by itself,
>> transitioning it to use the table which comes from device-tree. This
>> change breaks compatibility with older device-trees in order to bring
>> support for the interconnect framework to the driver. This is a mandatory
>> change which needs to be done in order to implement interconnect-based
>> memory DVFS. Users of legacy device-trees will get a message telling that
>> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
>> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
>>
>> Tested-by: Peter Geis <pgwipeout@gmail.com>
>> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
...
>>  static int tegra_devfreq_get_dev_status(struct device *dev,
>> @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
>>         stat->private_data = tegra;
>>
>>         /* The below are to be used by the other governors */
>> -       stat->current_frequency = cur_freq;
>> +       stat->current_frequency = cur_freq * KHZ;
> 
> I can't find any change related to the frequency unit on this patch.
> Do you fix the previous bug of the frequency unit?

Previously, OPP entries that were generated by the driver used KHz
units. Now, OPP entries are coming from a device-tree and they have Hz
units. This driver operates with KHz internally, hence we need to
convert the units now.

>>
>>         actmon_dev = &tegra->devices[MCALL];
>>
>> @@ -705,7 +693,7 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
>>                 target_freq = max(target_freq, dev->target_freq);
>>         }
>>
>> -       *freq = target_freq;
>> +       *freq = target_freq * KHZ;
> 
> ditto.
> 
>>
>>         return 0;
>>  }
>> @@ -773,13 +761,22 @@ static struct devfreq_governor tegra_devfreq_governor = {
>>
>>  static int tegra_devfreq_probe(struct platform_device *pdev)
>>  {
>> +       u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
>>         struct tegra_devfreq_device *dev;
>>         struct tegra_devfreq *tegra;
>> +       struct opp_table *opp_table;
>>         struct devfreq *devfreq;
>>         unsigned int i;
>>         long rate;
>>         int err;
>>
>> +       /* legacy device-trees don't have OPP table and must be updated */
>> +       if (!device_property_present(&pdev->dev, "operating-points-v2")) {
>> +               dev_err(&pdev->dev, "OPP table not found, cannot continue\n");
>> +               dev_err(&pdev->dev, "please update your device tree\n");
>> +               return -ENODEV;
>> +       }
> 
> As you mentioned, it breaks the old dtb. I have no objection to improving
> the driver. Instead, you need confirmation from the Devicetree maintainer.

Older DTBs will continue to work, but devfreq driver won't. We already
did this for other Tegra drivers before (CPUFREQ driver for example) and
Rob Herring confirmed that there is no problem here.

> And,
> I recommend that you use dev_pm_opp_of_get_opp_desc_node(&pdev->dev)
> to check whether a device contains opp-table or not.

I'm not sure what are the benefits, this will make code less
expressive/readable and we will need to add extra of_node_put(), which
device_property_present() handles for us.

Could you please give the rationale?

>> +
>>         tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
>>         if (!tegra)
>>                 return -ENOMEM;
>> @@ -821,11 +818,29 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
>>                 return err;
>>         }
>>
>> +       tegra->opp_table = dev_pm_opp_get_opp_table(&pdev->dev);
>> +       if (IS_ERR(tegra->opp_table))
>> +               return dev_err_probe(&pdev->dev, PTR_ERR(tegra->opp_table),
>> +                                    "Failed to prepare OPP table\n");
> 
> As I knew, each device can contain the opp_table on devicetree.
> It means that opp_table has not depended to another device driver.
> Did you see this exception case with EPROBE_DEFER error?

OPP core will try to grab the clock reference for the table and it may
cause EPROBE_DEFER. Although, it shouldn't happen here because we have
devm_clk_get() before the get_opp_table(), hence seems the deferred
probe indeed shouldn't happen in this case.
Dmitry Osipenko Nov. 1, 2020, 3:24 p.m. UTC | #44
01.11.2020 17:44, Chanwoo Choi пишет:
> Hi Dmitry,
> 
> On Mon, Oct 26, 2020 at 7:22 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> This patch moves ACTMON driver away from generating OPP table by itself,
>> transitioning it to use the table which comes from device-tree. This
>> change breaks compatibility with older device-trees in order to bring
>> support for the interconnect framework to the driver. This is a mandatory
>> change which needs to be done in order to implement interconnect-based
>> memory DVFS. Users of legacy device-trees will get a message telling that
>> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
>> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
>>
>> Tested-by: Peter Geis <pgwipeout@gmail.com>
>> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
>> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
>> ---
>>  drivers/devfreq/tegra30-devfreq.c | 91 ++++++++++++++++---------------
>>  1 file changed, 48 insertions(+), 43 deletions(-)
>>
>> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
>> index 3f732ab53573..1b0b91a71886 100644
>> --- a/drivers/devfreq/tegra30-devfreq.c
>> +++ b/drivers/devfreq/tegra30-devfreq.c
>> @@ -19,6 +19,8 @@
>>  #include <linux/reset.h>
>>  #include <linux/workqueue.h>
>>
>> +#include <soc/tegra/fuse.h>
>> +
> 
> This patch touches the OPP. Is it related to this change?

Yes, this is needed for the dev_pm_opp_set_supported_hw().

>>  #include "governor.h"
>>
>>  #define ACTMON_GLB_STATUS                                      0x0
>> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
>>
>>  struct tegra_devfreq {
>>         struct devfreq          *devfreq;
>> +       struct opp_table        *opp_table;
>>
>>         struct reset_control    *reset;
>>         struct clk              *clock;
>> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
>>  static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
>>                                 u32 flags)
>>  {
>> -       struct tegra_devfreq *tegra = dev_get_drvdata(dev);
>> -       struct devfreq *devfreq = tegra->devfreq;
>>         struct dev_pm_opp *opp;
>> -       unsigned long rate;
>> -       int err;
>> +       int ret;
>>
>>         opp = devfreq_recommended_opp(dev, freq, flags);
>>         if (IS_ERR(opp)) {
>> -               dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
>> +               dev_err(dev, "failed to find opp for %lu Hz\n", *freq);
> 
> You used the 'Failed to' format in almost every error case.
> Don't need to change it.
> (snip)

Good catch, thanks.
Chanwoo Choi Nov. 1, 2020, 3:44 p.m. UTC | #45
Hi Dmitry,

On Mon, Nov 2, 2020 at 12:23 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 01.11.2020 17:39, Chanwoo Choi пишет:
> > Hi Dmitry,
> >
> > On Mon, Oct 26, 2020 at 7:22 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> >>
> >> This patch moves ACTMON driver away from generating OPP table by itself,
> >> transitioning it to use the table which comes from device-tree. This
> >> change breaks compatibility with older device-trees in order to bring
> >> support for the interconnect framework to the driver. This is a mandatory
> >> change which needs to be done in order to implement interconnect-based
> >> memory DVFS. Users of legacy device-trees will get a message telling that
> >> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
> >> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
> >>
> >> Tested-by: Peter Geis <pgwipeout@gmail.com>
> >> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> ...
> >>  static int tegra_devfreq_get_dev_status(struct device *dev,
> >> @@ -655,7 +643,7 @@ static int tegra_devfreq_get_dev_status(struct device *dev,
> >>         stat->private_data = tegra;
> >>
> >>         /* The below are to be used by the other governors */
> >> -       stat->current_frequency = cur_freq;
> >> +       stat->current_frequency = cur_freq * KHZ;
> >
> > I can't find any change related to the frequency unit on this patch.
> > Do you fix the previous bug of the frequency unit?
>
> Previously, OPP entries that were generated by the driver used KHz
> units. Now, OPP entries are coming from a device-tree and they have Hz
> units. This driver operates with KHz internally, hence we need to
> convert the units now.

OK. Thanks.

>
> >>
> >>         actmon_dev = &tegra->devices[MCALL];
> >>
> >> @@ -705,7 +693,7 @@ static int tegra_governor_get_target(struct devfreq *devfreq,
> >>                 target_freq = max(target_freq, dev->target_freq);
> >>         }
> >>
> >> -       *freq = target_freq;
> >> +       *freq = target_freq * KHZ;
> >
> > ditto.
> >
> >>
> >>         return 0;
> >>  }
> >> @@ -773,13 +761,22 @@ static struct devfreq_governor tegra_devfreq_governor = {
> >>
> >>  static int tegra_devfreq_probe(struct platform_device *pdev)
> >>  {
> >> +       u32 hw_version = BIT(tegra_sku_info.soc_speedo_id);
> >>         struct tegra_devfreq_device *dev;
> >>         struct tegra_devfreq *tegra;
> >> +       struct opp_table *opp_table;
> >>         struct devfreq *devfreq;
> >>         unsigned int i;
> >>         long rate;
> >>         int err;
> >>
> >> +       /* legacy device-trees don't have OPP table and must be updated */
> >> +       if (!device_property_present(&pdev->dev, "operating-points-v2")) {
> >> +               dev_err(&pdev->dev, "OPP table not found, cannot continue\n");
> >> +               dev_err(&pdev->dev, "please update your device tree\n");
> >> +               return -ENODEV;
> >> +       }
> >
> > As you mentioned, it breaks the old dtb. I have no objection to improving
> > the driver. Instead, you need confirmation from the Devicetree maintainer.
>
> Older DTBs will continue to work, but devfreq driver won't. We already
> did this for other Tegra drivers before (CPUFREQ driver for example) and
> Rob Herring confirmed that there is no problem here.

OK.

>
> > And,
> > I recommend that you use dev_pm_opp_of_get_opp_desc_node(&pdev->dev)
> > to check whether a device contains opp-table or not.
>
> I'm not sure what are the benefits, this will make code less
> expressive/readable and we will need to add extra of_node_put(), which
> device_property_present() handles for us.
>
> Could you please give the rationale?

IMO, 'operating-points-v2' word was defined on OPP core. I think that
the external user
of OPP better to use the public helper function instead of using the
interval definition
or value of OPP core directly. Basically, I prefer the provided helper
function if there.
But, it is not critical and doesn't affect the operation. If you want
to keep, it is ok.

>
> >> +
> >>         tegra = devm_kzalloc(&pdev->dev, sizeof(*tegra), GFP_KERNEL);
> >>         if (!tegra)
> >>                 return -ENOMEM;
> >> @@ -821,11 +818,29 @@ static int tegra_devfreq_probe(struct platform_device *pdev)
> >>                 return err;
> >>         }
> >>
> >> +       tegra->opp_table = dev_pm_opp_get_opp_table(&pdev->dev);
> >> +       if (IS_ERR(tegra->opp_table))
> >> +               return dev_err_probe(&pdev->dev, PTR_ERR(tegra->opp_table),
> >> +                                    "Failed to prepare OPP table\n");
> >
> > As I knew, each device can contain the opp_table on devicetree.
> > It means that opp_table has not depended to another device driver.
> > Did you see this exception case with EPROBE_DEFER error?
>
> OPP core will try to grab the clock reference for the table and it may
> cause EPROBE_DEFER. Although, it shouldn't happen here because we have
> devm_clk_get() before the get_opp_table(), hence seems the deferred
> probe indeed shouldn't happen in this case.

It is my missing point. If there, you're right.
Could you provide the code point to check the clock reference on the OPP core?
Chanwoo Choi Nov. 1, 2020, 3:45 p.m. UTC | #46
Hi Dmitry,

On Mon, Nov 2, 2020 at 12:24 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 01.11.2020 17:44, Chanwoo Choi пишет:
> > Hi Dmitry,
> >
> > On Mon, Oct 26, 2020 at 7:22 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> >>
> >> This patch moves ACTMON driver away from generating OPP table by itself,
> >> transitioning it to use the table which comes from device-tree. This
> >> change breaks compatibility with older device-trees in order to bring
> >> support for the interconnect framework to the driver. This is a mandatory
> >> change which needs to be done in order to implement interconnect-based
> >> memory DVFS. Users of legacy device-trees will get a message telling that
> >> theirs DT needs to be upgraded. Now ACTMON issues memory bandwidth request
> >> using dev_pm_opp_set_bw(), instead of driving EMC clock rate directly.
> >>
> >> Tested-by: Peter Geis <pgwipeout@gmail.com>
> >> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> >> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> >> ---
> >>  drivers/devfreq/tegra30-devfreq.c | 91 ++++++++++++++++---------------
> >>  1 file changed, 48 insertions(+), 43 deletions(-)
> >>
> >> diff --git a/drivers/devfreq/tegra30-devfreq.c b/drivers/devfreq/tegra30-devfreq.c
> >> index 3f732ab53573..1b0b91a71886 100644
> >> --- a/drivers/devfreq/tegra30-devfreq.c
> >> +++ b/drivers/devfreq/tegra30-devfreq.c
> >> @@ -19,6 +19,8 @@
> >>  #include <linux/reset.h>
> >>  #include <linux/workqueue.h>
> >>
> >> +#include <soc/tegra/fuse.h>
> >> +
> >
> > This patch touches the OPP. Is it related to this change?
>
> Yes, this is needed for the dev_pm_opp_set_supported_hw().

OK.

>
> >>  #include "governor.h"
> >>
> >>  #define ACTMON_GLB_STATUS                                      0x0
> >> @@ -155,6 +157,7 @@ struct tegra_devfreq_device {
> >>
> >>  struct tegra_devfreq {
> >>         struct devfreq          *devfreq;
> >> +       struct opp_table        *opp_table;
> >>
> >>         struct reset_control    *reset;
> >>         struct clk              *clock;
> >> @@ -612,34 +615,19 @@ static void tegra_actmon_stop(struct tegra_devfreq *tegra)
> >>  static int tegra_devfreq_target(struct device *dev, unsigned long *freq,
> >>                                 u32 flags)
> >>  {
> >> -       struct tegra_devfreq *tegra = dev_get_drvdata(dev);
> >> -       struct devfreq *devfreq = tegra->devfreq;
> >>         struct dev_pm_opp *opp;
> >> -       unsigned long rate;
> >> -       int err;
> >> +       int ret;
> >>
> >>         opp = devfreq_recommended_opp(dev, freq, flags);
> >>         if (IS_ERR(opp)) {
> >> -               dev_err(dev, "Failed to find opp for %lu Hz\n", *freq);
> >> +               dev_err(dev, "failed to find opp for %lu Hz\n", *freq);
> >
> > You used the 'Failed to' format in almost every error case.
> > Don't need to change it.
> > (snip)
>
> Good catch, thanks.
Dmitry Osipenko Nov. 1, 2020, 3:49 p.m. UTC | #47
01.11.2020 18:44, Chanwoo Choi пишет:
>> OPP core will try to grab the clock reference for the table and it may
>> cause EPROBE_DEFER. Although, it shouldn't happen here because we have
>> devm_clk_get() before the get_opp_table(), hence seems the deferred
>> probe indeed shouldn't happen in this case.
> It is my missing point. If there, you're right.
> Could you provide the code point to check the clock reference on the OPP core?

Please see it here:

https://elixir.bootlin.com/linux/v5.10-rc1/source/drivers/opp/core.c#L1101
Chanwoo Choi Nov. 1, 2020, 3:57 p.m. UTC | #48
On Mon, Nov 2, 2020 at 12:49 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 01.11.2020 18:44, Chanwoo Choi пишет:
> >> OPP core will try to grab the clock reference for the table and it may
> >> cause EPROBE_DEFER. Although, it shouldn't happen here because we have
> >> devm_clk_get() before the get_opp_table(), hence seems the deferred
> >> probe indeed shouldn't happen in this case.
> > It is my missing point. If there, you're right.
> > Could you provide the code point to check the clock reference on the OPP core?
>
> Please see it here:
>
> https://elixir.bootlin.com/linux/v5.10-rc1/source/drivers/opp/core.c#L1101

Thanks. It seems that if device tree node contains the any node,
it is not EPROBE_DEFER because of just using "clk_get(dev, NULL)".

The patch[1] used the 'dev_err_probe' for getting the "emc" clock.
Do you need to check it more?

[1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-next&id=09d56d92ad25b58113f4ec677e9b1ac1e2c3072b
Dmitry Osipenko Nov. 2, 2020, 7:58 p.m. UTC | #49
01.11.2020 18:57, Chanwoo Choi пишет:
> On Mon, Nov 2, 2020 at 12:49 AM Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> 01.11.2020 18:44, Chanwoo Choi пишет:
>>>> OPP core will try to grab the clock reference for the table and it may
>>>> cause EPROBE_DEFER. Although, it shouldn't happen here because we have
>>>> devm_clk_get() before the get_opp_table(), hence seems the deferred
>>>> probe indeed shouldn't happen in this case.
>>> It is my missing point. If there, you're right.
>>> Could you provide the code point to check the clock reference on the OPP core?
>>
>> Please see it here:
>>
>> https://elixir.bootlin.com/linux/v5.10-rc1/source/drivers/opp/core.c#L1101
> 
> Thanks. It seems that if device tree node contains the any node,
> it is not EPROBE_DEFER because of just using "clk_get(dev, NULL)".
> 
> The patch[1] used the 'dev_err_probe' for getting the "emc" clock.
> Do you need to check it more?
> 
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux.git/commit/?h=devfreq-next&id=09d56d92ad25b58113f4ec677e9b1ac1e2c3072b

It should be safe to assume that the EPROBE_DEFER won't happen for
dev_pm_opp_get_opp_table(). I'll improve it in v7, thanks.
Dmitry Osipenko Nov. 2, 2020, 8 p.m. UTC | #50
01.11.2020 18:44, Chanwoo Choi пишет:
>>> I recommend that you use dev_pm_opp_of_get_opp_desc_node(&pdev->dev)
>>> to check whether a device contains opp-table or not.
>> I'm not sure what are the benefits, this will make code less
>> expressive/readable and we will need to add extra of_node_put(), which
>> device_property_present() handles for us.
>>
>> Could you please give the rationale?
> IMO, 'operating-points-v2' word was defined on OPP core. I think that
> the external user
> of OPP better to use the public helper function instead of using the
> interval definition
> or value of OPP core directly. Basically, I prefer the provided helper
> function if there.
> But, it is not critical and doesn't affect the operation. If you want
> to keep, it is ok.
> 

I'll prefer to keep it since it's better for the readability of the
code, thanks.
Dmitry Osipenko Nov. 2, 2020, 8:08 p.m. UTC | #51
01.11.2020 17:12, Dmitry Osipenko пишет:
...
> We will probably move the Tegra20 EMC_STAT devfreq driver into the
> memory driver and remove the older IMC_STAT driver in v7, like it was
> suggested by Thierry Reding. This will be a much less invasive code change.
> 
>> Also, if you want to get more responsiveness, you could use delayed timer
>> instead of deferrable timer by editing the devfreq_dev_profile structure.
> 
> Thanks, I'll try the deferrable timer.

I took a brief look at the delayed timer and I think the deferrable
timer should be more a preferred option because this devfreq drive is
more an assistance for the optimal bandwidth selection and it will be
more preferred to keep system idling whenever possible.

My primary concern is the initial performance lag in a case of
multimedia applications. But this will be resolved by hooking up
performance voting to all drivers, once we will get around to it.
Chanwoo Choi Nov. 3, 2020, 2:22 a.m. UTC | #52
On 11/3/20 5:08 AM, Dmitry Osipenko wrote:
> 01.11.2020 17:12, Dmitry Osipenko пишет:
> ...
>> We will probably move the Tegra20 EMC_STAT devfreq driver into the
>> memory driver and remove the older IMC_STAT driver in v7, like it was
>> suggested by Thierry Reding. This will be a much less invasive code change.
>>
>>> Also, if you want to get more responsiveness, you could use delayed timer
>>> instead of deferrable timer by editing the devfreq_dev_profile structure.
>>
>> Thanks, I'll try the deferrable timer.
> 
> I took a brief look at the delayed timer and I think the deferrable
> timer should be more a preferred option because this devfreq drive is
> more an assistance for the optimal bandwidth selection and it will be
> more preferred to keep system idling whenever possible.
> 
> My primary concern is the initial performance lag in a case of
> multimedia applications. But this will be resolved by hooking up
> performance voting to all drivers, once we will get around to it.

OK. You can choice the type of timer on both probe
and via sysfs file on the runtime.