Message ID: 20230612053922.3284394-1-dmitry.baryshkov@linaro.org
Series: ARM: qcom: apq8064: support CPU frequency scaling
On Mon, Jun 12, 2023 at 08:39:04AM +0300, Dmitry Baryshkov wrote:
> Implement CPUFreq support for one of the oldest supported Qualcomm
> platforms, APQ8064. Each core has independent power and frequency
> control. Additionally the L2 cache is scaled to follow the CPU
> frequencies (failure to do so results in strange semi-random crashes).

Hi, can we talk, maybe in private, about this interconnect-cpu thing?

I see you follow the original implementation of the msm_bus, where in practice the correct clock and voltage were set with the use of the kbps value (this was also used to set the fabric clock from nominal to fast).

On ipq806x, and I assume other SoCs, there isn't always a 1:1 map of CPU freq and L2 freq. For example, on ipq8064 we have a max CPU freq of 1.4GHz and an L2 freq of 1.2GHz; on ipq8065 we have a CPU freq of 1.7GHz and an L2 freq of 1.4GHz. (And even that is curious, since I used the debug regs and the CXO crystal to measure the clock in hardware (yes, I ported the very ancient clk-debug to a modern kernel; it works, and I discovered all sorts of things). The L2 (I assume due to a limitation of the HFPLL) can actually never reach that frequency: 1.4GHz in reality results in something like 1.2GHz, and from what I noticed a stable clock is only there with a frequency of at most 1GHz.)

So my idea was to introduce a simple devfreq driver and use the PASSIVE governor, where the possibility was added to link to a CPU frequency and select the L2 frequency (and voltage) with interpolation.

From some old comments in ancient QSDK code it was pointed out that, due to a hw limitation, the secondary CPU can't stay at a high clock if L2 was at the idle clock. (No idea if this is specific to IPQ806x.) So this might be a cause of your crash? (I also have random crashes with L2 scaling, and we are planning to just force the L2 at max frequency.)

But sorry for all of this (maybe) useless info. I checked the other patch and I didn't understand how the different L2 frequencies are declared, and even the voltage. Is this something that will come later? I'm very interested in this implementation.

> Core voltage is controlled through the SAW2 devices, one for each core.
> The L2 has two regulators, vdd-mem and vdd-dig.
>
> Dependency: [1] for interconnect-clk implementation
>
> https://lore.kernel.org/linux-arm-msm/20230512001334.2983048-3-dmitry.baryshkov@linaro.org/
On Mon, Jun 12, 2023 at 04:33:09PM +0300, Dmitry Baryshkov wrote: > On 12/06/2023 12:01, Stephan Gerhold wrote: > > On Mon, Jun 12, 2023 at 08:39:19AM +0300, Dmitry Baryshkov wrote: > > > APQ8064 has 4 speed bins, each of them having from 4 to 6 categorization > > > kinds. Provide tables necessary to handle voltage scaling on this SoC. > > > > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> > > > --- > > > arch/arm/boot/dts/qcom-apq8064.dtsi | 1017 +++++++++++++++++++++++++++ > > > 1 file changed, 1017 insertions(+) > > > > > > diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi > > > index 4ef13f3d702b..f35853b59544 100644 > > > --- a/arch/arm/boot/dts/qcom-apq8064.dtsi > > > +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi > > > @@ -49,6 +49,9 @@ CPU0: cpu@0 { > > > clocks = <&kraitcc KRAIT_CPU_0>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw0_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ -66,6 +69,9 @@ CPU1: cpu@1 { > > > clocks = <&kraitcc KRAIT_CPU_1>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw1_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ -83,6 +89,9 @@ CPU2: cpu@2 { > > > clocks = <&kraitcc KRAIT_CPU_2>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw2_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ 
-100,6 +109,9 @@ CPU3: cpu@3 { > > > clocks = <&kraitcc KRAIT_CPU_3>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw3_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ -132,6 +144,81 @@ cpu_opp_table: opp-table-cpu { > > > opp-384000000 { > > > opp-hz = /bits/ 64 <384000000>; > > > opp-peak-kBps = <384000>; > > > + opp-microvolt-speed0-pvs0 = <1050000 1050000 1150000>, > > > + <950000 950000 1150000>, > > > + <950000 950000 975000>; > > > > I think this won't result in the correct switch order without making > > some changes to the OPP core. In _set_opp() the OPP core does > > > > /* Scaling up? Configure required OPPs before frequency */ > > if (!scaling_down) { > > _set_required_opps(); > > _set_opp_bw(); > > opp_table->config_regulators(); > > } > > > > opp_table->config_clks(); > > > > /* Scaling down? Configure required OPPs after frequency */ > > if (scaling_down) { > > opp_table->config_regulators(); > > _set_opp_bw(); > > _set_required_opps(); > > } > > > > Since the "bandwidth" for the L2 cache is set before the regulators > > there is a short window where the L2 clock is running at a high > > frequency with too low voltage, which could potentially cause > > instability. On downstream this seems to be done in the proper order [1]. > > > > I'm not sure if the order in the OPP core is on purpose. If not, you > > could propose moving the config_regulators() first (for scaling up) > > and last (for scaling down). This would resolve the problem. > > Nice catch, I missed this ordering point. 
> > The alternative that I've already argued for on IRC in #linux-msm a couple of days ago would be to give the L2 cache (here: "interconnect") its own OPP table where it can describe its voltage requirements, independent from the CPU. That way icc_set_bw() would be guaranteed to apply the correct voltage before adjusting the L2 cache clock. It looks like the "l2_level" voltages for vdd_dig and vdd_mem are not speedbin/PVS-specific [2], so this would also significantly reduce the DT size, since you wouldn't need to repeat the same vdd_dig/vdd_mem voltages for all of them.
>
> Yes. In fact our discussion triggered me to do this patchset.
>
> So, another option would be to have something like the following snippet. Do you know if we are allowed to squish additional data into the L2 cache DT node?

I have a similar implementation with the L2 devfreq driver, where I need to put a compatible in the l2-cache node. From what I observed, keeping the l2-cache node in the cpus node makes the extra compatible not work (nothing is probed), but moving the l2-cache node into the soc node and referencing it by phandle makes the compatible work correctly, and that doesn't seem to cause any problems.

IMHO it would be better to have a separate OPP table for L2; it should keep things more organized.
> CPU0: cpu@0 { > vdd-core-supply = <&saw0_vreg>; > interconnects = <&L2 MASTER_KRAIT_L2 &L2 SLAVE_KRAIT_L2>; > operating-points-v2 = <&cpu_opp_table>; > }; > > L2: l2-cache { > compatible = "qcom,apq8064-l2-cache", "cache"; > > clocks = <&kraitcc KRAIT_L2>; > vdd-mem-supply = <&pm8921_l24>; > vdd-dig-supply = <&pm8921_s3>; > operating-points-v2 = <&l2_opp_table>; > > l2_opp_table { > compatible = "operating-points-v2"; > opp-384000000 { > opp-hz = /bits/ 64 <384000000>; > opp-microvolt = <1050000 1050000 1150000>, > <950000 950000 1150000>; > }; > > opp-648000000 { > opp-hz = /bits/ 64 <648000000>; > opp-microvolt = <1050000 1050000 1150000>, > <1050000 1050000 1150000>; > }; > > opp-1134000000 { > opp-hz = /bits/ 64 <1134000000>; > opp-microvolt = <1150000 1150000 1150000>, > <1150000 1150000 1150000>; > }; > }; > }; > > > > > Thanks, > > Stephan > > > > [1]: https://git.codelinaro.org/clo/la/kernel/msm/-/blob/LA.AF.1.2.1-08410-8064.0/arch/arm/mach-msm/acpuclock-krait.c#L529-588 > > [2]: https://git.codelinaro.org/clo/la/kernel/msm/-/blob/LA.AF.1.2.1-08410-8064.0/arch/arm/mach-msm/acpuclock-8064.c#L118-135 > > -- > With best wishes > Dmitry >
On Mon, Jun 12, 2023 at 08:39:19AM +0300, Dmitry Baryshkov wrote: > APQ8064 has 4 speed bins, each of them having from 4 to 6 categorization > kinds. Provide tables necessary to handle voltage scaling on this SoC. > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> > --- > arch/arm/boot/dts/qcom-apq8064.dtsi | 1017 +++++++++++++++++++++++++++ > 1 file changed, 1017 insertions(+) > > diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi > index 4ef13f3d702b..f35853b59544 100644 > --- a/arch/arm/boot/dts/qcom-apq8064.dtsi > +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi > @@ -49,6 +49,9 @@ CPU0: cpu@0 { > clocks = <&kraitcc KRAIT_CPU_0>; > clock-names = "cpu"; > clock-latency = <100000>; > + vdd-mem-supply = <&pm8921_l24>; > + vdd-dig-supply = <&pm8921_s3>; > + vdd-core-supply = <&saw0_vreg>; > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > operating-points-v2 = <&cpu_opp_table>; > #cooling-cells = <2>; > @@ -66,6 +69,9 @@ CPU1: cpu@1 { > clocks = <&kraitcc KRAIT_CPU_1>; > clock-names = "cpu"; > clock-latency = <100000>; > + vdd-mem-supply = <&pm8921_l24>; > + vdd-dig-supply = <&pm8921_s3>; > + vdd-core-supply = <&saw1_vreg>; > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > operating-points-v2 = <&cpu_opp_table>; > #cooling-cells = <2>; > @@ -83,6 +89,9 @@ CPU2: cpu@2 { > clocks = <&kraitcc KRAIT_CPU_2>; > clock-names = "cpu"; > clock-latency = <100000>; > + vdd-mem-supply = <&pm8921_l24>; > + vdd-dig-supply = <&pm8921_s3>; > + vdd-core-supply = <&saw2_vreg>; > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > operating-points-v2 = <&cpu_opp_table>; > #cooling-cells = <2>; > @@ -100,6 +109,9 @@ CPU3: cpu@3 { > clocks = <&kraitcc KRAIT_CPU_3>; > clock-names = "cpu"; > clock-latency = <100000>; > + vdd-mem-supply = <&pm8921_l24>; > + vdd-dig-supply = <&pm8921_s3>; > + vdd-core-supply = <&saw3_vreg>; > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc 
SLAVE_KRAIT_L2>; > operating-points-v2 = <&cpu_opp_table>; > #cooling-cells = <2>; > @@ -132,6 +144,81 @@ cpu_opp_table: opp-table-cpu { > opp-384000000 { > opp-hz = /bits/ 64 <384000000>; > opp-peak-kBps = <384000>; > + opp-microvolt-speed0-pvs0 = <1050000 1050000 1150000>, > + <950000 950000 1150000>, > + <950000 950000 975000>;

I think this won't result in the correct switch order without making some changes to the OPP core. In _set_opp() the OPP core does

	/* Scaling up? Configure required OPPs before frequency */
	if (!scaling_down) {
		_set_required_opps();
		_set_opp_bw();
		opp_table->config_regulators();
	}

	opp_table->config_clks();

	/* Scaling down? Configure required OPPs after frequency */
	if (scaling_down) {
		opp_table->config_regulators();
		_set_opp_bw();
		_set_required_opps();
	}

Since the "bandwidth" for the L2 cache is set before the regulators, there is a short window where the L2 clock is running at a high frequency with too low a voltage, which could potentially cause instability. On downstream this seems to be done in the proper order [1].

I'm not sure if the order in the OPP core is on purpose. If not, you could propose moving config_regulators() first (for scaling up) and last (for scaling down). This would resolve the problem.

The alternative that I've already argued for on IRC in #linux-msm a couple of days ago would be to give the L2 cache (here: "interconnect") its own OPP table where it can describe its voltage requirements, independent from the CPU. That way icc_set_bw() would be guaranteed to apply the correct voltage before adjusting the L2 cache clock. It looks like the "l2_level" voltages for vdd_dig and vdd_mem are not speedbin/PVS-specific [2], so this would also significantly reduce the DT size, since you wouldn't need to repeat the same vdd_dig/vdd_mem voltages for all of them.

Thanks,
Stephan

[1]: https://git.codelinaro.org/clo/la/kernel/msm/-/blob/LA.AF.1.2.1-08410-8064.0/arch/arm/mach-msm/acpuclock-krait.c#L529-588
[2]: https://git.codelinaro.org/clo/la/kernel/msm/-/blob/LA.AF.1.2.1-08410-8064.0/arch/arm/mach-msm/acpuclock-8064.c#L118-135
On 12/06/2023 12:01, Stephan Gerhold wrote: > On Mon, Jun 12, 2023 at 08:39:19AM +0300, Dmitry Baryshkov wrote: >> APQ8064 has 4 speed bins, each of them having from 4 to 6 categorization >> kinds. Provide tables necessary to handle voltage scaling on this SoC. >> >> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> >> --- >> arch/arm/boot/dts/qcom-apq8064.dtsi | 1017 +++++++++++++++++++++++++++ >> 1 file changed, 1017 insertions(+) >> >> diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi >> index 4ef13f3d702b..f35853b59544 100644 >> --- a/arch/arm/boot/dts/qcom-apq8064.dtsi >> +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi >> @@ -49,6 +49,9 @@ CPU0: cpu@0 { >> clocks = <&kraitcc KRAIT_CPU_0>; >> clock-names = "cpu"; >> clock-latency = <100000>; >> + vdd-mem-supply = <&pm8921_l24>; >> + vdd-dig-supply = <&pm8921_s3>; >> + vdd-core-supply = <&saw0_vreg>; >> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >> operating-points-v2 = <&cpu_opp_table>; >> #cooling-cells = <2>; >> @@ -66,6 +69,9 @@ CPU1: cpu@1 { >> clocks = <&kraitcc KRAIT_CPU_1>; >> clock-names = "cpu"; >> clock-latency = <100000>; >> + vdd-mem-supply = <&pm8921_l24>; >> + vdd-dig-supply = <&pm8921_s3>; >> + vdd-core-supply = <&saw1_vreg>; >> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >> operating-points-v2 = <&cpu_opp_table>; >> #cooling-cells = <2>; >> @@ -83,6 +89,9 @@ CPU2: cpu@2 { >> clocks = <&kraitcc KRAIT_CPU_2>; >> clock-names = "cpu"; >> clock-latency = <100000>; >> + vdd-mem-supply = <&pm8921_l24>; >> + vdd-dig-supply = <&pm8921_s3>; >> + vdd-core-supply = <&saw2_vreg>; >> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >> operating-points-v2 = <&cpu_opp_table>; >> #cooling-cells = <2>; >> @@ -100,6 +109,9 @@ CPU3: cpu@3 { >> clocks = <&kraitcc KRAIT_CPU_3>; >> clock-names = "cpu"; >> clock-latency = <100000>; >> + vdd-mem-supply = <&pm8921_l24>; >> + vdd-dig-supply = <&pm8921_s3>; >> + 
vdd-core-supply = <&saw3_vreg>; >> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >> operating-points-v2 = <&cpu_opp_table>; >> #cooling-cells = <2>; >> @@ -132,6 +144,81 @@ cpu_opp_table: opp-table-cpu { >> opp-384000000 { >> opp-hz = /bits/ 64 <384000000>; >> opp-peak-kBps = <384000>; >> + opp-microvolt-speed0-pvs0 = <1050000 1050000 1150000>, >> + <950000 950000 1150000>, >> + <950000 950000 975000>; > > I think this won't result in the correct switch order without making > some changes to the OPP core. In _set_opp() the OPP core does > > /* Scaling up? Configure required OPPs before frequency */ > if (!scaling_down) { > _set_required_opps(); > _set_opp_bw(); > opp_table->config_regulators(); > } > > opp_table->config_clks(); > > /* Scaling down? Configure required OPPs after frequency */ > if (scaling_down) { > opp_table->config_regulators(); > _set_opp_bw(); > _set_required_opps(); > } > > Since the "bandwidth" for the L2 cache is set before the regulators > there is a short window where the L2 clock is running at a high > frequency with too low voltage, which could potentially cause > instability. On downstream this seems to be done in the proper order [1]. > > I'm not sure if the order in the OPP core is on purpose. If not, you > could propose moving the config_regulators() first (for scaling up) > and last (for scaling down). This would resolve the problem. Nice catch, I missed this ordering point. > > The alternative that I've already argued for on IRC in #linux-msm a > couple of days ago would be to give the L2 cache (here: "interconnect") > an own OPP table where it can describe its voltage requirements, > independent from the CPU. That way the icc_set_bw() would be guaranteed > to apply the correct voltage before adjusting the L2 cache clock. 
> It looks like the "l2_level" voltages for vdd_dig and vdd_mem are not
> speedbin/PVS-specific [2] so this would also significantly reduce the DT
> size, since you wouldn't need to repeat the same vdd_dig/vdd_mem
> voltages for all of them.

Yes. In fact our discussion triggered me to do this patchset.

So, another option would be to have something like the following snippet. Do you know if we are allowed to squish additional data into the L2 cache DT node?

	CPU0: cpu@0 {
		vdd-core-supply = <&saw0_vreg>;
		interconnects = <&L2 MASTER_KRAIT_L2 &L2 SLAVE_KRAIT_L2>;
		operating-points-v2 = <&cpu_opp_table>;
	};

	L2: l2-cache {
		compatible = "qcom,apq8064-l2-cache", "cache";

		clocks = <&kraitcc KRAIT_L2>;
		vdd-mem-supply = <&pm8921_l24>;
		vdd-dig-supply = <&pm8921_s3>;
		operating-points-v2 = <&l2_opp_table>;

		l2_opp_table {
			compatible = "operating-points-v2";

			opp-384000000 {
				opp-hz = /bits/ 64 <384000000>;
				opp-microvolt = <1050000 1050000 1150000>,
						<950000 950000 1150000>;
			};

			opp-648000000 {
				opp-hz = /bits/ 64 <648000000>;
				opp-microvolt = <1050000 1050000 1150000>,
						<1050000 1050000 1150000>;
			};

			opp-1134000000 {
				opp-hz = /bits/ 64 <1134000000>;
				opp-microvolt = <1150000 1150000 1150000>,
						<1150000 1150000 1150000>;
			};
		};
	};

> Thanks,
> Stephan
>
> [1]: https://git.codelinaro.org/clo/la/kernel/msm/-/blob/LA.AF.1.2.1-08410-8064.0/arch/arm/mach-msm/acpuclock-krait.c#L529-588
> [2]: https://git.codelinaro.org/clo/la/kernel/msm/-/blob/LA.AF.1.2.1-08410-8064.0/arch/arm/mach-msm/acpuclock-8064.c#L118-135
On Mon, Jun 12, 2023 at 04:33:09PM +0300, Dmitry Baryshkov wrote: > On 12/06/2023 12:01, Stephan Gerhold wrote: > > On Mon, Jun 12, 2023 at 08:39:19AM +0300, Dmitry Baryshkov wrote: > > > APQ8064 has 4 speed bins, each of them having from 4 to 6 categorization > > > kinds. Provide tables necessary to handle voltage scaling on this SoC. > > > > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> > > > --- > > > arch/arm/boot/dts/qcom-apq8064.dtsi | 1017 +++++++++++++++++++++++++++ > > > 1 file changed, 1017 insertions(+) > > > > > > diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi > > > index 4ef13f3d702b..f35853b59544 100644 > > > --- a/arch/arm/boot/dts/qcom-apq8064.dtsi > > > +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi > > > @@ -49,6 +49,9 @@ CPU0: cpu@0 { > > > clocks = <&kraitcc KRAIT_CPU_0>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw0_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ -66,6 +69,9 @@ CPU1: cpu@1 { > > > clocks = <&kraitcc KRAIT_CPU_1>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw1_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ -83,6 +89,9 @@ CPU2: cpu@2 { > > > clocks = <&kraitcc KRAIT_CPU_2>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw2_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ 
-100,6 +109,9 @@ CPU3: cpu@3 { > > > clocks = <&kraitcc KRAIT_CPU_3>; > > > clock-names = "cpu"; > > > clock-latency = <100000>; > > > + vdd-mem-supply = <&pm8921_l24>; > > > + vdd-dig-supply = <&pm8921_s3>; > > > + vdd-core-supply = <&saw3_vreg>; > > > interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; > > > operating-points-v2 = <&cpu_opp_table>; > > > #cooling-cells = <2>; > > > @@ -132,6 +144,81 @@ cpu_opp_table: opp-table-cpu { > > > opp-384000000 { > > > opp-hz = /bits/ 64 <384000000>; > > > opp-peak-kBps = <384000>; > > > + opp-microvolt-speed0-pvs0 = <1050000 1050000 1150000>, > > > + <950000 950000 1150000>, > > > + <950000 950000 975000>; > > > > I think this won't result in the correct switch order without making > > some changes to the OPP core. In _set_opp() the OPP core does > > > > /* Scaling up? Configure required OPPs before frequency */ > > if (!scaling_down) { > > _set_required_opps(); > > _set_opp_bw(); > > opp_table->config_regulators(); > > } > > > > opp_table->config_clks(); > > > > /* Scaling down? Configure required OPPs after frequency */ > > if (scaling_down) { > > opp_table->config_regulators(); > > _set_opp_bw(); > > _set_required_opps(); > > } > > > > Since the "bandwidth" for the L2 cache is set before the regulators > > there is a short window where the L2 clock is running at a high > > frequency with too low voltage, which could potentially cause > > instability. On downstream this seems to be done in the proper order [1]. > > > > I'm not sure if the order in the OPP core is on purpose. If not, you > > could propose moving the config_regulators() first (for scaling up) > > and last (for scaling down). This would resolve the problem. > > Nice catch, I missed this ordering point. 
> > The alternative that I've already argued for on IRC in #linux-msm a couple of days ago would be to give the L2 cache (here: "interconnect") its own OPP table where it can describe its voltage requirements, independent from the CPU. That way icc_set_bw() would be guaranteed to apply the correct voltage before adjusting the L2 cache clock. It looks like the "l2_level" voltages for vdd_dig and vdd_mem are not speedbin/PVS-specific [2], so this would also significantly reduce the DT size, since you wouldn't need to repeat the same vdd_dig/vdd_mem voltages for all of them.
>
> Yes. In fact our discussion triggered me to do this patchset.
>
> So, another option would be to have something like the following snippet. Do you know if we are allowed to squish additional data into the L2 cache DT node?

I suspect no one has tried this before, so only the DT maintainers could answer this. I would say that it just follows the existing design of clocks/-supply/OPPs on the CPU nodes. vdd-mem-supply isn't a property of the CPU; it's a property of the L2 cache, so it actually fits better there.

I think the more controversial questions might be:

- Is an L2 cache really an "interconnect"? I suppose one could argue it connects multiple CPU cores to a cluster (similar to how a CCI connects multiple clusters to a system).

- What would bind to the l2-cache node? A separate driver? Does that work if it sits below the /cpus node?

Thanks,
Stephan
On 11/06/2023 19:27, Christian Marangi wrote:
> On Mon, Jun 12, 2023 at 08:39:04AM +0300, Dmitry Baryshkov wrote:
> > Implement CPUFreq support for one of the oldest supported Qualcomm platforms, APQ8064. Each core has independent power and frequency control. Additionally the L2 cache is scaled to follow the CPU frequencies (failure to do so results in strange semi-random crashes).
>
> Hi, can we talk, maybe in private, about this interconnect-cpu thing?

Hi, sure. Feel free to ping me on IRC (lumag) or via email. Or we can just continue our discussion here, as it might be interesting to other people too.

> I see you follow the original implementation of the msm_bus, where in practice the correct clock and voltage were set with the use of the kbps value (this was also used to set the fabric clock from nominal to fast).
>
> On ipq806x, and I assume other SoCs, there isn't always a 1:1 map of CPU freq and L2 freq. For example, on ipq8064 we have a max CPU freq of 1.4GHz and an L2 freq of 1.2GHz; on ipq8065 we have a CPU freq of 1.7GHz and an L2 freq of 1.4GHz.

This is also the case for apq8064. The vendor kernel defines 15 frequencies for the L2 cache clock, but then for some reason all PVS tables use just 3 entries from these 15.

> (And even that is curious, since I used the debug regs and the CXO crystal to measure the clock in hardware (yes, I ported the very ancient clk-debug to a modern kernel; it works, and I discovered all sorts of things). The L2 (I assume due to a limitation of the HFPLL) can actually never reach that frequency: 1.4GHz in reality results in something like 1.2GHz, and from what I noticed a stable clock is only there with a frequency of at most 1GHz.)

I would like to point you to https://github.com/andersson/debugcc/, which is a userspace reimplementation of clk-debug. We'd appreciate your patches there.
> So my idea was to introduce a simple devfreq driver and use the PASSIVE governor, where the possibility was added to link to a CPU frequency and select the L2 frequency (and voltage) with interpolation.

I stumbled upon this idea when I was working on the msm8996 and its CBF clock (CBF = interconnect between two core clusters). It should be possible to use DEVFREQ in simple cases (e.g. L2 clock >= max(CPU clock), if possible). However, real configurations are slightly harder. E.g. for the purpose of this patchset, the relationship for apq8064 is the following (in MHz):

	CPU	L2
	384	384
	486	648
	594	648
	702	648
	...	...
	1026	648
	1134	1134
	...	...
	1512	1134
	...	...

It should be noted that msm8960 also used just three values for the L2 cache frequencies. From what I can see, only msm8x60 made the L2 freq tightly follow the CPU frequency.

> From some old comments in ancient QSDK code it was pointed out that due to a hw limitation the secondary CPU can't stay at a high clock if L2 was at the idle clock. (No idea if this is specific to IPQ806x.) So this might be a cause of your crash? (I also have random crashes with L2 scaling and we are planning to just force the L2 at max frequency.)

It might be related. It was more or less the same story with msm8996, which was either 'maxcpus=2' or scaling the CBF clock.

> But sorry for all of this (maybe) useless info. I checked the other patch and I didn't understand how the different L2 frequencies are declared, and even the voltage. Is this something that will come later? I'm very interested in this implementation.

The L2 frequency (<&kraitcc 4>) is converted into a bandwidth vote, which then goes into the OPP tables. But please also see the discussion started at patch 15.

>> Core voltage is controlled through the SAW2 devices, one for each core.
>> The L2 has two regulators, vdd-mem and vdd-dig.
>>
>> Dependency: [1] for interconnect-clk implementation
>>
>> https://lore.kernel.org/linux-arm-msm/20230512001334.2983048-3-dmitry.baryshkov@linaro.org/
On 12/06/2023 16:59, Stephan Gerhold wrote: > On Mon, Jun 12, 2023 at 04:33:09PM +0300, Dmitry Baryshkov wrote: >> On 12/06/2023 12:01, Stephan Gerhold wrote: >>> On Mon, Jun 12, 2023 at 08:39:19AM +0300, Dmitry Baryshkov wrote: >>>> APQ8064 has 4 speed bins, each of them having from 4 to 6 categorization >>>> kinds. Provide tables necessary to handle voltage scaling on this SoC. >>>> >>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> >>>> --- >>>> arch/arm/boot/dts/qcom-apq8064.dtsi | 1017 +++++++++++++++++++++++++++ >>>> 1 file changed, 1017 insertions(+) >>>> >>>> diff --git a/arch/arm/boot/dts/qcom-apq8064.dtsi b/arch/arm/boot/dts/qcom-apq8064.dtsi >>>> index 4ef13f3d702b..f35853b59544 100644 >>>> --- a/arch/arm/boot/dts/qcom-apq8064.dtsi >>>> +++ b/arch/arm/boot/dts/qcom-apq8064.dtsi >>>> @@ -49,6 +49,9 @@ CPU0: cpu@0 { >>>> clocks = <&kraitcc KRAIT_CPU_0>; >>>> clock-names = "cpu"; >>>> clock-latency = <100000>; >>>> + vdd-mem-supply = <&pm8921_l24>; >>>> + vdd-dig-supply = <&pm8921_s3>; >>>> + vdd-core-supply = <&saw0_vreg>; >>>> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >>>> operating-points-v2 = <&cpu_opp_table>; >>>> #cooling-cells = <2>; >>>> @@ -66,6 +69,9 @@ CPU1: cpu@1 { >>>> clocks = <&kraitcc KRAIT_CPU_1>; >>>> clock-names = "cpu"; >>>> clock-latency = <100000>; >>>> + vdd-mem-supply = <&pm8921_l24>; >>>> + vdd-dig-supply = <&pm8921_s3>; >>>> + vdd-core-supply = <&saw1_vreg>; >>>> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >>>> operating-points-v2 = <&cpu_opp_table>; >>>> #cooling-cells = <2>; >>>> @@ -83,6 +89,9 @@ CPU2: cpu@2 { >>>> clocks = <&kraitcc KRAIT_CPU_2>; >>>> clock-names = "cpu"; >>>> clock-latency = <100000>; >>>> + vdd-mem-supply = <&pm8921_l24>; >>>> + vdd-dig-supply = <&pm8921_s3>; >>>> + vdd-core-supply = <&saw2_vreg>; >>>> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >>>> operating-points-v2 = <&cpu_opp_table>; >>>> #cooling-cells = <2>; >>>> 
@@ -100,6 +109,9 @@ CPU3: cpu@3 { >>>> clocks = <&kraitcc KRAIT_CPU_3>; >>>> clock-names = "cpu"; >>>> clock-latency = <100000>; >>>> + vdd-mem-supply = <&pm8921_l24>; >>>> + vdd-dig-supply = <&pm8921_s3>; >>>> + vdd-core-supply = <&saw3_vreg>; >>>> interconnects = <&kraitcc MASTER_KRAIT_L2 &kraitcc SLAVE_KRAIT_L2>; >>>> operating-points-v2 = <&cpu_opp_table>; >>>> #cooling-cells = <2>; >>>> @@ -132,6 +144,81 @@ cpu_opp_table: opp-table-cpu { >>>> opp-384000000 { >>>> opp-hz = /bits/ 64 <384000000>; >>>> opp-peak-kBps = <384000>; >>>> + opp-microvolt-speed0-pvs0 = <1050000 1050000 1150000>, >>>> + <950000 950000 1150000>, >>>> + <950000 950000 975000>; >>> [skipped the OPP voltage vs bw ordering] >> >>> >>> The alternative that I've already argued for on IRC in #linux-msm a >>> couple of days ago would be to give the L2 cache (here: "interconnect") >>> an own OPP table where it can describe its voltage requirements, >>> independent from the CPU. That way the icc_set_bw() would be guaranteed >>> to apply the correct voltage before adjusting the L2 cache clock. It >>> looks like the "l2_level" voltages for vdd_dig and vdd_mem are not >>> speedbin/PVS-specific [2] so this would also significantly reduce the DT >>> size, since you wouldn't need to repeat the same vdd_dig/vdd_mem >>> voltages for all of them. >> >> Yes. I fact our discussion triggered me to do this patchset. >> >> So, another option would be to have something like the following snippet. Do >> you know if we are allowed to squish additional data into the L2 cache DT >> node? >> > > I suspect no one has tried this before, so only the DT maintainers could > answer this. I would say that it just follows the existing design of > clocks/-supply/OPPs on the CPU nodes. vdd-mem-supply isn't a property of > the CPU, it's a property of the L2 cache so it actually fits better there. > > I think the more controversial questions might be: > > - Is a L2 cache really an "interconnect"? 
> I suppose one could argue it connects multiple CPU cores to a cluster
> (similar to how a CCI connects multiple clusters to a system).

Yes. This was my reasoning for the CBF clock as well as for this L2 clock. The separate L2 cache device is also an interconnect from my POV: it connects all CPU cores, and we have to vote on its frequency.

> - What would bind to the l2-cache node? A separate driver? Does that
>   work if it sits below the /cpus node?

In the worst case we'd have to populate that manually, e.g. from qcom-cpufreq-nvmem.c.
On Mon, Jun 12, 2023 at 05:20:02PM +0300, Dmitry Baryshkov wrote:
> On 11/06/2023 19:27, Christian Marangi wrote:
> > On Mon, Jun 12, 2023 at 08:39:04AM +0300, Dmitry Baryshkov wrote:
> > > Implement CPUFreq support for one of the oldest supported Qualcomm platforms, APQ8064. Each core has independent power and frequency control. Additionally the L2 cache is scaled to follow the CPU frequencies (failure to do so results in strange semi-random crashes).
> >
> > Hi, can we talk, maybe in private, about this interconnect-cpu thing?
>
> Hi, sure. Feel free to ping me on IRC (lumag) or via email. Or we can just continue our discussion here, as it might be interesting to other people too.

I don't know if this is the right place to discuss my concerns and problems with L2 scaling on ipq8064...

> > I see you follow the original implementation of the msm_bus, where in practice the correct clock and voltage were set with the use of the kbps value (this was also used to set the fabric clock from nominal to fast).
> >
> > On ipq806x, and I assume other SoCs, there isn't always a 1:1 map of CPU freq and L2 freq. For example, on ipq8064 we have a max CPU freq of 1.4GHz and an L2 freq of 1.2GHz; on ipq8065 we have a CPU freq of 1.7GHz and an L2 freq of 1.4GHz.
>
> This is also the case for apq8064. The vendor kernel defines 15 frequencies for the L2 cache clock, but then for some reason all PVS tables use just 3 entries from these 15.

Eh, who knows why they did this... Probably the HFPLL was limited, or they noticed that no temp/power benefits were present with that many scaling steps?
> > (And even that is curious: I used the debug regs and the CXO crystal
> > to measure the clock in hardware (yes, I ported the very ancient
> > clk-debug to a modern kernel; it works, and I discovered all sorts of
> > things). The L2 (I assume due to a limitation of the HFPLL) actually
> > can never reach that frequency: 1.4GHz in reality results in
> > something like 1.2GHz, and from what I noticed a stable clock is
> > there only up to a frequency of about 1GHz.)
>
> I would like to point you to https://github.com/andersson/debugcc/,
> which is a userspace reimplementation of clk-debug. We'd appreciate
> your patches there.

Hi, I spent some good time on the implementation but managed to make it
work and proposed a PR! I assume the thing can be reused for apq8064 if
someone ever wants to have fun with that.

> > So my idea was to introduce a simple devfreq driver and use the
> > PASSIVE governor, where the possibility was added to link to a CPU
> > frequency and select the L2 frequency (and voltage) by interpolation.
>
> I stumbled upon this idea when I was working on the msm8996 and its
> CBF clock (CBF = the interconnect between the two core clusters). It
> should be possible to use DEVFREQ in simple cases (e.g. L2 clock >=
> max(CPU clock), if possible), but real configurations are slightly
> harder. E.g. for the purposes of this patchset, the relationship for
> apq8064 is the following (in MHz):
>
> CPU     L2
> 384     384
> 486     648
> 594     648
> 702     648
> ....    ...
> 1026    648
> 1134    1134
> ....    ....
> 1512    1134
> ....    ....
>
> It should be noted that msm8960 also used just three values for the L2
> cache frequencies. From what I can see, only msm8x60 made the L2 freq
> tightly follow the CPU frequency.

Happy to test and find a common path... With the merge of the CPU OPP
and nvmem work, I was just about to send the L2 devfreq driver... and
also the fabric devfreq driver. But I wonder if I can use this
interconnect thing for the two tasks.
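The CPU→L2 table quoted above collapses to three L2 steps, so the "interpolation" the passive-governor idea needs is really just a floor lookup over thresholds. A minimal sketch of that selection logic (illustrative Python, not the in-kernel implementation; the threshold values come from the apq8064 table above):

```python
# Sorted (cpu_threshold_mhz, l2_freq_mhz) pairs derived from the table:
# the L2 runs at the highest entry whose threshold is <= the CPU freq.
L2_MAP = [(384, 384), (486, 648), (1134, 1134)]

def l2_freq_for_cpu(cpu_mhz):
    """Pick the L2 frequency for a given CPU frequency (floor lookup)."""
    l2 = L2_MAP[0][1]
    for threshold, freq in L2_MAP:
        if cpu_mhz >= threshold:
            l2 = freq
    return l2

def l2_target(cpu_freqs_mhz):
    """The L2 must satisfy the fastest core, so map from the max."""
    return l2_freq_for_cpu(max(cpu_freqs_mhz))
```

For example, any CPU frequency from 486 through 1026 MHz lands on the 648 MHz L2 step, matching the table, and `l2_target()` shows why a per-cluster maximum is taken before the lookup.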
> > From some old comments in ancient QSDK code it was pointed out that,
> > due to a hw limitation, the secondary CPU can't stay at a high clock
> > if the L2 is at the idle clock (no idea if this is specific to
> > ipq806x). So this might be a cause of your crash? (I also have random
> > crashes with L2 scaling, and we are planning to just force the L2 to
> > the max frequency.)
>
> It might be related. It was more or less the same story with msm8996,
> which was either 'maxcpus=2' or scaling the CBF clock.

Might be a Krait defect... and this is pretty bad...

> > But sorry for all of this (maybe) useless info. I checked the other
> > patch and I didn't understand how the different L2 frequencies and
> > even the voltages are declared. Is this something that will come
> > later? I'm very interested in this implementation.
>
> The L2 frequency (<&kraitcc 4>) is converted into a bandwidth vote,
> which then goes into the OPP tables. But please also see the
> discussion started at patch 15.

I didn't notice you were defining multiple supplies, scaling the
voltage under the hood with that trick. It's not a bad idea, but as
pointed out it might be problematic: it seems Krait is very sensitive
to L2 frequency and voltage, so we should simulate the original
implementation as closely as possible...

> > > Core voltage is controlled through the SAW2 devices, one for each
> > > core. The L2 has two regulators, vdd-mem and vdd-dig.
> > >
> > > Dependency: [1] for the interconnect-clk implementation
> > >
> > > https://lore.kernel.org/linux-arm-msm/20230512001334.2983048-3-dmitry.baryshkov@linaro.org/

> --
> With best wishes
> Dmitry
On 13/06/2023 19:19, Christian Marangi wrote:
> On Mon, Jun 12, 2023 at 05:20:02PM +0300, Dmitry Baryshkov wrote:
>> On 11/06/2023 19:27, Christian Marangi wrote:
>>> On Mon, Jun 12, 2023 at 08:39:04AM +0300, Dmitry Baryshkov wrote:
>>>> Implement CPUFreq support for one of the oldest supported Qualcomm
>>>> platforms, APQ8064. Each core has independent power and frequency
>>>> control. Additionally the L2 cache is scaled to follow the CPU
>>>> frequencies (failure to do so results in strange semi-random
>>>> crashes).
>>>
>>> Hi, can we talk, maybe in private, about this interconnect-cpu thing?
>>
>> Hi, sure. Feel free to ping me on IRC (lumag) or via email. Or we can
>> just continue our discussion here, as it might be interesting to other
>> people too.
>
> Don't know if here is the right place to discuss my concerns and
> problems with L2 scaling on ipq8064...

I think I will try segregating the L2 data to the l2-cache device node
(I saw your comment that it is not populated by default; I'll have to
fix this).

>>> I see you follow the original implementation of the msm_bus, where in
>>> practice the correct clock and voltage were set through the kbps
>>> values (this was also used to switch the fabric clock from nominal to
>>> fast).
>>>
>>> On ipq806x, and I assume on other SoCs, there isn't always a 1:1 map
>>> of CPU freq to L2 freq. For example, on ipq8064 we have a max CPU
>>> freq of 1.4GHz and an L2 freq of 1.2GHz; on ipq8065 we have a CPU max
>>> of 1.7GHz and an L2 max of 1.4GHz.
>>
>> This is also the case for apq8064. The vendor kernel defines 15
>> frequencies for the L2 cache clock, but then for some reason all PVS
>> tables use just 3 entries of these 15.
>
> Eh, who knows why they did this... Probably the HFPLL was limited, or
> they noticed no temperature/power benefit from scaling with that many
> steps?
>>> (And even that is curious: I used the debug regs and the CXO crystal
>>> to measure the clock in hardware (yes, I ported the very ancient
>>> clk-debug to a modern kernel; it works, and I discovered all sorts of
>>> things). The L2 (I assume due to a limitation of the HFPLL) actually
>>> can never reach that frequency: 1.4GHz in reality results in
>>> something like 1.2GHz, and from what I noticed a stable clock is
>>> there only up to a frequency of about 1GHz.)
>>
>> I would like to point you to https://github.com/andersson/debugcc/,
>> which is a userspace reimplementation of clk-debug. We'd appreciate
>> your patches there.
>
> Hi, I spent some good time on the implementation but managed to make it
> work and proposed a PR! I assume the thing can be reused for apq8064 if
> someone ever wants to have fun with that.

Thanks a lot! Generally I think that debugcc is a very valuable
debugging tool, and it should be getting more attention from the
community. With chips newer than the 8064 it is easy enough to add new
platform data.

>>> So my idea was to introduce a simple devfreq driver and use the
>>> PASSIVE governor, where the possibility was added to link to a CPU
>>> frequency and select the L2 frequency (and voltage) by interpolation.
>>
>> I stumbled upon this idea when I was working on the msm8996 and its
>> CBF clock (CBF = the interconnect between the two core clusters). It
>> should be possible to use DEVFREQ in simple cases (e.g. L2 clock >=
>> max(CPU clock), if possible), but real configurations are slightly
>> harder. E.g. for the purposes of this patchset, the relationship for
>> apq8064 is the following (in MHz):
>>
>> CPU     L2
>> 384     384
>> 486     648
>> 594     648
>> 702     648
>> ....    ...
>> 1026    648
>> 1134    1134
>> ....    ....
>> 1512    1134
>> ....    ....
>>
>> It should be noted that msm8960 also used just three values for the L2
>> cache frequencies. From what I can see, only msm8x60 made the L2 freq
>> tightly follow the CPU frequency.
>
> Happy to test and find a common path... With the merge of the CPU OPP
> and nvmem work, I was just about to send the L2 devfreq driver... and
> also the fabric devfreq driver. But I wonder if I can use this
> interconnect thing for the two tasks.
>
>>> From some old comments in ancient QSDK code it was pointed out that,
>>> due to a hw limitation, the secondary CPU can't stay at a high clock
>>> if the L2 is at the idle clock (no idea if this is specific to
>>> ipq806x). So this might be a cause of your crash? (I also have random
>>> crashes with L2 scaling, and we are planning to just force the L2 to
>>> the max frequency.)
>>
>> It might be related. It was more or less the same story with msm8996,
>> which was either 'maxcpus=2' or scaling the CBF clock.
>
> Might be a Krait defect... and this is pretty bad...

I don't know if it is a defect or just a misfeature. Anyway, we know
that the L2 should be clocked high enough, and we can cope with it.

>>> But sorry for all of this (maybe) useless info. I checked the other
>>> patch and I didn't understand how the different L2 frequencies and
>>> even the voltages are declared. Is this something that will come
>>> later? I'm very interested in this implementation.
>>
>> The L2 frequency (<&kraitcc 4>) is converted into a bandwidth vote,
>> which then goes into the OPP tables. But please also see the
>> discussion started at patch 15.
>
> I didn't notice you were defining multiple supplies, scaling the
> voltage under the hood with that trick. It's not a bad idea, but as
> pointed out it might be problematic: it seems Krait is very sensitive
> to L2 frequency and voltage, so we should simulate the original
> implementation as closely as possible...

It was my original intention, as the vendor kernel does it in the
vdd-mem, vdd-dig, vdd-core, L2 freq, core freq order (this describes
the frequency-increase case; when decreasing the frequency, the order
is inverted). I did not expect that the voltages are scaled after the
BW votes are cast.
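The vendor-kernel ordering described above (regulators first, then the L2 clock, then the core clock, with the sequence inverted on the way down) can be sketched as a simple reversible step list. This is an illustration of the ordering constraint only, with hypothetical step names, not the actual driver code:

```python
# Vendor-kernel sequence for a frequency increase: raise the rails
# (vdd-mem, vdd-dig, vdd-core) before any clock, then the L2 clock,
# then the core clock. For a decrease, run the exact inverse, so the
# rails always satisfy whatever frequencies are currently programmed.
RAISE_ORDER = ["vdd-mem", "vdd-dig", "vdd-core", "l2-freq", "core-freq"]

def scaling_sequence(old_mhz, new_mhz):
    """Return the step order for a CPU frequency transition (sketch)."""
    if new_mhz >= old_mhz:
        return list(RAISE_ORDER)
    return list(reversed(RAISE_ORDER))
```

The point of the inversion is that the core clock is always changed while the rails are at the higher of the two operating points, which is what an OPP framework driven by bandwidth votes has to preserve to match the original implementation.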
>
>>>> Core voltage is controlled through the SAW2 devices, one for each
>>>> core. The L2 has two regulators, vdd-mem and vdd-dig.
>>>>
>>>> Dependency: [1] for the interconnect-clk implementation
>>>>
>>>> https://lore.kernel.org/linux-arm-msm/20230512001334.2983048-3-dmitry.baryshkov@linaro.org/
>>
>> --
>> With best wishes
>> Dmitry