diff mbox series

ARM: tegra: fix ulpi regression on tegra20

Message ID 20180219151252.29289-1-marcel@ziswiler.com
State Superseded
Headers show
Series ARM: tegra: fix ulpi regression on tegra20 | expand

Commit Message

Marcel Ziswiler Feb. 19, 2018, 3:12 p.m. UTC
From: Marcel Ziswiler <marcel.ziswiler@toradex.com>

Since commit f8f8f1d04494 ("clk: Don't touch hardware when reparenting
during registration") ULPI has been broken on Tegra20 leading to the
following error message during boot:

[    1.974698] ulpi_phy_power_on: ulpi write failed
[    1.979384] tegra-ehci c5004000.usb: Failed to power on the phy
[    1.985434] tegra-ehci: probe of c5004000.usb failed with error -110

Debugging through the changes and finally also consulting the TRM
revealed that rather than the CDEV2 clock off OSC requiring such pin
muxing actually the PLL_P_OUT4 clock is in use. It looks like so far it
just worked by chance of that one having been enabled which Stephen's
commit now changed when reparenting sclk away from pll_p_out4 leaving
that one disabled. Fix this by properly assigning the PLL_P_OUT4 clock
as the ULPI PHY clock.

Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>

---

 arch/arm/boot/dts/tegra20.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Marc Dietrich April 20, 2018, 8:52 a.m. UTC | #1
Hi Marcel,

Am Montag, 19. Februar 2018, 16:12:52 CEST schrieb Marcel Ziswiler:
> From: Marcel Ziswiler <marcel.ziswiler@toradex.com>
> 
> Since commit f8f8f1d04494 ("clk: Don't touch hardware when reparenting
> during registration") ULPI has been broken on Tegra20 leading to the
> following error message during boot:
> 
> [    1.974698] ulpi_phy_power_on: ulpi write failed
> [    1.979384] tegra-ehci c5004000.usb: Failed to power on the phy
> [    1.985434] tegra-ehci: probe of c5004000.usb failed with error -110
> 
> Debugging through the changes and finally also consulting the TRM
> revealed that rather than the CDEV2 clock off OSC requiring such pin
> muxing actually the PLL_P_OUT4 clock is in use. It looks like so far it
> just worked by chance of that one having been enabled which Stephen's
> commit now changed when reparenting sclk away from pll_p_out4 leaving
> that one disabled. Fix this by properly assigning the PLL_P_OUT4 clock
> as the ULPI PHY clock.

I booted 4.17-rc1 (which includes this fix) on an AC100 (T20 paz00 board) and 
the error above is still there. Surprisingly the error vanishes when I revert 
your patch. So this patch actually *causes* the problem above on my board. 
Could it be, that we need all four clocks? Dimitry mentioned on IRC that it 
could also be a problem in the clock init table. I don't have the technical 
background myself to fix it, but I still wonder what could be so different 
between TrimSlice and AC100.

Marc


> Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
> 
> ---
> 
>  arch/arm/boot/dts/tegra20.dtsi | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
> index 864a95872b8d..e05b6bb2599f 100644
> --- a/arch/arm/boot/dts/tegra20.dtsi
> +++ b/arch/arm/boot/dts/tegra20.dtsi
> @@ -741,7 +741,7 @@
>  		phy_type = "ulpi";
>  		clocks = <&tegra_car TEGRA20_CLK_USB2>,
>  			 <&tegra_car TEGRA20_CLK_PLL_U>,
> -			 <&tegra_car TEGRA20_CLK_CDEV2>;
> +			 <&tegra_car TEGRA20_CLK_PLL_P_OUT4>;
>  		clock-names = "reg", "pll_u", "ulpi-link";
>  		resets = <&tegra_car 58>, <&tegra_car 22>;
>  		reset-names = "usb", "utmi-pads";




--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dmitry Osipenko April 20, 2018, 10:50 a.m. UTC | #2
On 20.04.2018 11:52, Marc Dietrich wrote:
> Hi Marcel,
> 
> Am Montag, 19. Februar 2018, 16:12:52 CEST schrieb Marcel Ziswiler:
>> From: Marcel Ziswiler <marcel.ziswiler@toradex.com>
>>
>> Since commit f8f8f1d04494 ("clk: Don't touch hardware when reparenting
>> during registration") ULPI has been broken on Tegra20 leading to the
>> following error message during boot:
>>
>> [    1.974698] ulpi_phy_power_on: ulpi write failed
>> [    1.979384] tegra-ehci c5004000.usb: Failed to power on the phy
>> [    1.985434] tegra-ehci: probe of c5004000.usb failed with error -110
>>
>> Debugging through the changes and finally also consulting the TRM
>> revealed that rather than the CDEV2 clock off OSC requiring such pin
>> muxing actually the PLL_P_OUT4 clock is in use. It looks like so far it
>> just worked by chance of that one having been enabled which Stephen's
>> commit now changed when reparenting sclk away from pll_p_out4 leaving
>> that one disabled. Fix this by properly assigning the PLL_P_OUT4 clock
>> as the ULPI PHY clock.
> 
> I booted 4.17-rc1 (which includes this fix) on an AC100 (T20 paz00 board) and 
> the error above is still there. Surprisingly the error vanishes when I revert 
> your patch. So this patch actually *causes* the problem above on my board. 
> Could it be, that we need all four clocks? Dimitry mentioned on IRC that it 
> could also be a problem in the clock init table. I don't have the technical 
> background myself to fix it, but I still wonder what could be so different 
> between TrimSlice and AC100.

I managed to find CDEV clocks in TRM this time. Seems assigning CDEV2 clock to
"ulpi-link" was correct and both CDEV2 and PLL_P_OUT4 should be enabled, CDEV2
should gate the PLL_P_OUT4 that feeds USB HW and PLL_P_OUT4 should be
always-enabled because it is enabled by init_table, but apparently it is getting
disabled erroneously.

Marcel, could you please revert your patch, add
"trace_event=clk_enable,clk_disable,clk_set_parent tp_printk" to kernels cmdline
and post the log?

It looks like there is some clk framework bug, but just in case please also try
to apply this patch:

diff --git a/drivers/clk/tegra/clk-tegra-periph.c
b/drivers/clk/tegra/clk-tegra-periph.c
index 2acba2986bc6..407bd0c0ac2f 100644
--- a/drivers/clk/tegra/clk-tegra-periph.c
+++ b/drivers/clk/tegra/clk-tegra-periph.c
@@ -1024,7 +1024,7 @@ static void __init init_pllp(void __iomem *clk_base, void
__iomem *pmc_base,
                if (dt_clk) {
                        clk = tegra_clk_register_pll_out("pll_p_out4",
                                        "pll_p_out4_div", clk_base + PLLP_OUTB,
-                                       17, 16, CLK_IGNORE_UNUSED |
+                                       17, 16, CLK_IS_CRITICAL |
                                        CLK_SET_RATE_PARENT, 0,
                                        &PLLP_OUTB_lock);
                        *dt_clk = clk;
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Marcel Ziswiler April 23, 2018, 3:42 p.m. UTC | #3
Hi Marc

On Fri, 2018-04-20 at 10:52 +0200, Marc Dietrich wrote:
> 
> ...
> I booted 4.17-rc1 (which includes this fix) on an AC100 (T20 paz00
> board) and 
> the error above is still there. Surprisingly the error vanishes when
> I revert 
> your patch. So this patch actually *causes* the problem above on my
> board.

That's really strange.

I believe I do have one of them paz00 boards laying around somewhere as
well. Just need to dig it out again and will give it a try. Looking at
their schematics at least reveals the exact same circuit as found on
all other T20 based boards using DAP_MCLK2 as REFCLK to the USB3315C
which BTW is 24 MHz and not 26 MHz as CDEV2 claims!

> Could it be, that we need all four clocks? Dimitry mentioned on IRC
> that it 
> could also be a problem in the clock init table. I don't have the
> technical 
> background myself to fix it, but I still wonder what could be so
> different 
> between TrimSlice and AC100.

I am wondering the same. However I still suspect that something is
completely wrong in that area as that CDEV2 clock is completely bogus.
It really does not exist!

> Marc

Cheers

Marcel
Dmitry Osipenko April 24, 2018, 2:38 p.m. UTC | #4
Hi Marcel,

On 24.04.2018 01:05, Marcel Ziswiler wrote:
> Hi Dmitry
> 
> On Fri, 2018-04-20 at 13:50 +0300, Dmitry Osipenko wrote:
>> ...
>> I managed to find CDEV clocks in TRM this time.
> 
> And where exactly in which TRM? In all my attempts at finding anything
> CDEV2 is just a pad group which includes the DAP_MCLK2 pad in question. 

That's the DEV2 clock you're talking about below.

>> Seems assigning CDEV2 clock to
>> "ulpi-link" was correct
> 
> Sorry, but I do really not see how you can get to any such conclusion.
> 
> Whatever that CDEV2 clock exactly is. Its offset 93 points towards the
> CLK_RST_CONTROLLER_CLK_OUT_ENB_U_0 register which has CLK_ENB_DEV2_OUT
> at bit position 29 reading "Enable clock to DEV2 pad". While the TRM
> misses further documenting what exactly that DEV2 pad should be I guess
> it may point towards our suspected DAP_MCLK2. So for some reason
> besides PLL_P_OUT4 which is what that pad is actually muxed to also
> some additional DEV2 pad clock needs enabling. In addition there would
> be a CLK_RST_CONTROLLER_MISC_CLK_ENB_0 register where one could also
> specify a DEV2_OSC_DIV_SEL but I would assume that one only applies if
> the pad in question being muxed to OSC which is not the case at least
> in all device trees I have looked at.
> 
>> and both CDEV2 and PLL_P_OUT4 should be enabled,
> 
> Agreed, basically the DAP_MCLK2 pad seems gated by an additional enable
> called CLK_ENB_DEV2_OUT.
> 
>> CDEV2
>> should gate the PLL_P_OUT4 that feeds USB HW and PLL_P_OUT4 should be
>> always-enabled because it is enabled by init_table, but apparently it
>> is getting
>> disabled erroneously.
> 
> At least that was the case back when I had trouble getting ULPI to work
> on T20. Strangely latest -next right now does no longer seem to suffer
> from that same issue even if my patch is reverted but as mentioned
> before nobody stops the required PLL_P_OUT4 which is what is actually
> driving DAP_MCLK2 to not be changed behind the scenes breaking the
> whole thing again.
> 
>> Marcel, could you please revert your patch, add
>> "trace_event=clk_enable,clk_disable,clk_set_parent tp_printk" to
>> kernels cmdline
>> and post the log?
> 
> Yes, the only difference in those traces is whether or not that non-
> existent CDEV2 clock is enabled or not:
> 
> [    1.897521] clk_enable: cdev2_fixed
> [    1.901008] clk_enable: cdev2
> 
> I also do have an explanation why it kept working in my case. Probably
> the boot ROM or U-Boot is already setting that bit.
> 
>> It looks like there is some clk framework bug,
> 
> My conclusion is that there should be a DAP_MCLK2 clock which is gated
> by that CLK_ENB_DEV2_OUT but really outputs a T20 internal clock
> according to its pin muxing which if set to OSC may be further divided
> down by DEV2_OSC_DIV_SEL.

Could be that DEV2_OSC_DIV_SEL is only relevant when DAP_MCLK2 is mux'ed to OSC.
Maybe Peter could clarify everything.

>> but just in case please also try
>> to apply this patch:
>>
>> diff --git a/drivers/clk/tegra/clk-tegra-periph.c
>> b/drivers/clk/tegra/clk-tegra-periph.c
>> index 2acba2986bc6..407bd0c0ac2f 100644
>> --- a/drivers/clk/tegra/clk-tegra-periph.c
>> +++ b/drivers/clk/tegra/clk-tegra-periph.c
>> @@ -1024,7 +1024,7 @@ static void __init init_pllp(void __iomem
>> *clk_base, void
>> __iomem *pmc_base,
>>                 if (dt_clk) {
>>                         clk =
>> tegra_clk_register_pll_out("pll_p_out4",
>>                                         "pll_p_out4_div", clk_base +
>> PLLP_OUTB,
>> -                                       17, 16, CLK_IGNORE_UNUSED |
>> +                                       17, 16, CLK_IS_CRITICAL |
>>                                         CLK_SET_RATE_PARENT, 0,
>>                                         &PLLP_OUTB_lock);
>>                         *dt_clk = clk;
> 
> I did not have to do any such but maybe that would at least prevent any
> further issues on PLL_P_OUT4. However I still believe this is kind of
> wrong as PLL_P_OUT4 is what drives DAP_MCLK2 in its current muxing
> which is connected to the ULPI transceivers REFCLK pin.
> 
> BTW: That pin is usually to be driven at 24 MHz and not 26 MHz which
> CDEV2 claims to be at.

Pin is driven by the PLL_P_OUT4, which is set to 24 MHz via the init_table. The
clock tree is invalid in that regards because Tegra's clock driver defines
non-existent "cdev2_fixed" 26 MHz clock source and sets it as a parent for the
"cdev2" clock, while "cdev2" should be a gate (and maybe divider?) for the real
parent clock (PLL_P_OUT4 in our case).

> What do you think?

It looks to me that a clock signal coming from the mux'ed CDEV2 pin is routed to
the DEV2 of CLK controller (which can gate it) and then it goes out to DAP_MCLK2.

PLL_P_OUT4 ---> PINMUX_CDEV2 ---> CLK_DEV2 ---> DAP_MCLK2

But it also could be that CLK_ENB_DEV2_OUT is indeed some internal
pinmux-related clock-gate. I think Peter should have a clue.

Anyway the implementation details do not really matter for us, what matters is
that PLL_P_OUT4 clk and DEV2 clk-gate must be enabled.

Seems indeed ideally would be to have DAP_MCLK2 for the USB controller's
"ulpi-link" clock and then DEV2 will be enable bit for the DAP_MCLK2. But also
information about parent clock of the DAP_MCLK2 needs to be conducted to the clk
driver via DT, that probably would require backward-incompatible change of the
binding which is undesirable.

One tolerable solution could be to remove fake "cdev2_fixed" clock in the driver
and hardcode PLL_P_OUT4 for the parent of "cdev2". (Note that cdev1 also has a
fake parent)

index 0ee56dd04cec..4708653dedeb 100644
--- a/drivers/clk/tegra/clk-tegra20.c
+++ b/drivers/clk/tegra/clk-tegra20.c
@@ -838,8 +838,7 @@ static void __init tegra20_periph_clk_init(void)
        clks[TEGRA20_CLK_CDEV1] = clk;

        /* cdev2 */
-       clk = clk_register_fixed_rate(NULL, "cdev2_fixed", NULL, 0, 26000000);
-       clk = tegra_clk_register_periph_gate("cdev2", "cdev2_fixed", 0,
+       clk = tegra_clk_register_periph_gate("cdev2", "pll_p_out4", 0,
                                    clk_base, 0, 93, periph_clk_enb_refcnt);
        clks[TEGRA20_CLK_CDEV2] = clk;


Now the real problem is that PLL_P_OUT4 was getting disabled. We had CDEV2 clock
in the DT for "ulpi-link" and PLL_P_OUT4 was permanently enabled (supposedly) by
the Tegra's clock driver using init_table. Apparently PLL_P_OUT4 was disabled on
SCLK re-parenting, as you mentioned in the commit description. This should be
considered as a bug because one of two purposes (permanently enable and setup
default clock rate) of the init_table is defeated.

Could you please try to bisect to the point when erroneous PLL_P_OUT4 disable
issue is fixed->broken? It is quite important to know what commit changed the
behaviour. If it was some common clk issue that got fixed - that should be a
good case for us, if it was some unrelated change that obscured the issue -
that's not good and we should go back and fix the real bug.
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Peter De Schrijver April 26, 2018, 11:39 a.m. UTC | #5
On Tue, Apr 24, 2018 at 05:38:41PM +0300, Dmitry Osipenko wrote:
> Hi Marcel,
> 
> On 24.04.2018 01:05, Marcel Ziswiler wrote:
> > Hi Dmitry
> > 
> > On Fri, 2018-04-20 at 13:50 +0300, Dmitry Osipenko wrote:
> >> ...
> >> I managed to find CDEV clocks in TRM this time.
> > 
> > And where exactly in which TRM? In all my attempts at finding anything
> > CDEV2 is just a pad group which includes the DAP_MCLK2 pad in question. 
> 
> That's the DEV2 clock you're talking about below.
> 
> >> Seems assigning CDEV2 clock to
> >> "ulpi-link" was correct
> > 
> > Sorry, but I do really not see how you can get to any such conclusion.
> > 
> > Whatever that CDEV2 clock exactly is. Its offset 93 points towards the
> > CLK_RST_CONTROLLER_CLK_OUT_ENB_U_0 register which has CLK_ENB_DEV2_OUT
> > at bit position 29 reading "Enable clock to DEV2 pad". While the TRM
> > misses further documenting what exactly that DEV2 pad should be I guess
> > it may point towards our suspected DAP_MCLK2. So for some reason
> > besides PLL_P_OUT4 which is what that pad is actually muxed to also
> > some additional DEV2 pad clock needs enabling. In addition there would
> > be a CLK_RST_CONTROLLER_MISC_CLK_ENB_0 register where one could also
> > specify a DEV2_OSC_DIV_SEL but I would assume that one only applies if
> > the pad in question being muxed to OSC which is not the case at least
> > in all device trees I have looked at.
> > 
> >> and both CDEV2 and PLL_P_OUT4 should be enabled,
> > 
> > Agreed, basically the DAP_MCLK2 pad seems gated by an additional enable
> > called CLK_ENB_DEV2_OUT.
> > 
> >> CDEV2
> >> should gate the PLL_P_OUT4 that feeds USB HW and PLL_P_OUT4 should be
> >> always-enabled because it is enabled by init_table, but apparently it
> >> is getting
> >> disabled erroneously.
> > 
> > At least that was the case back when I had trouble getting ULPI to work
> > on T20. Strangely latest -next right now does no longer seem to suffer
> > from that same issue even if my patch is reverted but as mentioned
> > before nobody stops the required PLL_P_OUT4 which is what is actually
> > driving DAP_MCLK2 to not be changed behind the scenes breaking the
> > whole thing again.
> > 
> >> Marcel, could you please revert your patch, add
> >> "trace_event=clk_enable,clk_disable,clk_set_parent tp_printk" to
> >> kernels cmdline
> >> and post the log?
> > 
> > Yes, the only difference in those traces is whether or not that non-
> > existent CDEV2 clock is enabled or not:
> > 
> > [    1.897521] clk_enable: cdev2_fixed
> > [    1.901008] clk_enable: cdev2
> > 
> > I also do have an explanation why it kept working in my case. Probably
> > the boot ROM or U-Boot is already setting that bit.
> > 
> >> It looks like there is some clk framework bug,
> > 
> > My conclusion is that there should be a DAP_MCLK2 clock which is gated
> > by that CLK_ENB_DEV2_OUT but really outputs a T20 internal clock
> > according to its pin muxing which if set to OSC may be further divided
> > down by DEV2_OSC_DIV_SEL.
> 
> Could be that DEV2_OSC_DIV_SEL is only relevant when DAP_MCLK2 is mux'ed to OSC.
> Maybe Peter could clarify everything.
> 

Yes. This looks likely. Problem here is that the source selection for cdev1
and cdev2 is done via pinmuxing rather than via a regular mux control in CAR.
So probably the "OSC" input is not really OSC but the output of the divider
configured in CLK_RST_CONTROLLER_MISC_CLK_ENB_0.

Possible solutions I see are:

1) make the tegra20 pinctrl driver a clock provider for the cdev1 and cdev2
   clocks. We would then need to introduce 2 new clocks cdev1_mux and
   cdev2_mux which represent pinmuxes. cdev1 and cdev2 would use those as
   a parent. cdev1_fixed and cdev2_fixed would need to be renamed to refelect
   the settings in CLK_RST_CONTROLLER_MISC_CLK_ENB_0 and can be inputs for
   cdev1_mux and cdev2_mux

2) read the pinmux configuration for cdev1 and cdev2 at boot. this would
   require mapping the apbmisc aperture in the clock driver, but we already
   map the PMC aperture, so there's a precedent for mappings outside CAR.
   Then define the cdev1 and cdev2 parent accordingly. Also cdev1_fixed and
   cdev2_fixed could be defined as the divided version of OSC according to
   the settings in CLK_RST_CONTROLLER_MISC_CLK_ENB_0. Limitation of this
   approach is that we assume these settings are never changed at runtime.
   This seems acceptable to me though. Or do we know of a use case where the
   setting is changed at runtime?

Solution 1 is nicer, but would require backwards-incompatible changes to the
DT bindings.

Peter.
--
To unsubscribe from this list: send the line "unsubscribe linux-tegra" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox series

Patch

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index 864a95872b8d..e05b6bb2599f 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -741,7 +741,7 @@ 
 		phy_type = "ulpi";
 		clocks = <&tegra_car TEGRA20_CLK_USB2>,
 			 <&tegra_car TEGRA20_CLK_PLL_U>,
-			 <&tegra_car TEGRA20_CLK_CDEV2>;
+			 <&tegra_car TEGRA20_CLK_PLL_P_OUT4>;
 		clock-names = "reg", "pll_u", "ulpi-link";
 		resets = <&tegra_car 58>, <&tegra_car 22>;
 		reset-names = "usb", "utmi-pads";