[v2] rtc: ac100: Fix ac100 determine rate bug

Message ID 20180213121414.7000-1-embed3d@gmail.com
State Superseded
Headers show
Series
  • [v2] rtc: ac100: Fix ac100 determine rate bug
Related show

Commit Message

Philipp Rossak Feb. 13, 2018, 12:14 p.m.
This patch fixes a bug, that prevents the Allwinner A83T and the A80
from a successful boot. You can find the shortend trace below:

Unable to handle kernel NULL pointer dereference at virtual address
00000000
pgd = (ptrval)
[00000000] *pgd=00000000
Internal error: Oops: 5 [#1] SMP ARM
Modules linked in:
CPU: 0 PID: 49 Comm: kworker/0:1 Not tainted 4.15.0-10190-gb89e32ccd1be #2
Hardware name: Allwinner sun8i Family
Workqueue: events deferred_probe_work_func
PC is at clk_hw_get_rate+0x0/0x34
LR is at ac100_clkout_determine_rate+0x48/0x19c

[ ... ]

(clk_hw_get_rate) from (ac100_clkout_determine_rate+0x48/0x19c)
(ac100_clkout_determine_rate) from  (clk_core_set_rate_nolock+0x3c/0x1a0)
(clk_core_set_rate_nolock) from (clk_set_rate+0x30/0x88)
(clk_set_rate) from (of_clk_set_defaults+0x200/0x364)
(of_clk_set_defaults) from (platform_drv_probe+0x18/0xb0)

To fix that bug, we first check if the return of the
clk_hw_get_parent_by_index is non zero. If it is zero we skip that
clock parent.

The BUG report could be found here: https://lkml.org/lkml/2018/2/10/198

Fixes: 04940631b8d2 ("rtc: ac100: Add clk output support")

Signed-off-by: Philipp Rossak <embed3d@gmail.com>
---

Changes in v2:
	* add tag Fixes: ... to commit message
	* add comment to if statement why we are doing this check

 drivers/rtc/rtc-ac100.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

Comments

Maxime Ripard Feb. 13, 2018, 1:32 p.m. | #1
On Tue, Feb 13, 2018 at 01:14:14PM +0100, Philipp Rossak wrote:
> This patch fixes a bug, that prevents the Allwinner A83T and the A80
> from a successful boot. You can find the shortend trace below:

Since when is it there?

> Unable to handle kernel NULL pointer dereference at virtual address
> 00000000
> pgd = (ptrval)
> [00000000] *pgd=00000000
> Internal error: Oops: 5 [#1] SMP ARM
> Modules linked in:
> CPU: 0 PID: 49 Comm: kworker/0:1 Not tainted 4.15.0-10190-gb89e32ccd1be #2
> Hardware name: Allwinner sun8i Family
> Workqueue: events deferred_probe_work_func
> PC is at clk_hw_get_rate+0x0/0x34
> LR is at ac100_clkout_determine_rate+0x48/0x19c
> 
> [ ... ]
> 
> (clk_hw_get_rate) from (ac100_clkout_determine_rate+0x48/0x19c)
> (ac100_clkout_determine_rate) from  (clk_core_set_rate_nolock+0x3c/0x1a0)
> (clk_core_set_rate_nolock) from (clk_set_rate+0x30/0x88)
> (clk_set_rate) from (of_clk_set_defaults+0x200/0x364)
> (of_clk_set_defaults) from (platform_drv_probe+0x18/0xb0)
> 
> To fix that bug, we first check if the return of the
> clk_hw_get_parent_by_index is non zero. If it is zero we skip that
> clock parent.
> 
> The BUG report could be found here: https://lkml.org/lkml/2018/2/10/198
> 
> Fixes: 04940631b8d2 ("rtc: ac100: Add clk output support")

Should it be sent to stable?

> Signed-off-by: Philipp Rossak <embed3d@gmail.com>
> ---
> 
> Changes in v2:
> 	* add tag Fixes: ... to commit message
> 	* add comment to if statement why we are doing this check
> 
>  drivers/rtc/rtc-ac100.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/rtc/rtc-ac100.c b/drivers/rtc/rtc-ac100.c
> index 8ff9dc3fe5bf..ba73201d8cc1 100644
> --- a/drivers/rtc/rtc-ac100.c
> +++ b/drivers/rtc/rtc-ac100.c
> @@ -183,7 +183,17 @@ static int ac100_clkout_determine_rate(struct clk_hw *hw,
>  
>  	for (i = 0; i < num_parents; i++) {
>  		struct clk_hw *parent = clk_hw_get_parent_by_index(hw, i);
> -		unsigned long tmp, prate = clk_hw_get_rate(parent);
> +		unsigned long tmp, prate;
> +
> +		/*
> +		 * We purposefully left open the possibility to use the clock
> +		 * from the codec side but it is not implemented right now.
> +		 * Thus we need to check if the parent exists.
> +		 */
> +		if (!parent)
> +			continue;
> +
> +		prate = clk_hw_get_rate(parent);

clk_hw_get_num_parents should return the exact number of parents,
which is going to be 1 if you only have one parent, like all DTS seems
to have.

If not, then it should be explained in the comment and / or fixed
properly.

Maxime
Chen-Yu Tsai Feb. 13, 2018, 1:44 p.m. | #2
On Tue, Feb 13, 2018 at 9:32 PM, Maxime Ripard
<maxime.ripard@bootlin.com> wrote:
> On Tue, Feb 13, 2018 at 01:14:14PM +0100, Philipp Rossak wrote:
>> This patch fixes a bug, that prevents the Allwinner A83T and the A80
>> from a successful boot. You can find the shortend trace below:
>
> Since when is it there?
>
>> Unable to handle kernel NULL pointer dereference at virtual address
>> 00000000
>> pgd = (ptrval)
>> [00000000] *pgd=00000000
>> Internal error: Oops: 5 [#1] SMP ARM
>> Modules linked in:
>> CPU: 0 PID: 49 Comm: kworker/0:1 Not tainted 4.15.0-10190-gb89e32ccd1be #2
>> Hardware name: Allwinner sun8i Family
>> Workqueue: events deferred_probe_work_func
>> PC is at clk_hw_get_rate+0x0/0x34
>> LR is at ac100_clkout_determine_rate+0x48/0x19c
>>
>> [ ... ]
>>
>> (clk_hw_get_rate) from (ac100_clkout_determine_rate+0x48/0x19c)
>> (ac100_clkout_determine_rate) from  (clk_core_set_rate_nolock+0x3c/0x1a0)
>> (clk_core_set_rate_nolock) from (clk_set_rate+0x30/0x88)
>> (clk_set_rate) from (of_clk_set_defaults+0x200/0x364)
>> (of_clk_set_defaults) from (platform_drv_probe+0x18/0xb0)
>>
>> To fix that bug, we first check if the return of the
>> clk_hw_get_parent_by_index is non zero. If it is zero we skip that
>> clock parent.
>>
>> The BUG report could be found here: https://lkml.org/lkml/2018/2/10/198
>>
>> Fixes: 04940631b8d2 ("rtc: ac100: Add clk output support")
>
> Should it be sent to stable?
>
>> Signed-off-by: Philipp Rossak <embed3d@gmail.com>
>> ---
>>
>> Changes in v2:
>>       * add tag Fixes: ... to commit message
>>       * add comment to if statement why we are doing this check
>>
>>  drivers/rtc/rtc-ac100.c | 12 +++++++++++-
>>  1 file changed, 11 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/rtc/rtc-ac100.c b/drivers/rtc/rtc-ac100.c
>> index 8ff9dc3fe5bf..ba73201d8cc1 100644
>> --- a/drivers/rtc/rtc-ac100.c
>> +++ b/drivers/rtc/rtc-ac100.c
>> @@ -183,7 +183,17 @@ static int ac100_clkout_determine_rate(struct clk_hw *hw,
>>
>>       for (i = 0; i < num_parents; i++) {
>>               struct clk_hw *parent = clk_hw_get_parent_by_index(hw, i);
>> -             unsigned long tmp, prate = clk_hw_get_rate(parent);
>> +             unsigned long tmp, prate;
>> +
>> +             /*
>> +              * We purposefully left open the possibility to use the clock
>> +              * from the codec side but it is not implemented right now.
>> +              * Thus we need to check if the parent exists.
>> +              */
>> +             if (!parent)
>> +                     continue;
>> +
>> +             prate = clk_hw_get_rate(parent);
>
> clk_hw_get_num_parents should return the exact number of parents,
> which is going to be 1 if you only have one parent, like all DTS seems
> to have.
>
> If not, then it should be explained in the comment and / or fixed
> properly.

The clock has two parents. One is a fixed clock internally registered
by the driver. This is actually an external crystal, and we should
probably add a device node and the works for it. The other parent
is a clock from the codec side, which we properly declare and
reference in the device tree. This clock, though defined, is not
implemented in any driver (because we don't have any ATM).

This second missing clock is what's causing issues here. The clk core
looks for the parent by name, can't find one that is registered, and
returns NULL.

I guess the comment above is still not clear enough?

ChenYu

> Maxime
>
> --
> Maxime Ripard, Bootlin (formerly Free Electrons)
> Embedded Linux and Kernel engineering
> http://bootlin.com
Philipp Rossak Feb. 13, 2018, 3:23 p.m. | #3
On 13.02.2018 14:44, Chen-Yu Tsai wrote:
> On Tue, Feb 13, 2018 at 9:32 PM, Maxime Ripard
> <maxime.ripard@bootlin.com> wrote:
>> On Tue, Feb 13, 2018 at 01:14:14PM +0100, Philipp Rossak wrote:
>>> This patch fixes a bug, that prevents the Allwinner A83T and the A80
>>> from a successful boot. You can find the shortend trace below:
>>
>> Since when is it there?
>>
The bug is there since v4.16-rc1 and appeared after the clk branch was 
merged.

^^ Should I add this info also in the commit message?

>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 00000000
>>> pgd = (ptrval)
>>> [00000000] *pgd=00000000
>>> Internal error: Oops: 5 [#1] SMP ARM
>>> Modules linked in:
>>> CPU: 0 PID: 49 Comm: kworker/0:1 Not tainted 4.15.0-10190-gb89e32ccd1be #2
>>> Hardware name: Allwinner sun8i Family
>>> Workqueue: events deferred_probe_work_func
>>> PC is at clk_hw_get_rate+0x0/0x34
>>> LR is at ac100_clkout_determine_rate+0x48/0x19c
>>>
>>> [ ... ]
>>>
>>> (clk_hw_get_rate) from (ac100_clkout_determine_rate+0x48/0x19c)
>>> (ac100_clkout_determine_rate) from  (clk_core_set_rate_nolock+0x3c/0x1a0)
>>> (clk_core_set_rate_nolock) from (clk_set_rate+0x30/0x88)
>>> (clk_set_rate) from (of_clk_set_defaults+0x200/0x364)
>>> (of_clk_set_defaults) from (platform_drv_probe+0x18/0xb0)
>>>
>>> To fix that bug, we first check if the return of the
>>> clk_hw_get_parent_by_index is non zero. If it is zero we skip that
>>> clock parent.
>>>
>>> The BUG report could be found here: https://lkml.org/lkml/2018/2/10/198
>>>
>>> Fixes: 04940631b8d2 ("rtc: ac100: Add clk output support")
>>
>> Should it be sent to stable?
>>
>>> Signed-off-by: Philipp Rossak <embed3d@gmail.com>
>>> ---
>>>
>>> Changes in v2:
>>>        * add tag Fixes: ... to commit message
>>>        * add comment to if statement why we are doing this check
>>>
>>>   drivers/rtc/rtc-ac100.c | 12 +++++++++++-
>>>   1 file changed, 11 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/rtc/rtc-ac100.c b/drivers/rtc/rtc-ac100.c
>>> index 8ff9dc3fe5bf..ba73201d8cc1 100644
>>> --- a/drivers/rtc/rtc-ac100.c
>>> +++ b/drivers/rtc/rtc-ac100.c
>>> @@ -183,7 +183,17 @@ static int ac100_clkout_determine_rate(struct clk_hw *hw,
>>>
>>>        for (i = 0; i < num_parents; i++) {
>>>                struct clk_hw *parent = clk_hw_get_parent_by_index(hw, i);
>>> -             unsigned long tmp, prate = clk_hw_get_rate(parent);
>>> +             unsigned long tmp, prate;
>>> +
>>> +             /*
>>> +              * We purposefully left open the possibility to use the clock
>>> +              * from the codec side but it is not implemented right now.
>>> +              * Thus we need to check if the parent exists.
>>> +              */
>>> +             if (!parent)
>>> +                     continue;
>>> +
>>> +             prate = clk_hw_get_rate(parent);
>>
>> clk_hw_get_num_parents should return the exact number of parents,
>> which is going to be 1 if you only have one parent, like all DTS seems
>> to have.
>>
>> If not, then it should be explained in the comment and / or fixed
>> properly.
> 
> The clock has two parents. One is a fixed clock internally registered
> by the driver. This is actually an external crystal, and we should
> probably add a device node and the works for it. The other parent
> is a clock from the codec side, which we properly declare and
> reference in the device tree. This clock, though defined, is not
> implemented in any driver (because we don't have any ATM).
> 
> This second missing clock is what's causing issues here. The clk core
> looks for the parent by name, can't find one that is registered, and
> returns NULL.
> 
> I guess the comment above is still not clear enough?

I can get more detailed in the comment. I thought about this:

The clock has two parents, one is a fixed clock which is internally 
registered by the ac100 driver. The other parent is a clock from the 
codec side of the chip, which we properly declare and reference in the 
devicetree and is not implemented in any driver right now.
If the clock core looks for the parent of that second missing clock, it 
can't one that is registered and returns NULL.
Thus we need to check if the parent exists before we get the parent rate.

Is that ok for you?

Philipp

> 
> ChenYu
> 
>> Maxime
>>
>> --
>> Maxime Ripard, Bootlin (formerly Free Electrons)
>> Embedded Linux and Kernel engineering
>> http://bootlin.com
Maxime Ripard Feb. 14, 2018, 12:15 p.m. | #4
On Tue, Feb 13, 2018 at 04:23:00PM +0100, Philipp Rossak wrote:
> 
> 
> On 13.02.2018 14:44, Chen-Yu Tsai wrote:
> > On Tue, Feb 13, 2018 at 9:32 PM, Maxime Ripard
> > <maxime.ripard@bootlin.com> wrote:
> > > On Tue, Feb 13, 2018 at 01:14:14PM +0100, Philipp Rossak wrote:
> > > > This patch fixes a bug, that prevents the Allwinner A83T and the A80
> > > > from a successful boot. You can find the shortend trace below:
> > > 
> > > Since when is it there?
> > > 
> The bug is there since v4.16-rc1 and appeared after the clk branch was
> merged.
> 
> ^^ Should I add this info also in the commit message?

Yes

> > > > Unable to handle kernel NULL pointer dereference at virtual address
> > > > 00000000
> > > > pgd = (ptrval)
> > > > [00000000] *pgd=00000000
> > > > Internal error: Oops: 5 [#1] SMP ARM
> > > > Modules linked in:
> > > > CPU: 0 PID: 49 Comm: kworker/0:1 Not tainted 4.15.0-10190-gb89e32ccd1be #2
> > > > Hardware name: Allwinner sun8i Family
> > > > Workqueue: events deferred_probe_work_func
> > > > PC is at clk_hw_get_rate+0x0/0x34
> > > > LR is at ac100_clkout_determine_rate+0x48/0x19c
> > > > 
> > > > [ ... ]
> > > > 
> > > > (clk_hw_get_rate) from (ac100_clkout_determine_rate+0x48/0x19c)
> > > > (ac100_clkout_determine_rate) from  (clk_core_set_rate_nolock+0x3c/0x1a0)
> > > > (clk_core_set_rate_nolock) from (clk_set_rate+0x30/0x88)
> > > > (clk_set_rate) from (of_clk_set_defaults+0x200/0x364)
> > > > (of_clk_set_defaults) from (platform_drv_probe+0x18/0xb0)
> > > > 
> > > > To fix that bug, we first check if the return of the
> > > > clk_hw_get_parent_by_index is non zero. If it is zero we skip that
> > > > clock parent.
> > > > 
> > > > The BUG report could be found here: https://lkml.org/lkml/2018/2/10/198
> > > > 
> > > > Fixes: 04940631b8d2 ("rtc: ac100: Add clk output support")
> > > 
> > > Should it be sent to stable?
> > > 
> > > > Signed-off-by: Philipp Rossak <embed3d@gmail.com>
> > > > ---
> > > > 
> > > > Changes in v2:
> > > >        * add tag Fixes: ... to commit message
> > > >        * add comment to if statement why we are doing this check
> > > > 
> > > >   drivers/rtc/rtc-ac100.c | 12 +++++++++++-
> > > >   1 file changed, 11 insertions(+), 1 deletion(-)
> > > > 
> > > > diff --git a/drivers/rtc/rtc-ac100.c b/drivers/rtc/rtc-ac100.c
> > > > index 8ff9dc3fe5bf..ba73201d8cc1 100644
> > > > --- a/drivers/rtc/rtc-ac100.c
> > > > +++ b/drivers/rtc/rtc-ac100.c
> > > > @@ -183,7 +183,17 @@ static int ac100_clkout_determine_rate(struct clk_hw *hw,
> > > > 
> > > >        for (i = 0; i < num_parents; i++) {
> > > >                struct clk_hw *parent = clk_hw_get_parent_by_index(hw, i);
> > > > -             unsigned long tmp, prate = clk_hw_get_rate(parent);
> > > > +             unsigned long tmp, prate;
> > > > +
> > > > +             /*
> > > > +              * We purposefully left open the possibility to use the clock
> > > > +              * from the codec side but it is not implemented right now.
> > > > +              * Thus we need to check if the parent exists.
> > > > +              */
> > > > +             if (!parent)
> > > > +                     continue;
> > > > +
> > > > +             prate = clk_hw_get_rate(parent);
> > > 
> > > clk_hw_get_num_parents should return the exact number of parents,
> > > which is going to be 1 if you only have one parent, like all DTS seems
> > > to have.
> > > 
> > > If not, then it should be explained in the comment and / or fixed
> > > properly.
> > 
> > The clock has two parents. One is a fixed clock internally registered
> > by the driver. This is actually an external crystal, and we should
> > probably add a device node and the works for it. The other parent
> > is a clock from the codec side, which we properly declare and
> > reference in the device tree. This clock, though defined, is not
> > implemented in any driver (because we don't have any ATM).
> > 
> > This second missing clock is what's causing issues here. The clk core
> > looks for the parent by name, can't find one that is registered, and
> > returns NULL.
> > 
> > I guess the comment above is still not clear enough?
> 
> I can get more detailed in the comment. I thought about this:
> 
> The clock has two parents, one is a fixed clock which is internally
> registered by the ac100 driver. The other parent is a clock from the codec
> side of the chip, which we properly declare and reference in the devicetree
> and is not implemented in any driver right now.
> If the clock core looks for the parent of that second missing clock, it
> can't one that is registered and returns NULL.
> Thus we need to check if the parent exists before we get the parent rate.
> 
> Is that ok for you?

Yep, much better thanks!
Maxime

Patch

diff --git a/drivers/rtc/rtc-ac100.c b/drivers/rtc/rtc-ac100.c
index 8ff9dc3fe5bf..ba73201d8cc1 100644
--- a/drivers/rtc/rtc-ac100.c
+++ b/drivers/rtc/rtc-ac100.c
@@ -183,7 +183,17 @@  static int ac100_clkout_determine_rate(struct clk_hw *hw,
 
 	for (i = 0; i < num_parents; i++) {
 		struct clk_hw *parent = clk_hw_get_parent_by_index(hw, i);
-		unsigned long tmp, prate = clk_hw_get_rate(parent);
+		unsigned long tmp, prate;
+
+		/*
+		 * We purposefully left open the possibility to use the clock
+		 * from the codec side but it is not implemented right now.
+		 * Thus we need to check if the parent exists.
+		 */
+		if (!parent)
+			continue;
+
+		prate = clk_hw_get_rate(parent);
 
 		tmp = ac100_clkout_round_rate(hw, req->rate, prate);