diff mbox series

[v2,03/15] pwm: lpss: Add range limit check for the base_unit register value

Message ID 20200607181840.13536-4-hdegoede@redhat.com
State Superseded
Headers show
Series pwm/i915: Convert pwm-crc and i915 driver's PWM code to use the atomic PWM API | expand

Commit Message

Hans de Goede June 7, 2020, 6:18 p.m. UTC
When the user requests a high enough period ns value, then the
calculations in pwm_lpss_prepare() might result in a base_unit value of 0.

But according to the data-sheet the way the PWM controller works is that
each input clock-cycle the base_unit gets added to a N bit counter and
that counter overflowing determines the PWM output frequency. Adding 0
to the counter is a no-op. The data-sheet even explicitly states that
writing 0 to the base_unit bits will result in the PWM outputting a
continuous 0 signal.

base_unit values > (base_unit_range / 256), or iow base_unit values using
the 8 most significant bits, cause loss of resolution of the duty-cycle.
E.g. assuming a base_unit_range of 65536 steps, then a base_unit value of
768 (256 * 3), limits the duty-cycle resolution to 65536 / 768 = 85 steps.
Clamp the max base_unit value to base_unit_range / 32 to ensure a
duty-cycle resolution of at least 32 steps. This limits the maximum
output frequency to 600 KHz / 780 KHz depending on the base clock.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 drivers/pwm/pwm-lpss.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Andy Shevchenko June 8, 2020, 3:50 a.m. UTC | #1
On Sun, Jun 07, 2020 at 08:18:28PM +0200, Hans de Goede wrote:
> When the user requests a high enough period ns value, then the
> calculations in pwm_lpss_prepare() might result in a base_unit value of 0.
> 
> But according to the data-sheet the way the PWM controller works is that
> each input clock-cycle the base_unit gets added to a N bit counter and
> that counter overflowing determines the PWM output frequency. Adding 0
> to the counter is a no-op. The data-sheet even explicitly states that
> writing 0 to the base_unit bits will result in the PWM outputting a
> continuous 0 signal.

So, and why it's a problem?

> base_unit values > (base_unit_range / 256), or iow base_unit values using
> the 8 most significant bits, cause loss of resolution of the duty-cycle.
> E.g. assuming a base_unit_range of 65536 steps, then a base_unit value of
> 768 (256 * 3), limits the duty-cycle resolution to 65536 / 768 = 85 steps.
> Clamp the max base_unit value to base_unit_range / 32 to ensure a
> duty-cycle resolution of at least 32 steps. This limits the maximum
> output frequency to 600 KHz / 780 KHz depending on the base clock.

This part I don't understand. Why we limiting base unit? I seems like a
deliberate regression.
Hans de Goede June 8, 2020, 11:07 a.m. UTC | #2
Hi,

On 6/8/20 5:50 AM, Andy Shevchenko wrote:
> On Sun, Jun 07, 2020 at 08:18:28PM +0200, Hans de Goede wrote:
>> When the user requests a high enough period ns value, then the
>> calculations in pwm_lpss_prepare() might result in a base_unit value of 0.
>>
>> But according to the data-sheet the way the PWM controller works is that
>> each input clock-cycle the base_unit gets added to a N bit counter and
>> that counter overflowing determines the PWM output frequency. Adding 0
>> to the counter is a no-op. The data-sheet even explicitly states that
>> writing 0 to the base_unit bits will result in the PWM outputting a
>> continuous 0 signal.
> 
> So, and why it's a problem?

Lets sya the user requests a PWM output frequency of 100Hz on Cherry Trail
which has a 19200000 Hz clock this will result in 100 * 65536 / 19200000 =
0.3 -> 0 as base-unit value. So instead of getting 100 Hz the user will
now get a pin which is always outputting low.

OTOH if we clamp to 1 as lowest value, the user will get 192000000 / 65536
= 292 Hz as output frequency which is as close to the requested value as
we can get while actually still working as a PWM controller.

>> base_unit values > (base_unit_range / 256), or iow base_unit values using
>> the 8 most significant bits, cause loss of resolution of the duty-cycle.
>> E.g. assuming a base_unit_range of 65536 steps, then a base_unit value of
>> 768 (256 * 3), limits the duty-cycle resolution to 65536 / 768 = 85 steps.
>> Clamp the max base_unit value to base_unit_range / 32 to ensure a
>> duty-cycle resolution of at least 32 steps. This limits the maximum
>> output frequency to 600 KHz / 780 KHz depending on the base clock.
> 
> This part I don't understand. Why we limiting base unit? I seems like a
> deliberate regression.

The way the PWM controller works is that the base-unit gets added to
say a 16 bit (on CHT) counter each input clock and then the highest 8
bits of that counter get compared to the value programmed into the
ON_TIME_DIV bits.

Lets say we do not clamp and allow any value and lets say the user
selects an output frequency of half the input clock, so base-unit
value is 32768, then the counter will only have 2 values:
0 and 32768 after that it will wrap around again. So any on time-div
value < 128 will result in the output being always high and any
value > 128 will result in the output being high/low 50% of the time
and a value of 255 will make the output always low.

So in essence we now only have 3 duty cycle levels, which seems like
a bad idea to me / not what a pwm controller is supposed to do.

So I decided to put a cut of at having at least 32 steps.

The mean reason I wrote this patch though is to avoid a base-unit
value of 0 which really results in a completely non working PWM
output. I personally believe clamping on the high side is a good
idea too. But if you are against that I can drop that part.

Note that the clamping on the high side will not affect the
primary user of the LPSS-pwm driver which is the i915 backlight
code, that never asks for such high frequencies.  But it could
help to avoid an user shooting themselves in the foot when using
the PWM on a dev board through the sysfs interface.

Regards,

Hans
Andy Shevchenko June 8, 2020, 12:51 p.m. UTC | #3
On Mon, Jun 08, 2020 at 01:07:12PM +0200, Hans de Goede wrote:
> On 6/8/20 5:50 AM, Andy Shevchenko wrote:
> > On Sun, Jun 07, 2020 at 08:18:28PM +0200, Hans de Goede wrote:
> > > When the user requests a high enough period ns value, then the
> > > calculations in pwm_lpss_prepare() might result in a base_unit value of 0.
> > > 
> > > But according to the data-sheet the way the PWM controller works is that
> > > each input clock-cycle the base_unit gets added to a N bit counter and
> > > that counter overflowing determines the PWM output frequency. Adding 0
> > > to the counter is a no-op. The data-sheet even explicitly states that
> > > writing 0 to the base_unit bits will result in the PWM outputting a
> > > continuous 0 signal.
> > 
> > So, and why it's a problem?
> 
> Lets sya the user requests a PWM output frequency of 100Hz on Cherry Trail
> which has a 19200000 Hz clock this will result in 100 * 65536 / 19200000 =
> 0.3 -> 0 as base-unit value. So instead of getting 100 Hz the user will
> now get a pin which is always outputting low.
> 
> OTOH if we clamp to 1 as lowest value, the user will get 192000000 / 65536
> = 292 Hz as output frequency which is as close to the requested value as
> we can get while actually still working as a PWM controller.

So, we should basically divide and round up, no?

At least for 0 we will get 0.

> > > base_unit values > (base_unit_range / 256), or iow base_unit values using
> > > the 8 most significant bits, cause loss of resolution of the duty-cycle.
> > > E.g. assuming a base_unit_range of 65536 steps, then a base_unit value of
> > > 768 (256 * 3), limits the duty-cycle resolution to 65536 / 768 = 85 steps.
> > > Clamp the max base_unit value to base_unit_range / 32 to ensure a
> > > duty-cycle resolution of at least 32 steps. This limits the maximum
> > > output frequency to 600 KHz / 780 KHz depending on the base clock.
> > 
> > This part I don't understand. Why we limiting base unit? I seems like a
> > deliberate regression.
> 
> The way the PWM controller works is that the base-unit gets added to
> say a 16 bit (on CHT) counter each input clock and then the highest 8
> bits of that counter get compared to the value programmed into the
> ON_TIME_DIV bits.
> 
> Lets say we do not clamp and allow any value and lets say the user
> selects an output frequency of half the input clock, so base-unit
> value is 32768, then the counter will only have 2 values:
> 0 and 32768 after that it will wrap around again. So any on time-div
> value < 128 will result in the output being always high and any
> value > 128 will result in the output being high/low 50% of the time
> and a value of 255 will make the output always low.
> 
> So in essence we now only have 3 duty cycle levels, which seems like
> a bad idea to me / not what a pwm controller is supposed to do.

It's exactly what is written in the documentation. I can't buy base unit clamp.
Though, I can buy, perhaps, on time divisor granularity, i.e.
  1/	0% - 25%-1 (0%)
  2/	25% - 50% - 75% (50%)
  3/	75%+1 - 100% (100%)
And so on till we got a maximum resolution (8 bits).
Hans de Goede June 8, 2020, 2:19 p.m. UTC | #4
Hi,

On 6/8/20 2:51 PM, Andy Shevchenko wrote:
> On Mon, Jun 08, 2020 at 01:07:12PM +0200, Hans de Goede wrote:
>> On 6/8/20 5:50 AM, Andy Shevchenko wrote:
>>> On Sun, Jun 07, 2020 at 08:18:28PM +0200, Hans de Goede wrote:
>>>> When the user requests a high enough period ns value, then the
>>>> calculations in pwm_lpss_prepare() might result in a base_unit value of 0.
>>>>
>>>> But according to the data-sheet the way the PWM controller works is that
>>>> each input clock-cycle the base_unit gets added to a N bit counter and
>>>> that counter overflowing determines the PWM output frequency. Adding 0
>>>> to the counter is a no-op. The data-sheet even explicitly states that
>>>> writing 0 to the base_unit bits will result in the PWM outputting a
>>>> continuous 0 signal.
>>>
>>> So, and why it's a problem?
>>
>> Lets sya the user requests a PWM output frequency of 100Hz on Cherry Trail
>> which has a 19200000 Hz clock this will result in 100 * 65536 / 19200000 =
>> 0.3 -> 0 as base-unit value. So instead of getting 100 Hz the user will
>> now get a pin which is always outputting low.
>>
>> OTOH if we clamp to 1 as lowest value, the user will get 192000000 / 65536
>> = 292 Hz as output frequency which is as close to the requested value as
>> we can get while actually still working as a PWM controller.
> 
> So, we should basically divide and round up, no?

Yes, that will work for the low limit of base_unit but it will make
all the other requested period values less accurate.

> At least for 0 we will get 0.

We're dealing with frequency here, but the API is dealing with period,
so to get 0 HZ the API user would have to request a period of > 1s e.g.
request 2s / 0.5 Hz but then the user is still not really requesting 0Hz
(that would correspond with a period of infinity which integers cannot
represent.

>>>> base_unit values > (base_unit_range / 256), or iow base_unit values using
>>>> the 8 most significant bits, cause loss of resolution of the duty-cycle.
>>>> E.g. assuming a base_unit_range of 65536 steps, then a base_unit value of
>>>> 768 (256 * 3), limits the duty-cycle resolution to 65536 / 768 = 85 steps.
>>>> Clamp the max base_unit value to base_unit_range / 32 to ensure a
>>>> duty-cycle resolution of at least 32 steps. This limits the maximum
>>>> output frequency to 600 KHz / 780 KHz depending on the base clock.
>>>
>>> This part I don't understand. Why we limiting base unit? I seems like a
>>> deliberate regression.
>>
>> The way the PWM controller works is that the base-unit gets added to
>> say a 16 bit (on CHT) counter each input clock and then the highest 8
>> bits of that counter get compared to the value programmed into the
>> ON_TIME_DIV bits.
>>
>> Lets say we do not clamp and allow any value and lets say the user
>> selects an output frequency of half the input clock, so base-unit
>> value is 32768, then the counter will only have 2 values:
>> 0 and 32768 after that it will wrap around again. So any on time-div
>> value < 128 will result in the output being always high and any
>> value > 128 will result in the output being high/low 50% of the time
>> and a value of 255 will make the output always low.
>>
>> So in essence we now only have 3 duty cycle levels, which seems like
>> a bad idea to me / not what a pwm controller is supposed to do.
> 
> It's exactly what is written in the documentation. I can't buy base unit clamp.
> Though, I can buy, perhaps, on time divisor granularity, i.e.
>    1/	0% - 25%-1 (0%)
>    2/	25% - 50% - 75% (50%)
>    3/	75%+1 - 100% (100%)
> And so on till we got a maximum resolution (8 bits).

Note that the PWM API does not expose the granularity to the API user,
which is why I went with just putting a minimum on it of 32 steps.

Anyways I don't have a strong opinion on this, so I'm fine with not clamping
the base-unit to preserve granularity. We should still clamp it to avoid
overflow if the user us requesting a really high frequency though!

The old code had:

	base_unit &= base_unit_range;

Which means that if the user requests a too high value, then we first
overflow base_unit and then truncate it to fit leading to a random
frequency.

So if we forget my minimal granularity argument, then at a minimum
we need to replace the above line with:

	base_unit = clamp_t(unsigned long long, base_unit, 1, base_unit_range - 1);

And since we need the clamp anyways we can then keep the current round-closest
behavior.

Regards,

Hans
Uwe Kleine-König June 11, 2020, 10:12 p.m. UTC | #5
On Mon, Jun 08, 2020 at 01:07:12PM +0200, Hans de Goede wrote:
> Hi,
> 
> On 6/8/20 5:50 AM, Andy Shevchenko wrote:
> > On Sun, Jun 07, 2020 at 08:18:28PM +0200, Hans de Goede wrote:
> > > When the user requests a high enough period ns value, then the
> > > calculations in pwm_lpss_prepare() might result in a base_unit value of 0.
> > > 
> > > But according to the data-sheet the way the PWM controller works is that
> > > each input clock-cycle the base_unit gets added to a N bit counter and
> > > that counter overflowing determines the PWM output frequency. Adding 0
> > > to the counter is a no-op. The data-sheet even explicitly states that
> > > writing 0 to the base_unit bits will result in the PWM outputting a
> > > continuous 0 signal.
> > 
> > So, and why it's a problem?
> 
> Lets sya the user requests a PWM output frequency of 100Hz on Cherry Trail
> which has a 19200000 Hz clock this will result in 100 * 65536 / 19200000 =
> 0.3 -> 0 as base-unit value. So instead of getting 100 Hz the user will
> now get a pin which is always outputting low.

I didn't follow the complete discussion but note that the general rule
is:

	round period down to the next possible implementable period
	round duty_cycle down to the next possible implementable duty_cycle

so if a small enough period (and so a small duty_cycle) is requested it
is expected that duty_cycle will be zero.

Best regards
Uwe
Andy Shevchenko June 12, 2020, 11:57 a.m. UTC | #6
On Fri, Jun 12, 2020 at 12:12:42AM +0200, Uwe Kleine-König wrote:
> On Mon, Jun 08, 2020 at 01:07:12PM +0200, Hans de Goede wrote:
> > On 6/8/20 5:50 AM, Andy Shevchenko wrote:
> > > On Sun, Jun 07, 2020 at 08:18:28PM +0200, Hans de Goede wrote:
> > > > When the user requests a high enough period ns value, then the
> > > > calculations in pwm_lpss_prepare() might result in a base_unit value of 0.
> > > > 
> > > > But according to the data-sheet the way the PWM controller works is that
> > > > each input clock-cycle the base_unit gets added to a N bit counter and
> > > > that counter overflowing determines the PWM output frequency. Adding 0
> > > > to the counter is a no-op. The data-sheet even explicitly states that
> > > > writing 0 to the base_unit bits will result in the PWM outputting a
> > > > continuous 0 signal.
> > > 
> > > So, and why it's a problem?
> > 
> > Lets sya the user requests a PWM output frequency of 100Hz on Cherry Trail
> > which has a 19200000 Hz clock this will result in 100 * 65536 / 19200000 =
> > 0.3 -> 0 as base-unit value. So instead of getting 100 Hz the user will
> > now get a pin which is always outputting low.
> 
> I didn't follow the complete discussion but note that the general rule
> is:
> 
> 	round period down to the next possible implementable period
> 	round duty_cycle down to the next possible implementable duty_cycle
> 
> so if a small enough period (and so a small duty_cycle) is requested it
> is expected that duty_cycle will be zero.

...which brings me an idea that PWM framework should expose API to get a
capabilities, like DMA Engine has.

In such capabilities, in particular, caller can get ranges of the correct
frequencies of the underneath hardware.
Uwe Kleine-König June 13, 2020, 8:50 p.m. UTC | #7
Hello Andy,

On Fri, Jun 12, 2020 at 02:57:32PM +0300, Andy Shevchenko wrote:
> On Fri, Jun 12, 2020 at 12:12:42AM +0200, Uwe Kleine-König wrote:
> > I didn't follow the complete discussion but note that the general rule
> > is:
> > 
> > 	round period down to the next possible implementable period
> > 	round duty_cycle down to the next possible implementable duty_cycle
> > 
> > so if a small enough period (and so a small duty_cycle) is requested it
> > is expected that duty_cycle will be zero.
> 
> ...which brings me an idea that PWM framework should expose API to get a
> capabilities, like DMA Engine has.
> 
> In such capabilities, in particular, caller can get ranges of the correct
> frequencies of the underneath hardware.

my idea is to introduce a function pwm_round_state() that has a similar
semantic to clk_round_rate(). But this is only one of several thoughts I
have for the pwm framework. And as there is (AFAIK) no user who would
benefit from pwm_round_state() this is further down on my todo list.

Best regards
Uwe
diff mbox series

Patch

diff --git a/drivers/pwm/pwm-lpss.c b/drivers/pwm/pwm-lpss.c
index 9d965ffe66d1..cae74ce61654 100644
--- a/drivers/pwm/pwm-lpss.c
+++ b/drivers/pwm/pwm-lpss.c
@@ -97,6 +97,14 @@  static void pwm_lpss_prepare(struct pwm_lpss_chip *lpwm, struct pwm_device *pwm,
 	freq *= base_unit_range;
 
 	base_unit = DIV_ROUND_CLOSEST_ULL(freq, c);
+	/*
+	 * base_unit must not be 0 and for values > (base_unit_range / 256)
+	 * (values using the 8 most significant bits) the duty-cycle resolution
+	 * degrades. Clamp the maximum value to base_unit_range / 32 which
+	 * leaves a duty-cycle resolution of 32 steps.
+	 */
+	base_unit = clamp_t(unsigned long long, base_unit, 1,
+			    base_unit_range / 32);
 
 	on_time_div = 255ULL * duty_ns;
 	do_div(on_time_div, period_ns);
@@ -105,7 +113,6 @@  static void pwm_lpss_prepare(struct pwm_lpss_chip *lpwm, struct pwm_device *pwm,
 	orig_ctrl = ctrl = pwm_lpss_read(pwm);
 	ctrl &= ~PWM_ON_TIME_DIV_MASK;
 	ctrl &= ~(base_unit_range << PWM_BASE_UNIT_SHIFT);
-	base_unit &= base_unit_range;
 	ctrl |= (u32) base_unit << PWM_BASE_UNIT_SHIFT;
 	ctrl |= on_time_div;