diff mbox series

pwm: bcm2835: Improve period and duty cycle calculation

Message ID 20201221165501.717101-1-u.kleine-koenig@pengutronix.de
State Changes Requested
Headers show
Series pwm: bcm2835: Improve period and duty cycle calculation | expand

Commit Message

Uwe Kleine-König Dec. 21, 2020, 4:55 p.m. UTC
With an input clk rate bigger than 2000000000, scaler would have been
zero which then would have resulted in a division by zero.

Also the originally implemented algorithm divided by the result of a
division. This nearly always looses precision. Consider a requested period
of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
was configured with an actual period of 983869.007 ns (PERIOD = 32258)
while the hardware can provide 1000003.508 ns (PERIOD = 32787).
And note if the input clock frequency was 32786886 Hz instead, the hardware
was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
setting results in 1000003.477 ns (PERIOD = 32787).

This patch implements proper range checking and only divides once for
the calculation of period (and similar for duty_cycle).

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
Hello,

during review of the bcm2835 driver I noticed this double division.

I think the practical relevance is low however because the clock rate is
fixed to 10 MHz on this platform which doesn't result in these
deviations. (Is this right, what is the actual rate?)

Having said that this patch is only compile tested.

Best regards
Uwe

 drivers/pwm/pwm-bcm2835.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

Comments

Florian Fainelli Dec. 21, 2020, 11:04 p.m. UTC | #1
On 12/21/2020 8:55 AM, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
> 
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
> 
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
> 
> during review of the bcm2835 driver I noticed this double division.
> 
> I think the practical relevance is low however because the clock rate is
> fixed to 10 MHz on this platform which doesn't result in these
> deviations. (Is this right, what is the actual rate?)

Currently this is correct but the PWM input clock can be configured from
the divider of a PLL that runs at 500MHz so this change is potentially
useful in that regard.
Uwe Kleine-König Dec. 22, 2020, 6:30 a.m. UTC | #2
Hello Florian,

On Mon, Dec 21, 2020 at 03:04:25PM -0800, Florian Fainelli wrote:
> On 12/21/2020 8:55 AM, Uwe Kleine-König wrote:
> > With an input clk rate bigger than 2000000000, scaler would have been
> > zero which then would have resulted in a division by zero.
> > 
> > Also the originally implemented algorithm divided by the result of a
> > division. This nearly always looses precision. Consider a requested period
> > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> > was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> > while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> > And note if the input clock frequency was 32786886 Hz instead, the hardware
> > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> > setting results in 1000003.477 ns (PERIOD = 32787).
> > 
> > This patch implements proper range checking and only divides once for
> > the calculation of period (and similar for duty_cycle).
> > 
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> > Hello,
> > 
> > during review of the bcm2835 driver I noticed this double division.
> > 
> > I think the practical relevance is low however because the clock rate is
> > fixed to 10 MHz on this platform which doesn't result in these
> > deviations. (Is this right, what is the actual rate?)
> 
> Currently this is correct but the PWM input clock can be configured from
> the divider of a PLL that runs at 500MHz so this change is potentially
> useful in that regard.

Thanks for your feedback; is this an Ack?

Best regards
Uwe
Lino Sanfilippo Dec. 22, 2020, 4:34 p.m. UTC | #3
Hi,

On 21.12.20 at 17:55, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
>
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
>
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
>
> during review of the bcm2835 driver I noticed this double division.
>
> I think the practical relevance is low however because the clock rate is
> fixed to 10 MHz on this platform which doesn't result in these
> deviations. (Is this right, what is the actual rate?)

clk_get_rate() on my Raspberry Pi 4 returns 9999999.

>
> Having said that this patch is only compile tested.

I tested this on the RP and everything still works so far, so FWIW
Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>


> +
> +	if (state->period > max_period)
> +		period = max_period;
> +	else
> +		period = state->period;
> +

We return EINVAL if the value for period is too small, so should we not do the
same for the case that the value is too high?

Regards,
Lino
Uwe Kleine-König Dec. 22, 2020, 10:11 p.m. UTC | #4
Hello Lino,

On Tue, Dec 22, 2020 at 05:34:00PM +0100, Lino Sanfilippo wrote:
> On 21.12.20 at 17:55, Uwe Kleine-König wrote:
> > With an input clk rate bigger than 2000000000, scaler would have been
> > zero which then would have resulted in a division by zero.
> >
> > Also the originally implemented algorithm divided by the result of a
> > division. This nearly always looses precision. Consider a requested period
> > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> > was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> > while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> > And note if the input clock frequency was 32786886 Hz instead, the hardware
> > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> > setting results in 1000003.477 ns (PERIOD = 32787).
> >
> > This patch implements proper range checking and only divides once for
> > the calculation of period (and similar for duty_cycle).
> >
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> > Hello,
> >
> > during review of the bcm2835 driver I noticed this double division.
> >
> > I think the practical relevance is low however because the clock rate is
> > fixed to 10 MHz on this platform which doesn't result in these
> > deviations. (Is this right, what is the actual rate?)
> 
> clk_get_rate() on my Raspberry Pi 4 returns 9999999.

oh, then my patch is a real improvement, but the problems are not as bad
as my examples above. So when a period of 350ns was requested, the old
algorithm provided 400.000040000004 ns and the improved yields
300.000030000003 ns.

> I tested this on the RP and everything still works so far, so FWIW
> Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>

Thanks.

> > +	if (state->period > max_period)
> > +		period = max_period;
> > +	else
> > +		period = state->period;
> > +
> 
> We return EINVAL if the value for period is too small, so should we not do the
> same for the case that the value is too high?

Ah, I didn't notice I changed the behaviour here, but indeed this is
what I think is the right behaviour. (i.e. configure the biggest
possible period not bigger than the requested period.)

But this change doesn't belong into this patch, so I will fixup and
resend.

Best regards
Uwe
diff mbox series

Patch

diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c
index 6ff5f04b3e07..f937db32fa54 100644
--- a/drivers/pwm/pwm-bcm2835.c
+++ b/drivers/pwm/pwm-bcm2835.c
@@ -64,8 +64,9 @@  static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 
 	struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
 	unsigned long rate = clk_get_rate(pc->clk);
-	unsigned long long period;
-	unsigned long scaler;
+	unsigned long long period_cycles;
+	u64 max_period, period, duty_cycle;
+
 	u32 val;
 
 	if (!rate) {
@@ -73,18 +74,34 @@  static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 		return -EINVAL;
 	}
 
-	scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate);
+	/*
+	 * period_cycles must be a 32 bit value, so period * rate / NSEC_PER_SEC
+	 * must be <= U32_MAX. As U32_MAX * NSEC_PER_SEC < U64_MAX the
+	 * multiplication period * rate doesn't overflow.
+	 */
+	max_period = DIV_ROUND_DOWN_ULL((u64)U32_MAX * NSEC_PER_SEC, rate);
+
+	if (state->period > max_period)
+		period = max_period;
+	else
+		period = state->period;
+
+	if (state->duty_cycle > period)
+		duty_cycle = period;
+	else
+		duty_cycle = state->duty_cycle;
+
 	/* set period */
-	period = DIV_ROUND_CLOSEST_ULL(state->period, scaler);
+	period_cycles = DIV_ROUND_CLOSEST_ULL(period * rate, NSEC_PER_SEC);
 
-	/* dont accept a period that is too small or has been truncated */
-	if ((period < PERIOD_MIN) || (period > U32_MAX))
+	/* don't accept a period that is too small */
+	if (period_cycles < PERIOD_MIN)
 		return -EINVAL;
 
-	writel(period, pc->base + PERIOD(pwm->hwpwm));
+	writel(period_cycles, pc->base + PERIOD(pwm->hwpwm));
 
 	/* set duty cycle */
-	val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler);
+	val = DIV_ROUND_CLOSEST_ULL(duty_cycle * rate, NSEC_PER_SEC);
 	writel(val, pc->base + DUTY(pwm->hwpwm));
 
 	/* set polarity */