diff mbox series

[v3] pwm: bcm2835: Improve period and duty cycle calculation

Message ID 20210114204804.143892-1-u.kleine-koenig@pengutronix.de
State Accepted
Headers show
Series [v3] pwm: bcm2835: Improve period and duty cycle calculation | expand

Commit Message

Uwe Kleine-König Jan. 14, 2021, 8:48 p.m. UTC
With an input clk rate bigger than 2000000000, scaler would have been
zero which then would have resulted in a division by zero.

Also the originally implemented algorithm divided by the result of a
division. This nearly always looses precision. Consider a requested period
of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
was configured with an actual period of 983869.007 ns (PERIOD = 32258)
while the hardware can provide 1000003.508 ns (PERIOD = 32787).
And note if the input clock frequency was 32786886 Hz instead, the hardware
was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
setting results in 1000003.477 ns (PERIOD = 32787).

This patch implements proper range checking and only divides once for
the calculation of period (and similar for duty_cycle).

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
---
Hello,

changes since v2 (Message-Id:
20201222221319.2101107-1-u.kleine-koenig@pengutronix.de):

 - Add my calculation to the comment explaining the max_period formula
   as discussed with Lino.

Best regards
Uwe

 drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++--------
 1 file changed, 27 insertions(+), 8 deletions(-)

Comments

Thierry Reding Feb. 26, 2021, 11:54 a.m. UTC | #1
On Thu, Jan 14, 2021 at 09:48:04PM +0100, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
> 
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
> 
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
> 
> changes since v2 (Message-Id:
> 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de):
> 
>  - Add my calculation to the comment explaining the max_period formula
>    as discussed with Lino.
> 
> Best regards
> Uwe
> 
>  drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++--------
>  1 file changed, 27 insertions(+), 8 deletions(-)

Florian, Lino,

care to give a Reviewed-by and/or Tested-by on this?

Thierry
Lino Sanfilippo March 1, 2021, 10:13 a.m. UTC | #2
Hi Thierry, Hi Uwe

> On Thu, Jan 14, 2021 at 09:48:04PM +0100, Uwe Kleine-König wrote:
> > With an input clk rate bigger than 2000000000, scaler would have been
> > zero which then would have resulted in a division by zero.
> > 
> > Also the originally implemented algorithm divided by the result of a
> > division. This nearly always looses precision. Consider a requested period
> > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> > was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> > while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> > And note if the input clock frequency was 32786886 Hz instead, the hardware
> > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> > setting results in 1000003.477 ns (PERIOD = 32787).
> > 
> > This patch implements proper range checking and only divides once for
> > the calculation of period (and similar for duty_cycle).
> > 
> > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > ---
> > Hello,
> > 
> > changes since v2 (Message-Id:
> > 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de):
> > 
> >  - Add my calculation to the comment explaining the max_period formula
> >    as discussed with Lino.
> > 
> > Best regards
> > Uwe
> > 
> >  drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++--------
> >  1 file changed, 27 insertions(+), 8 deletions(-)
> 
> Florian, Lino,
> 
> care to give a Reviewed-by and/or Tested-by on this?
> 
> Thierry

this looks good to me, IMHO also the calculation now is comprehensible due to the comment, so FWIW:

Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>

Unfortunately I do not have the possibility to test this right now (I am currently working on another project)
but I will try to get the needed hardware and then do another test with this patch.

Best Regards,
Lino
Lino Sanfilippo March 6, 2021, 6:28 p.m. UTC | #3
Hi,

On 14.01.21 at 21:48, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
>
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
>
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
>
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
>
> changes since v2 (Message-Id:
> 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de):
>
>  - Add my calculation to the comment explaining the max_period formula
>    as discussed with Lino.
>
> Best regards
> Uwe
>
>  drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++--------
>  1 file changed, 27 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c
> index 6ff5f04b3e07..d593cce249d9 100644
> --- a/drivers/pwm/pwm-bcm2835.c
> +++ b/drivers/pwm/pwm-bcm2835.c
> @@ -64,8 +64,9 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>
>  	struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
>  	unsigned long rate = clk_get_rate(pc->clk);
> -	unsigned long long period;
> -	unsigned long scaler;
> +	unsigned long long period_cycles;
> +	u64 max_period;
> +
>  	u32 val;
>
>  	if (!rate) {
> @@ -73,18 +74,36 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
>  		return -EINVAL;
>  	}
>
> -	scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate);
> +	/*
> +	 * period_cycles must be a 32 bit value, so period * rate / NSEC_PER_SEC
> +	 * must be <= U32_MAX. As U32_MAX * NSEC_PER_SEC < U64_MAX the
> +	 * multiplication period * rate doesn't overflow.
> +	 * To calculate the maximal possible period that guarantees the
> +	 * above inequality:
> +	 *
> +	 *     round(period * rate / NSEC_PER_SEC) <= U32_MAX
> +	 * <=> period * rate / NSEC_PER_SEC < U32_MAX + 0.5
> +	 * <=> period * rate < (U32_MAX + 0.5) * NSEC_PER_SEC
> +	 * <=> period < ((U32_MAX + 0.5) * NSEC_PER_SEC) / rate
> +	 * <=> period < ((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate
> +	 * <=> period <= ceil((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate) - 1
> +	 */
> +	max_period = DIV_ROUND_UP_ULL((u64)U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC / 2, rate) - 1;
> +
> +	if (state->period > max_period)
> +		return -EINVAL;
> +
>  	/* set period */
> -	period = DIV_ROUND_CLOSEST_ULL(state->period, scaler);
> +	period_cycles = DIV_ROUND_CLOSEST_ULL(state->period * rate, NSEC_PER_SEC);
>
> -	/* dont accept a period that is too small or has been truncated */
> -	if ((period < PERIOD_MIN) || (period > U32_MAX))
> +	/* don't accept a period that is too small */
> +	if (period_cycles < PERIOD_MIN)
>  		return -EINVAL;
>
> -	writel(period, pc->base + PERIOD(pwm->hwpwm));
> +	writel(period_cycles, pc->base + PERIOD(pwm->hwpwm));
>
>  	/* set duty cycle */
> -	val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler);
> +	val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle * rate, NSEC_PER_SEC);
>  	writel(val, pc->base + DUTY(pwm->hwpwm));
>
>  	/* set polarity */
>

I was able to test this now and everything looks good.

Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>

Best Regards,
Lino
Thierry Reding March 22, 2021, 10:48 a.m. UTC | #4
On Thu, Jan 14, 2021 at 09:48:04PM +0100, Uwe Kleine-König wrote:
> With an input clk rate bigger than 2000000000, scaler would have been
> zero which then would have resulted in a division by zero.
> 
> Also the originally implemented algorithm divided by the result of a
> division. This nearly always looses precision. Consider a requested period
> of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware
> was configured with an actual period of 983869.007 ns (PERIOD = 32258)
> while the hardware can provide 1000003.508 ns (PERIOD = 32787).
> And note if the input clock frequency was 32786886 Hz instead, the hardware
> was configured to 1016656.477 ns (PERIOD = 33333) while the optimal
> setting results in 1000003.477 ns (PERIOD = 32787).
> 
> This patch implements proper range checking and only divides once for
> the calculation of period (and similar for duty_cycle).
> 
> Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> ---
> Hello,
> 
> changes since v2 (Message-Id:
> 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de):
> 
>  - Add my calculation to the comment explaining the max_period formula
>    as discussed with Lino.
> 
> Best regards
> Uwe
> 
>  drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++--------
>  1 file changed, 27 insertions(+), 8 deletions(-)

Applied, thanks.

Thierry
diff mbox series

Patch

diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c
index 6ff5f04b3e07..d593cce249d9 100644
--- a/drivers/pwm/pwm-bcm2835.c
+++ b/drivers/pwm/pwm-bcm2835.c
@@ -64,8 +64,9 @@  static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 
 	struct bcm2835_pwm *pc = to_bcm2835_pwm(chip);
 	unsigned long rate = clk_get_rate(pc->clk);
-	unsigned long long period;
-	unsigned long scaler;
+	unsigned long long period_cycles;
+	u64 max_period;
+
 	u32 val;
 
 	if (!rate) {
@@ -73,18 +74,36 @@  static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm,
 		return -EINVAL;
 	}
 
-	scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate);
+	/*
+	 * period_cycles must be a 32 bit value, so period * rate / NSEC_PER_SEC
+	 * must be <= U32_MAX. As U32_MAX * NSEC_PER_SEC < U64_MAX the
+	 * multiplication period * rate doesn't overflow.
+	 * To calculate the maximal possible period that guarantees the
+	 * above inequality:
+	 *
+	 *     round(period * rate / NSEC_PER_SEC) <= U32_MAX
+	 * <=> period * rate / NSEC_PER_SEC < U32_MAX + 0.5
+	 * <=> period * rate < (U32_MAX + 0.5) * NSEC_PER_SEC
+	 * <=> period < ((U32_MAX + 0.5) * NSEC_PER_SEC) / rate
+	 * <=> period < ((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate
+	 * <=> period <= ceil((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate) - 1
+	 */
+	max_period = DIV_ROUND_UP_ULL((u64)U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC / 2, rate) - 1;
+
+	if (state->period > max_period)
+		return -EINVAL;
+
 	/* set period */
-	period = DIV_ROUND_CLOSEST_ULL(state->period, scaler);
+	period_cycles = DIV_ROUND_CLOSEST_ULL(state->period * rate, NSEC_PER_SEC);
 
-	/* dont accept a period that is too small or has been truncated */
-	if ((period < PERIOD_MIN) || (period > U32_MAX))
+	/* don't accept a period that is too small */
+	if (period_cycles < PERIOD_MIN)
 		return -EINVAL;
 
-	writel(period, pc->base + PERIOD(pwm->hwpwm));
+	writel(period_cycles, pc->base + PERIOD(pwm->hwpwm));
 
 	/* set duty cycle */
-	val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler);
+	val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle * rate, NSEC_PER_SEC);
 	writel(val, pc->base + DUTY(pwm->hwpwm));
 
 	/* set polarity */