Message ID | 20210114204804.143892-1-u.kleine-koenig@pengutronix.de |
---|---|
State | Accepted |
Headers | show |
Series | [v3] pwm: bcm2835: Improve period and duty cycle calculation | expand |
On Thu, Jan 14, 2021 at 09:48:04PM +0100, Uwe Kleine-König wrote: > With an input clk rate bigger than 2000000000, scaler would have been > zero which then would have resulted in a division by zero. > > Also the originally implemented algorithm divided by the result of a > division. This nearly always looses precision. Consider a requested period > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware > was configured with an actual period of 983869.007 ns (PERIOD = 32258) > while the hardware can provide 1000003.508 ns (PERIOD = 32787). > And note if the input clock frequency was 32786886 Hz instead, the hardware > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal > setting results in 1000003.477 ns (PERIOD = 32787). > > This patch implements proper range checking and only divides once for > the calculation of period (and similar for duty_cycle). > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > --- > Hello, > > changes since v2 (Message-Id: > 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de): > > - Add my calculation to the comment explaining the max_period formula > as discussed with Lino. > > Best regards > Uwe > > drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++-------- > 1 file changed, 27 insertions(+), 8 deletions(-) Florian, Lino, care to give a Reviewed-by and/or Tested-by on this? Thierry
Hi Thierry, Hi Uwe > On Thu, Jan 14, 2021 at 09:48:04PM +0100, Uwe Kleine-König wrote: > > With an input clk rate bigger than 2000000000, scaler would have been > > zero which then would have resulted in a division by zero. > > > > Also the originally implemented algorithm divided by the result of a > > division. This nearly always looses precision. Consider a requested period > > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware > > was configured with an actual period of 983869.007 ns (PERIOD = 32258) > > while the hardware can provide 1000003.508 ns (PERIOD = 32787). > > And note if the input clock frequency was 32786886 Hz instead, the hardware > > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal > > setting results in 1000003.477 ns (PERIOD = 32787). > > > > This patch implements proper range checking and only divides once for > > the calculation of period (and similar for duty_cycle). > > > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > > --- > > Hello, > > > > changes since v2 (Message-Id: > > 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de): > > > > - Add my calculation to the comment explaining the max_period formula > > as discussed with Lino. > > > > Best regards > > Uwe > > > > drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++-------- > > 1 file changed, 27 insertions(+), 8 deletions(-) > > Florian, Lino, > > care to give a Reviewed-by and/or Tested-by on this? > > Thierry this looks good to me, IMHO also the calculation now is comprehensible due to the comment, so FWIW: Reviewed-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Unfortunately I do not have the possibility to test this right now (I am currently working on another project) but I will try to get the needed hardware and then do another test with this patch. Best Regards, Lino
Hi, On 14.01.21 at 21:48, Uwe Kleine-König wrote: > With an input clk rate bigger than 2000000000, scaler would have been > zero which then would have resulted in a division by zero. > > Also the originally implemented algorithm divided by the result of a > division. This nearly always looses precision. Consider a requested period > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware > was configured with an actual period of 983869.007 ns (PERIOD = 32258) > while the hardware can provide 1000003.508 ns (PERIOD = 32787). > And note if the input clock frequency was 32786886 Hz instead, the hardware > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal > setting results in 1000003.477 ns (PERIOD = 32787). > > This patch implements proper range checking and only divides once for > the calculation of period (and similar for duty_cycle). > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > --- > Hello, > > changes since v2 (Message-Id: > 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de): > > - Add my calculation to the comment explaining the max_period formula > as discussed with Lino. > > Best regards > Uwe > > drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++-------- > 1 file changed, 27 insertions(+), 8 deletions(-) > > diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c > index 6ff5f04b3e07..d593cce249d9 100644 > --- a/drivers/pwm/pwm-bcm2835.c > +++ b/drivers/pwm/pwm-bcm2835.c > @@ -64,8 +64,9 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, > > struct bcm2835_pwm *pc = to_bcm2835_pwm(chip); > unsigned long rate = clk_get_rate(pc->clk); > - unsigned long long period; > - unsigned long scaler; > + unsigned long long period_cycles; > + u64 max_period; > + > u32 val; > > if (!rate) { > @@ -73,18 +74,36 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, > return -EINVAL; > } > > - scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate); > + /* > + * period_cycles must be a 32 bit value, so period * rate / NSEC_PER_SEC > + * must be <= U32_MAX. As U32_MAX * NSEC_PER_SEC < U64_MAX the > + * multiplication period * rate doesn't overflow. > + * To calculate the maximal possible period that guarantees the > + * above inequality: > + * > + * round(period * rate / NSEC_PER_SEC) <= U32_MAX > + * <=> period * rate / NSEC_PER_SEC < U32_MAX + 0.5 > + * <=> period * rate < (U32_MAX + 0.5) * NSEC_PER_SEC > + * <=> period < ((U32_MAX + 0.5) * NSEC_PER_SEC) / rate > + * <=> period < ((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate > + * <=> period <= ceil((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate) - 1 > + */ > + max_period = DIV_ROUND_UP_ULL((u64)U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC / 2, rate) - 1; > + > + if (state->period > max_period) > + return -EINVAL; > + > /* set period */ > - period = DIV_ROUND_CLOSEST_ULL(state->period, scaler); > + period_cycles = DIV_ROUND_CLOSEST_ULL(state->period * rate, NSEC_PER_SEC); > > - /* dont accept a period that is too small or has been truncated */ > - if ((period < PERIOD_MIN) || (period > U32_MAX)) > + /* don't accept a period that is too small */ > + if (period_cycles < PERIOD_MIN) > return -EINVAL; > > - writel(period, pc->base + PERIOD(pwm->hwpwm)); > + writel(period_cycles, pc->base + PERIOD(pwm->hwpwm)); > > /* set duty cycle */ > - val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler); > + val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle * rate, NSEC_PER_SEC); > writel(val, pc->base + DUTY(pwm->hwpwm)); > > /* set polarity */ > I was able to test this now and everything looks good. Tested-by: Lino Sanfilippo <LinoSanfilippo@gmx.de> Best Regards, Lino
On Thu, Jan 14, 2021 at 09:48:04PM +0100, Uwe Kleine-König wrote: > With an input clk rate bigger than 2000000000, scaler would have been > zero which then would have resulted in a division by zero. > > Also the originally implemented algorithm divided by the result of a > division. This nearly always looses precision. Consider a requested period > of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware > was configured with an actual period of 983869.007 ns (PERIOD = 32258) > while the hardware can provide 1000003.508 ns (PERIOD = 32787). > And note if the input clock frequency was 32786886 Hz instead, the hardware > was configured to 1016656.477 ns (PERIOD = 33333) while the optimal > setting results in 1000003.477 ns (PERIOD = 32787). > > This patch implements proper range checking and only divides once for > the calculation of period (and similar for duty_cycle). > > Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> > --- > Hello, > > changes since v2 (Message-Id: > 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de): > > - Add my calculation to the comment explaining the max_period formula > as discussed with Lino. > > Best regards > Uwe > > drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++-------- > 1 file changed, 27 insertions(+), 8 deletions(-) Applied, thanks. Thierry
diff --git a/drivers/pwm/pwm-bcm2835.c b/drivers/pwm/pwm-bcm2835.c index 6ff5f04b3e07..d593cce249d9 100644 --- a/drivers/pwm/pwm-bcm2835.c +++ b/drivers/pwm/pwm-bcm2835.c @@ -64,8 +64,9 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, struct bcm2835_pwm *pc = to_bcm2835_pwm(chip); unsigned long rate = clk_get_rate(pc->clk); - unsigned long long period; - unsigned long scaler; + unsigned long long period_cycles; + u64 max_period; + u32 val; if (!rate) { @@ -73,18 +74,36 @@ static int bcm2835_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, return -EINVAL; } - scaler = DIV_ROUND_CLOSEST(NSEC_PER_SEC, rate); + /* + * period_cycles must be a 32 bit value, so period * rate / NSEC_PER_SEC + * must be <= U32_MAX. As U32_MAX * NSEC_PER_SEC < U64_MAX the + * multiplication period * rate doesn't overflow. + * To calculate the maximal possible period that guarantees the + * above inequality: + * + * round(period * rate / NSEC_PER_SEC) <= U32_MAX + * <=> period * rate / NSEC_PER_SEC < U32_MAX + 0.5 + * <=> period * rate < (U32_MAX + 0.5) * NSEC_PER_SEC + * <=> period < ((U32_MAX + 0.5) * NSEC_PER_SEC) / rate + * <=> period < ((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate + * <=> period <= ceil((U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC/2) / rate) - 1 + */ + max_period = DIV_ROUND_UP_ULL((u64)U32_MAX * NSEC_PER_SEC + NSEC_PER_SEC / 2, rate) - 1; + + if (state->period > max_period) + return -EINVAL; + /* set period */ - period = DIV_ROUND_CLOSEST_ULL(state->period, scaler); + period_cycles = DIV_ROUND_CLOSEST_ULL(state->period * rate, NSEC_PER_SEC); - /* dont accept a period that is too small or has been truncated */ - if ((period < PERIOD_MIN) || (period > U32_MAX)) + /* don't accept a period that is too small */ + if (period_cycles < PERIOD_MIN) return -EINVAL; - writel(period, pc->base + PERIOD(pwm->hwpwm)); + writel(period_cycles, pc->base + PERIOD(pwm->hwpwm)); /* set duty cycle */ - val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle, scaler); + val = DIV_ROUND_CLOSEST_ULL(state->duty_cycle * rate, NSEC_PER_SEC); writel(val, pc->base + DUTY(pwm->hwpwm)); /* set polarity */
With an input clk rate bigger than 2000000000, scaler would have been zero which then would have resulted in a division by zero. Also the originally implemented algorithm divided by the result of a division. This nearly always looses precision. Consider a requested period of 1000000 ns. With an input clock frequency of 32786885 Hz the hardware was configured with an actual period of 983869.007 ns (PERIOD = 32258) while the hardware can provide 1000003.508 ns (PERIOD = 32787). And note if the input clock frequency was 32786886 Hz instead, the hardware was configured to 1016656.477 ns (PERIOD = 33333) while the optimal setting results in 1000003.477 ns (PERIOD = 32787). This patch implements proper range checking and only divides once for the calculation of period (and similar for duty_cycle). Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> --- Hello, changes since v2 (Message-Id: 20201222221319.2101107-1-u.kleine-koenig@pengutronix.de): - Add my calculation to the comment explaining the max_period formula as discussed with Lino. Best regards Uwe drivers/pwm/pwm-bcm2835.c | 35 +++++++++++++++++++++++++++-------- 1 file changed, 27 insertions(+), 8 deletions(-)