Message ID | 20200416085627.1882-1-clay@daemons.net |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | net: cpts: Condition WARN_ON on PTP_1588_CLOCK | expand |
On 16/04/2020 11:56, Clay McClure wrote: > CPTS_MOD merely implies PTP_1588_CLOCK; it is possible to build cpts > without PTP clock support. In that case, ptp_clock_register() returns > NULL and we should not WARN_ON(cpts->clock) when downing the interface. > The ptp_*() functions are stubbed without PTP_1588_CLOCK, so it's safe > to pass them a null pointer. Could you explain the purpose of the exercise (Enabling CPTS with PTP_1588_CLOCK disabled), pls? > > Signed-off-by: Clay McClure <clay@daemons.net> > --- > drivers/net/ethernet/ti/cpts.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c > index fd214f8730a9..daf4505f4a70 100644 > --- a/drivers/net/ethernet/ti/cpts.c > +++ b/drivers/net/ethernet/ti/cpts.c > @@ -646,7 +646,7 @@ EXPORT_SYMBOL_GPL(cpts_register); > > void cpts_unregister(struct cpts *cpts) > { > - if (WARN_ON(!cpts->clock)) > + if (IS_REACHABLE(PTP_1588_CLOCK) && WARN_ON(!cpts->clock)) > return; > > ptp_clock_unregister(cpts->clock); >
On Thu, Apr 16, 2020 at 02:11:45PM +0300, Grygorii Strashko wrote: Grygorii, > > CPTS_MOD merely implies PTP_1588_CLOCK; it is possible to build cpts > > without PTP clock support. In that case, ptp_clock_register() returns > > NULL and we should not WARN_ON(cpts->clock) when downing the interface. > > The ptp_*() functions are stubbed without PTP_1588_CLOCK, so it's safe > > to pass them a null pointer. > > Could you explain the purpose of the exercise (Enabling CPTS with > PTP_1588_CLOCK disabled), pls? Hardware timestamping with a free-running PHC _almost_ works without PTP_1588_CLOCK, but since PHC rollover is handled by the PTP kworker in this driver the timestamps end up not being monotonic. And of course the moment you want to syntonize/synchronize the PHC with another clock (say, CLOCK_REALTIME), you'll need a PTP clock device. So you're right, there's not much point in building CPTS_MOD without PTP_1588_CLOCK. Given that, I wonder why all the Ethernet drivers seem to just `imply` PTP_1588_CLOCK, rather than `depends on` it? In any case, I was surprised to get a warning during `ifdown` but not during `ifup`. What do you think of this change, which prints an error like this during `ifup` if PTP_1588_CLOCK is not enabled: [ 6.192707] 000: cpsw 4a100000.ethernet: error registering cpts device --- diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c index 10ad706dda53..70b15039cd37 100644 --- a/drivers/net/ethernet/ti/cpts.c +++ b/drivers/net/ethernet/ti/cpts.c @@ -462,8 +462,8 @@ int cpts_register(struct cpts *cpts) timecounter_init(&cpts->tc, &cpts->cc, ktime_get_real_ns()); cpts->clock = ptp_clock_register(&cpts->info, cpts->dev); - if (IS_ERR(cpts->clock)) { - err = PTR_ERR(cpts->clock); + if (IS_ERR_OR_NULL(cpts->clock)) { + err = cpts->clock ? PTR_ERR(cpts->clock) : -EOPNOTSUPP; cpts->clock = NULL; goto err_ptp; } -- Clay
On Mon, Apr 20, 2020 at 11:38 AM Clay McClure <clay@daemons.net> wrote: > On Thu, Apr 16, 2020 at 02:11:45PM +0300, Grygorii Strashko wrote: > > > > CPTS_MOD merely implies PTP_1588_CLOCK; it is possible to build cpts > > > without PTP clock support. In that case, ptp_clock_register() returns > > > NULL and we should not WARN_ON(cpts->clock) when downing the interface. > > > The ptp_*() functions are stubbed without PTP_1588_CLOCK, so it's safe > > > to pass them a null pointer. > > > > Could you explain the purpose of the exercise (Enabling CPTS with > > PTP_1588_CLOCK disabled), pls? > > Hardware timestamping with a free-running PHC _almost_ works without > PTP_1588_CLOCK, but since PHC rollover is handled by the PTP kworker > in this driver the timestamps end up not being monotonic. > > And of course the moment you want to syntonize/synchronize the PHC with > another clock (say, CLOCK_REALTIME), you'll need a PTP clock device. So > you're right, there's not much point in building CPTS_MOD without > PTP_1588_CLOCK. > > Given that, I wonder why all the Ethernet drivers seem to just `imply` > PTP_1588_CLOCK, rather than `depends on` it? I suspect we should move all of them back. This was an early user of 'imply', but the meaning of that keyword has now changed in the latest Kconfig. > diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c > index 10ad706dda53..70b15039cd37 100644 > --- a/drivers/net/ethernet/ti/cpts.c > +++ b/drivers/net/ethernet/ti/cpts.c > @@ -462,8 +462,8 @@ int cpts_register(struct cpts *cpts) > timecounter_init(&cpts->tc, &cpts->cc, ktime_get_real_ns()); > > cpts->clock = ptp_clock_register(&cpts->info, cpts->dev); > - if (IS_ERR(cpts->clock)) { > - err = PTR_ERR(cpts->clock); > + if (IS_ERR_OR_NULL(cpts->clock)) { > + err = cpts->clock ? PTR_ERR(cpts->clock) : -EOPNOTSUPP; > cpts->clock = NULL; > goto err_ptp; Something else is wrong if you need IS_ERR_OR_NULL(). Any kernel interface should either return an negative error code when something goes wrong, or should return NULL for all errors, but not mix the two. Arnd
On Mon, Apr 20, 2020 at 04:38:32PM +0200, Arnd Bergmann wrote: > > I suspect we should move all of them back. This was an early user > of 'imply', but the meaning of that keyword has now changed > in the latest Kconfig. Can you please explain the justification for changing the meaning? It was a big PITA for me to support this in the first place, and now we are back to square one? > Something else is wrong if you need IS_ERR_OR_NULL(). Any > kernel interface should either return an negative error code when > something goes wrong, or should return NULL for all errors, but > not mix the two. On the contrary, this is exactly what the whole "imply" thing demanded. d1cbfd771ce82 (Nicolas Pitre 2016-11-11 172) #if IS_REACHABLE(CONFIG_PTP_1588_CLOCK) d1cbfd771ce82 (Nicolas Pitre 2016-11-11 173) d1cbfd771ce82 (Nicolas Pitre 2016-11-11 174) /** d1cbfd771ce82 (Nicolas Pitre 2016-11-11 175) * ptp_clock_register() - register a PTP hardware clock driver d1cbfd771ce82 (Nicolas Pitre 2016-11-11 176) * d1cbfd771ce82 (Nicolas Pitre 2016-11-11 177) * @info: Structure describing the new clock. d1cbfd771ce82 (Nicolas Pitre 2016-11-11 178) * @parent: Pointer to the parent device of the new clock. d1cbfd771ce82 (Nicolas Pitre 2016-11-11 179) * d1cbfd771ce82 (Nicolas Pitre 2016-11-11 180) * Returns a valid pointer on success or PTR_ERR on failure. If PHC d1cbfd771ce82 (Nicolas Pitre 2016-11-11 181) * support is missing at the configuration level, this function d1cbfd771ce82 (Nicolas Pitre 2016-11-11 182) * returns NULL, and drivers are expected to gracefully handle that d1cbfd771ce82 (Nicolas Pitre 2016-11-11 183) * case separately. d1cbfd771ce82 (Nicolas Pitre 2016-11-11 184) */ d1cbfd771ce82 (Nicolas Pitre 2016-11-11 185) d1cbfd771ce82 (Nicolas Pitre 2016-11-11 186) extern struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info, d1cbfd771ce82 (Nicolas Pitre 2016-11-11 187) struct device *parent); Thanks, Richard
On Mon, Apr 20, 2020 at 7:00 PM Richard Cochran <richardcochran@gmail.com> wrote: > > On Mon, Apr 20, 2020 at 04:38:32PM +0200, Arnd Bergmann wrote: > > > > I suspect we should move all of them back. This was an early user > > of 'imply', but the meaning of that keyword has now changed > > in the latest Kconfig. > > Can you please explain the justification for changing the meaning? > > It was a big PITA for me to support this in the first place, and now > we are back to square one? I don't understand it either. Apparently it didn't always do what users expected, though the new definition seems less useful to me, as it only changes the default. > > Something else is wrong if you need IS_ERR_OR_NULL(). Any > > kernel interface should either return an negative error code when > > something goes wrong, or should return NULL for all errors, but > > not mix the two. > > On the contrary, this is exactly what the whole "imply" thing > demanded. > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 172) #if IS_REACHABLE(CONFIG_PTP_1588_CLOCK) > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 173) > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 174) /** > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 175) * ptp_clock_register() - register a PTP hardware clock driver > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 176) * > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 177) * @info: Structure describing the new clock. > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 178) * @parent: Pointer to the parent device of the new clock. > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 179) * > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 180) * Returns a valid pointer on success or PTR_ERR on failure. If PHC > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 181) * support is missing at the configuration level, this function > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 182) * returns NULL, and drivers are expected to gracefully handle that > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 183) * case separately. > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 184) */ The key here is "gracefully". The second patch from Clay just turns NULL into -EOPNOTSUPP and treats the compile-time condition into a runtime error. I don't see a point in allowing the driver to compile if it just always returns an error. The two callers then go on to print a message for any error and just keep going. Arnd
On Mon, Apr 20, 2020 at 08:57:05PM +0200, Arnd Bergmann wrote: > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 172) #if IS_REACHABLE(CONFIG_PTP_1588_CLOCK) > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 173) > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 174) /** > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 175) * ptp_clock_register() - register a PTP hardware clock driver > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 176) * > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 177) * @info: Structure describing the new clock. > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 178) * @parent: Pointer to the parent device of the new clock. > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 179) * > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 180) * Returns a valid pointer on success or PTR_ERR on failure. If PHC > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 181) * support is missing at the configuration level, this function > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 182) * returns NULL, and drivers are expected to gracefully handle that > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 183) * case separately. > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 184) */ > > The key here is "gracefully". The second patch from Clay just turns NULL into > -EOPNOTSUPP and treats the compile-time condition into a runtime error. You are talking about the cpts driver, no? I'm worried about ptp_clock_register(), because it does return NULL if IS_REACHABLE(CONFIG_PTP_1588_CLOCK), and this is the "correct" behavior ever since November 2016. If somebody wants to change that stub to return EOPNOTSUPP, then fine, but please have them audit the callers and submit a patch series. Thanks, Richard
On Mon, Apr 20, 2020 at 11:18 PM Richard Cochran <richardcochran@gmail.com> wrote: > On Mon, Apr 20, 2020 at 08:57:05PM +0200, Arnd Bergmann wrote: > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 172) #if IS_REACHABLE(CONFIG_PTP_1588_CLOCK) > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 173) > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 174) /** > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 175) * ptp_clock_register() - register a PTP hardware clock driver > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 176) * > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 177) * @info: Structure describing the new clock. > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 178) * @parent: Pointer to the parent device of the new clock. > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 179) * > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 180) * Returns a valid pointer on success or PTR_ERR on failure. If PHC > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 181) * support is missing at the configuration level, this function > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 182) * returns NULL, and drivers are expected to gracefully handle that > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 183) * case separately. > > > d1cbfd771ce82 (Nicolas Pitre 2016-11-11 184) */ > > > > The key here is "gracefully". The second patch from Clay just turns NULL into > > -EOPNOTSUPP and treats the compile-time condition into a runtime error. > > You are talking about the cpts driver, no? > > I'm worried about ptp_clock_register(), because it does return NULL if > IS_REACHABLE(CONFIG_PTP_1588_CLOCK), and this is the "correct" > behavior ever since November 2016. > > If somebody wants to change that stub to return EOPNOTSUPP, then fine, > but please have them audit the callers and submit a patch series. It's not great, but we have other interfaces like this that can return NULL for success when the subsystem is disabled. The problem is when there is a mismatch between the caller treating NULL as failure when it is meant to be "successful lack of object returned". Arnd
On Mon, Apr 20, 2020 at 11:21:20PM +0200, Arnd Bergmann wrote: > It's not great, but we have other interfaces like this that can return NULL for > success when the subsystem is disabled. The problem is when there is > a mismatch between the caller treating NULL as failure when it is meant to > be "successful lack of object returned". Yeah, that should be fixed. To be clear, do you all see a need to change the stubbed version of ptp_clock_register() or not? Thanks, Richard
On Mon, Apr 20, 2020 at 11:34 PM Richard Cochran <richardcochran@gmail.com> wrote: > > On Mon, Apr 20, 2020 at 11:21:20PM +0200, Arnd Bergmann wrote: > > It's not great, but we have other interfaces like this that can return NULL for > > success when the subsystem is disabled. The problem is when there is > > a mismatch between the caller treating NULL as failure when it is meant to > > be "successful lack of object returned". > > Yeah, that should be fixed. > > To be clear, do you all see a need to change the stubbed version of > ptp_clock_register() or not? No, if the NULL return is only meant to mean "nothing wrong, keep going wihtout an object", that's fine with me. It does occasionally confuse driver writers (as seen here), so it's not a great interface, but there is no general solution to make it better. Arnd
Hi On 21/04/2020 00:42, Arnd Bergmann wrote: > On Mon, Apr 20, 2020 at 11:34 PM Richard Cochran > <richardcochran@gmail.com> wrote: >> >> On Mon, Apr 20, 2020 at 11:21:20PM +0200, Arnd Bergmann wrote: >>> It's not great, but we have other interfaces like this that can return NULL for >>> success when the subsystem is disabled. The problem is when there is >>> a mismatch between the caller treating NULL as failure when it is meant to >>> be "successful lack of object returned". >> >> Yeah, that should be fixed. >> >> To be clear, do you all see a need to change the stubbed version of >> ptp_clock_register() or not? > > No, if the NULL return is only meant to mean "nothing wrong, keep going > wihtout an object", that's fine with me. It does occasionally confuse driver > writers (as seen here), so it's not a great interface, but there is no general > solution to make it better. As per commit commit d1cbfd771ce8297fa11e89f315392de6056a2181 Author: Nicolas Pitre <nicolas.pitre@linaro.org> Date: Fri Nov 11 00:10:07 2016 -0500 ptp_clock: Allow for it to be optional In order to break the hard dependency between the PTP clock subsystem and ethernet drivers capable of being clock providers, this patch provides simple PTP stub functions to allow linkage of those drivers into the kernel even when the PTP subsystem is configured out. Drivers must be ready to accept NULL from ptp_clock_register() in that case. And to make it possible for PTP to be configured out, the select statement in those driver's Kconfig menu entries is converted to the new "imply" statement. This way the PTP subsystem may have Kconfig dependencies of its own, such as POSIX_TIMERS, without having to make those ethernet drivers unavailable if POSIX timers are cconfigured out. And when support for POSIX timers is selected again then the default config option for PTP clock support will automatically be adjusted accordingly. the idea of using "imply" is to keep networking enabled even if PTP is configured out and this exactly what happens with cpts driver. Another question is that CPTS completely nonfunctional in this case and it was never expected that somebody will even try to use/run such configuration (except for random build purposes).
On Wed, Apr 22, 2020 at 02:16:11PM +0300, Grygorii Strashko wrote: > > On 21/04/2020 00:42, Arnd Bergmann wrote: > > > > On Mon, Apr 20, 2020 at 11:34 PM Richard Cochran > > > > > > To be clear, do you all see a need to change the stubbed version of > > > ptp_clock_register() or not? > > > > No, if the NULL return is only meant to mean "nothing wrong, keep going > > wihtout an object", that's fine with me. It does occasionally confuse driver > > writers (as seen here), so it's not a great interface, but there is no general > > solution to make it better. That's why in my first patch I condition the WARN_ON() on PTP_1588_CLOCK, since without that the null pointer here is not an error: void cpts_unregister(struct cpts *cpts) { if (WARN_ON(!cpts->clock)) return; Grygorii's question (paraphrasing: "why would you ever do that?") prompted my second patch, making the broken configuration obvious by emitting an error during `ifup`, instead of just a warning during `ifdown`. But I think Grygorii is on to something here: > Another question is that CPTS completely nonfunctional in this case and > it was never expected that somebody will even try to use/run such > configuration (except for random build purposes). So, let's not allow it. In my view, commit d1cbfd771ce8 ("ptp_clock: Allow for it to be optional") went a bit too far, and converted even clearly PTP-specific modules from `select` to `imply` PTP_1588_CLOCK, which is what made this broken configuration possible. I suggest reverting that change, just for the PTP-specific modules under drivers/net/ethernet. I audited all drivers that call `ptp_clock_register()`; it looks like these should `select` (instead of merely `imply`) PTP_1588_CLOCK: NET_DSA_MV88E6XXX_PTP NET_DSA_SJA1105_PTP MACB_USE_HWSTAMP CAVIUM_PTP TI_CPTS_MOD PTP_1588_CLOCK_IXP46X Note how they all reference PTP or timestamping in their name, which is a clue that they depend on PTP_1588_CLOCK. I have a patch for this, but first, a procedural question: does mailing list etiquette dictate that I reply to this thread with the new patch, or does it begin a new thread?
diff --git a/drivers/net/ethernet/ti/cpts.c b/drivers/net/ethernet/ti/cpts.c index fd214f8730a9..daf4505f4a70 100644 --- a/drivers/net/ethernet/ti/cpts.c +++ b/drivers/net/ethernet/ti/cpts.c @@ -646,7 +646,7 @@ EXPORT_SYMBOL_GPL(cpts_register); void cpts_unregister(struct cpts *cpts) { - if (WARN_ON(!cpts->clock)) + if (IS_REACHABLE(PTP_1588_CLOCK) && WARN_ON(!cpts->clock)) return; ptp_clock_unregister(cpts->clock);
CPTS_MOD merely implies PTP_1588_CLOCK; it is possible to build cpts without PTP clock support. In that case, ptp_clock_register() returns NULL and we should not WARN_ON(cpts->clock) when downing the interface. The ptp_*() functions are stubbed without PTP_1588_CLOCK, so it's safe to pass them a null pointer. Signed-off-by: Clay McClure <clay@daemons.net> --- drivers/net/ethernet/ti/cpts.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)