diff mbox

[net-next] ptp: increase the maximum number of clocks

Message ID d219aa87bd5b3e2fa42449720d845a7110f642f0.1363893841.git.jbenc@redhat.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Jiri Benc March 21, 2013, 7:24 p.m. UTC
As network adapters supporting PTP are becoming more common, machines with
many NICs suddenly have many PHCs, too. Although the PHCs are not used in
such cases, they produce error messages like this:

igb 0000:07:00.0: ptp_clock_register failed

Currently, the maximum number of devices accepted by ptp_clock_register
is 8 which is pretty low. We could silence the error messages but this would
hurt in case somebody wants to use one of the interfaces to actually run
PTP, as the /dev/ptp%d device for the desired interface may or may not be
available after each boot.

Let's increase the maximum to 128. This shouldn't be a problem, as for the
char devices, the whole major number is reserved anyway, and 128bit bitmap
isn't a big deal, either.

Signed-off-by: Jiri Benc <jbenc@redhat.com>
---
 drivers/ptp/ptp_clock.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

Comments

David Miller March 21, 2013, 9:20 p.m. UTC | #1
From: Jiri Benc <jbenc@redhat.com>
Date: Thu, 21 Mar 2013 20:24:19 +0100

> As network adapters supporting PTP are becoming more common, machines with
> many NICs suddenly have many PHCs, too. Although the PHCs are not used in
> such cases, they produce error messages like this:
> 
> igb 0000:07:00.0: ptp_clock_register failed
> 
> Currently, the maximum number of devices accepted by ptp_clock_register
> is 8 which is pretty low. We could silence the error messages but this would
> hurt in case somebody wants to use one of the interfaces to actually run
> PTP, as the /dev/ptp%d device for the desired interface may or may not be
> available after each boot.
> 
> Let's increase the maximum to 128. This shouldn't be a problem, as for the
> char devices, the whole major number is reserved anyway, and 128bit bitmap
> isn't a big deal, either.
> 
> Signed-off-by: Jiri Benc <jbenc@redhat.com>

This really needs to be dynamic, and it shows what a bad idea it was
to use character drivers for this which introduces arbitrary device
arity limitations.

I want to see a better solution to this problem than just bumping this
limit arbitrarily every once in a while.

I'm not apply this patch, we're better than this.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran March 23, 2013, 5:59 p.m. UTC | #2
On Thu, Mar 21, 2013 at 05:20:17PM -0400, David Miller wrote:
> 
> This really needs to be dynamic, and it shows what a bad idea it was
> to use character drivers for this which introduces arbitrary device
> arity limitations.

If a clock is not a character device, what then should it be? Network
interface?  Block device?  sysfs apparition?

IIRC, the RTCs appear as char devices. In any case, we are better off
with char devices than with an enumeration of SYSV clock IDs.
 
> I want to see a better solution to this problem than just bumping this
> limit arbitrarily every once in a while.

Jiri, do you want to work on making the number of clocks essentially
unlimited? If not, I can take a look at this, but not right away.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller March 23, 2013, 11:53 p.m. UTC | #3
From: Richard Cochran <richardcochran@gmail.com>
Date: Sat, 23 Mar 2013 18:59:15 +0100

> On Thu, Mar 21, 2013 at 05:20:17PM -0400, David Miller wrote:
>> 
>> This really needs to be dynamic, and it shows what a bad idea it was
>> to use character drivers for this which introduces arbitrary device
>> arity limitations.
> 
> If a clock is not a character device, what then should it be? Network
> interface?  Block device?  sysfs apparition?
> 
> IIRC, the RTCs appear as char devices. In any case, we are better off
> with char devices than with an enumeration of SYSV clock IDs.

It should be an abstract device created/deleted/changed by netlink
operations.

Then there are no limitations.  PTP devices are then referened by
string names, rather than some bogus set of character device minor
numbers which have no meaning and have arbitrary limitations
wrt. arity.

If, in theory, we can have as many PTP devices as network devices,
we damn well better be able to create as many PTP devices as
network devices.

That means they must have the same capabilities as far as how many
of them you can create.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jiri Benc March 25, 2013, 4:08 p.m. UTC | #4
On Sat, 23 Mar 2013 18:59:15 +0100, Richard Cochran wrote:
> Jiri, do you want to work on making the number of clocks essentially
> unlimited? If not, I can take a look at this, but not right away.

I'm currently thinking of using just /dev/ptp for all PHCs in the
system and specifying the desired network device (ifindex, likely)
by an ioctl as the very first operation. What do you think? If this
is viable, I can try to implement it.

As for netlink, I don't see how to do that without reimplementing the
whole kernel<->user space timer API. The API expects clockid_t which
is either a const (a new constant wouldn't help, we can have multiple
PHCs in the system), or a dynamic value. There is no space for more
dynamic values as far as I can see, all combinations of the three lower
bits are already defined. It seems we're left with file descriptors.

 Jiri
Richard Cochran March 25, 2013, 6:51 p.m. UTC | #5
On Mon, Mar 25, 2013 at 05:08:55PM +0100, Jiri Benc wrote:
> On Sat, 23 Mar 2013 18:59:15 +0100, Richard Cochran wrote:
> > Jiri, do you want to work on making the number of clocks essentially
> > unlimited? If not, I can take a look at this, but not right away.
> 
> I'm currently thinking of using just /dev/ptp for all PHCs in the
> system and specifying the desired network device (ifindex, likely)
> by an ioctl as the very first operation. What do you think? If this
> is viable, I can try to implement it.

While that might work for the PTP ioctls, it won't for the main
clock_gettime/settime/adjtime call. As you wrote below, these take a
dynamic clockid_t (which comes from a file descriptor), and so each
device needs its own, unique file system node.

For better or worse, we are stuck with char devs, unless we want to
redo the clock_xyz API. I certainly don't want to. We can have over a
million PTP clocks on a single major number. That should be enough.

IIRC, it is possible to grow the range of minor numbers
incrementally. Perhaps you could take a look at that?

Thanks,
Richard
 
> As for netlink, I don't see how to do that without reimplementing the
> whole kernel<->user space timer API. The API expects clockid_t which
> is either a const (a new constant wouldn't help, we can have multiple
> PHCs in the system), or a dynamic value. There is no space for more
> dynamic values as far as I can see, all combinations of the three lower
> bits are already defined. It seems we're left with file descriptors.
> 
>  Jiri
> 
> -- 
> Jiri Benc
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings March 25, 2013, 9:39 p.m. UTC | #6
On Mon, 2013-03-25 at 19:51 +0100, Richard Cochran wrote:
> On Mon, Mar 25, 2013 at 05:08:55PM +0100, Jiri Benc wrote:
> > On Sat, 23 Mar 2013 18:59:15 +0100, Richard Cochran wrote:
> > > Jiri, do you want to work on making the number of clocks essentially
> > > unlimited? If not, I can take a look at this, but not right away.
> > 
> > I'm currently thinking of using just /dev/ptp for all PHCs in the
> > system and specifying the desired network device (ifindex, likely)
> > by an ioctl as the very first operation. What do you think? If this
> > is viable, I can try to implement it.
> 
> While that might work for the PTP ioctls, it won't for the main
> clock_gettime/settime/adjtime call. As you wrote below, these take a
> dynamic clockid_t (which comes from a file descriptor), and so each
> device needs its own, unique file system node.
> 
> For better or worse, we are stuck with char devs, unless we want to
> redo the clock_xyz API. I certainly don't want to. We can have over a
> million PTP clocks on a single major number. That should be enough.
> 
> IIRC, it is possible to grow the range of minor numbers
> incrementally. Perhaps you could take a look at that?

MTD used to have a static table of devices, which was a problem for the
sfc driver as it exposes multiple partitions on each NIC through MTD.
It was reasonably easy to switch to dynamic numbering using the idr
library, and I expect that would also work for PTP clock devices.

Ben.
Jiri Benc March 26, 2013, 9:48 a.m. UTC | #7
On Mon, 25 Mar 2013 19:51:20 +0100, Richard Cochran wrote:
> While that might work for the PTP ioctls, it won't for the main
> clock_gettime/settime/adjtime call. As you wrote below, these take a
> dynamic clockid_t (which comes from a file descriptor), and so each
> device needs its own, unique file system node.

They need an unique file descriptor, not an unique file. There's nothing
preventing you from opening the same file multiple times, using each
obtained file descriptor to access a different PHC. Sure, there is
currently no place to store a per-fd data in posix-clock implementation
but that could be added.

> For better or worse, we are stuck with char devs, unless we want to
> redo the clock_xyz API. I certainly don't want to. We can have over a
> million PTP clocks on a single major number. That should be enough.
> 
> IIRC, it is possible to grow the range of minor numbers
> incrementally. Perhaps you could take a look at that?

That was what I was originally thinking about. But I don't like it much
- the overhead will increase significantly with increasing number of
devices, as the chrdev allocations for a given major are kept in a
linked list which has to be completely traversed on each allocation.
We'd also need to replace the ptp_clocks_map with a different data
structure (e.g. a linked list of bitmaps if we allocate minors in
chunks to alleviate the chrdev allocation overhead), not helping the
overhead.

Please correct me if I misunderstood something.

Thanks for the ideas,

 Jiri
Richard Cochran March 26, 2013, 6:44 p.m. UTC | #8
On Tue, Mar 26, 2013 at 10:48:36AM +0100, Jiri Benc wrote:
> On Mon, 25 Mar 2013 19:51:20 +0100, Richard Cochran wrote:
> > While that might work for the PTP ioctls, it won't for the main
> > clock_gettime/settime/adjtime call. As you wrote below, these take a
> > dynamic clockid_t (which comes from a file descriptor), and so each
> > device needs its own, unique file system node.
> 
> They need an unique file descriptor, not an unique file. There's nothing
> preventing you from opening the same file multiple times, using each
> obtained file descriptor to access a different PHC. Sure, there is
> currently no place to store a per-fd data in posix-clock implementation
> but that could be added.

Assume you have multiple clocks in the system. You open /dev/ptp, and
you get a FD to be passed to clock_gettime. But *which* clock's time
should be read?

(An extra ioctl to choose the active clock seems a bit gross to me.)

> > For better or worse, we are stuck with char devs, unless we want to
> > redo the clock_xyz API. I certainly don't want to. We can have over a
> > million PTP clocks on a single major number. That should be enough.
> > 
> > IIRC, it is possible to grow the range of minor numbers
> > incrementally. Perhaps you could take a look at that?
> 
> That was what I was originally thinking about. But I don't like it much
> - the overhead will increase significantly with increasing number of
> devices, as the chrdev allocations for a given major are kept in a
> linked list which has to be completely traversed on each allocation.
> We'd also need to replace the ptp_clocks_map with a different data
> structure (e.g. a linked list of bitmaps if we allocate minors in
> chunks to alleviate the chrdev allocation overhead), not helping the
> overhead.

How about using idr and then, at each clock registration, just
increase the range of minors to include the new number?

I wouldn't worry about the overhead at allocation time. It is one time
only, during device registration, and so is not performance critical.

Thanks,
Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index 79f4bce..381d557 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -32,7 +32,7 @@ 
 #include "ptp_private.h"
 
 #define PTP_MAX_ALARMS 4
-#define PTP_MAX_CLOCKS 8
+#define PTP_MAX_CLOCKS 128
 #define PTP_PPS_DEFAULTS (PPS_CAPTUREASSERT | PPS_OFFSETASSERT)
 #define PTP_PPS_EVENT PPS_CAPTUREASSERT
 #define PTP_PPS_MODE (PTP_PPS_DEFAULTS | PPS_CANWAIT | PPS_TSFMT_TSPEC)