Message ID | 20180117230621.26074-2-jesus.sanchez-palencia@intel.com |
---|---|
State | RFC |
Headers | show |
Series | Time based packet transmission | expand |
On Wed, Jan 17, 2018 at 03:06:12PM -0800, Jesus Sanchez-Palencia wrote: > From: Richard Cochran <rcochran@linutronix.de> > > This patch introduces SO_TXTIME. User space enables this option in > order to pass a desired future transmit time in a CMSG when calling > sendmsg(2). > > A new field is added to struct sockcm_cookie, and the tstamp from > skbuffs will be used later on. In the discussion about the v1 patchset, there was a question if the cmsg should include a clockid_t. Without that, how can an application prevent the packet from being sent using an incorrect clock, e.g. the system clock when it expects it to be a PHC, or a different PHC when the socket is not bound to a specific interface? At least in some applications it would be preferred to not sent a packet at all instead of sending it at a wrong time. Please keep in mind that the PHCs and the system clock don't have to be synchronized to each other. If I understand the rest of the series correctly, there is an assumption that the PHCs are keeping time in TAI and CLOCK_TAI can be used as a fallback.
On Wed, Jan 17, 2018 at 03:06:12PM -0800, Jesus Sanchez-Palencia wrote: > @@ -2130,6 +2137,15 @@ int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg, > sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK; > sockc->tsflags |= tsflags; > break; > + case SO_TXTIME: > + if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) > + return -EPERM; > + if (!sock_flag(sk, SOCK_TXTIME)) > + return -EINVAL; > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) > + return -EINVAL; > + sockc->transmit_time = *(ktime_t *)CMSG_DATA(cmsg); As pointed out in the first series' review: No guarantee the CMSG is properly aligned on arches that might trap on unaligned access. Thanks, Richard
On Thu, Jan 18, 2018 at 09:42:27AM +0100, Miroslav Lichvar wrote: > In the discussion about the v1 patchset, there was a question if the > cmsg should include a clockid_t. Without that, how can an application > prevent the packet from being sent using an incorrect clock, e.g. > the system clock when it expects it to be a PHC, or a different PHC > when the socket is not bound to a specific interface? Right, the clockid_t should be passed in through the CMSG along with the time. Thanks, Richard
On Wed, Jan 17, 2018 at 6:06 PM, Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> wrote: > From: Richard Cochran <rcochran@linutronix.de> > > This patch introduces SO_TXTIME. User space enables this option in > order to pass a desired future transmit time in a CMSG when calling > sendmsg(2). > > A new field is added to struct sockcm_cookie, and the tstamp from > skbuffs will be used later on. > > Signed-off-by: Richard Cochran <rcochran@linutronix.de> > Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> > --- > diff --git a/net/core/sock.c b/net/core/sock.c > index abf4cbff99b2..37ef4b33fd92 100644 > --- a/net/core/sock.c > +++ b/net/core/sock.c > @@ -1061,6 +1061,13 @@ int sock_setsockopt(struct socket *sock, int level, int optname, > sock_valbool_flag(sk, SOCK_ZEROCOPY, valbool); > break; > > + case SO_TXTIME: > + if (ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) > + sock_valbool_flag(sk, SOCK_TXTIME, valbool); > + else > + ret = -EPERM; > + break; > + > default: > ret = -ENOPROTOOPT; > break; Please add getsockopt alongside setsockopt. I would also restrict input to [0, 1] exactly to allow for future extensions. If using ns_capable, skb->tstamp must continue to be scrubbed when traversing network namespaces. > @@ -2130,6 +2137,15 @@ int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg, > sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK; > sockc->tsflags |= tsflags; > break; > + case SO_TXTIME: > + if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) > + return -EPERM; > + if (!sock_flag(sk, SOCK_TXTIME)) > + return -EINVAL; No need for ns_capable check on each packet when already required to toggle socket option. > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) > + return -EINVAL; I don't see any existing reference to ktime_t in include/uapi. Just use a s64?
On Fri, Jan 19, 2018 at 04:15:46PM -0500, Willem de Bruijn wrote: > > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) > > + return -EINVAL; > > I don't see any existing reference to ktime_t in include/uapi. Just use a s64? Agreed. I didn't see the point of switching to ktime, either. Thanks, Richard
On 01/18/2018 09:11 AM, Richard Cochran wrote: > On Wed, Jan 17, 2018 at 03:06:12PM -0800, Jesus Sanchez-Palencia wrote: >> @@ -2130,6 +2137,15 @@ int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg, >> sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK; >> sockc->tsflags |= tsflags; >> break; >> + case SO_TXTIME: >> + if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) >> + return -EPERM; >> + if (!sock_flag(sk, SOCK_TXTIME)) >> + return -EINVAL; >> + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) >> + return -EINVAL; >> + sockc->transmit_time = *(ktime_t *)CMSG_DATA(cmsg); > > As pointed out in the first series' review: > > No guarantee the CMSG is properly aligned on arches that might trap > on unaligned access. Yes, it will be fixed on the next version. We should probably fix the other cases on this function as well then. Thanks, Jesus > > Thanks, > Richard >
Hi, On 01/19/2018 01:15 PM, Willem de Bruijn wrote: > On Wed, Jan 17, 2018 at 6:06 PM, Jesus Sanchez-Palencia > <jesus.sanchez-palencia@intel.com> wrote: >> From: Richard Cochran <rcochran@linutronix.de> >> >> This patch introduces SO_TXTIME. User space enables this option in >> order to pass a desired future transmit time in a CMSG when calling >> sendmsg(2). >> >> A new field is added to struct sockcm_cookie, and the tstamp from >> skbuffs will be used later on. >> >> Signed-off-by: Richard Cochran <rcochran@linutronix.de> >> Signed-off-by: Jesus Sanchez-Palencia <jesus.sanchez-palencia@intel.com> >> --- > >> diff --git a/net/core/sock.c b/net/core/sock.c >> index abf4cbff99b2..37ef4b33fd92 100644 >> --- a/net/core/sock.c >> +++ b/net/core/sock.c >> @@ -1061,6 +1061,13 @@ int sock_setsockopt(struct socket *sock, int level, int optname, >> sock_valbool_flag(sk, SOCK_ZEROCOPY, valbool); >> break; >> >> + case SO_TXTIME: >> + if (ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) >> + sock_valbool_flag(sk, SOCK_TXTIME, valbool); >> + else >> + ret = -EPERM; >> + break; >> + >> default: >> ret = -ENOPROTOOPT; >> break; > > Please add getsockopt alongside setsockopt. > > I would also restrict input to [0, 1] exactly to allow for future extensions. Ok, will do. > > If using ns_capable, skb->tstamp must continue to be scrubbed when traversing > network namespaces. I was planning to follow Eric's suggestion and move the tstamp scrubbing out of skb_scrub_packet() into ____dev_forward_skb() instead. Would that break when traversing namespaces? > >> @@ -2130,6 +2137,15 @@ int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg, >> sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK; >> sockc->tsflags |= tsflags; >> break; >> + case SO_TXTIME: >> + if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) >> + return -EPERM; >> + if (!sock_flag(sk, SOCK_TXTIME)) >> + return -EINVAL; > > No need for ns_capable check on each packet when already required to > toggle socket option. Ok. SO_MARK is doing the same so it might have "mis-inspired" me. I should probably fix both. > >> + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) >> + return -EINVAL; > > I don't see any existing reference to ktime_t in include/uapi. Just use a s64? Sure, will fix. Thanks, Jesus
>> If using ns_capable, skb->tstamp must continue to be scrubbed when traversing >> network namespaces. > > > I was planning to follow Eric's suggestion and move the tstamp scrubbing out of > skb_scrub_packet() into ____dev_forward_skb() instead. Would that break when > traversing namespaces? That implies namespace traversal, so sounds perfect for this purpose. >> >>> @@ -2130,6 +2137,15 @@ int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg, >>> sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK; >>> sockc->tsflags |= tsflags; >>> break; >>> + case SO_TXTIME: >>> + if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) >>> + return -EPERM; >>> + if (!sock_flag(sk, SOCK_TXTIME)) >>> + return -EINVAL; >> >> No need for ns_capable check on each packet when already required to >> toggle socket option. > > > Ok. SO_MARK is doing the same so it might have "mis-inspired" me. I should > probably fix both. The SO_MARK cmsg does need a check on each invocation, because it is not conditional on a sock_flag like SO_TXTIME.
Hi, Miroslav Lichvar <mlichvar@redhat.com> writes: > On Wed, Jan 17, 2018 at 03:06:12PM -0800, Jesus Sanchez-Palencia wrote: >> From: Richard Cochran <rcochran@linutronix.de> >> >> This patch introduces SO_TXTIME. User space enables this option in >> order to pass a desired future transmit time in a CMSG when calling >> sendmsg(2). >> >> A new field is added to struct sockcm_cookie, and the tstamp from >> skbuffs will be used later on. > > In the discussion about the v1 patchset, there was a question if the > cmsg should include a clockid_t. Without that, how can an application > prevent the packet from being sent using an incorrect clock, e.g. > the system clock when it expects it to be a PHC, or a different PHC > when the socket is not bound to a specific interface? > > At least in some applications it would be preferred to not sent a > packet at all instead of sending it at a wrong time. Including the clockid in a CMSG field does make sense. Will add it in the next version of this series. What I think would be the ideal scenario would be if the clockid parameter to the TBS Qdisc would not be necessary (if offload was enabled), but that's not quite possible right now, because there's no support for using the hrtimer infrastructure with dynamic clocks (/dev/ptp*). What I am thinking is to keep the clockid parameter for the Qdisc (and add support for expressing the clockid in friendlier ways, as requested later in this thread), but I can't think of a way to add support for using /dev/ptp* clocks without first having hrtimer support them. And the behavior would be to drop any packets with a clockid not matching the Qdisc clockid. How does this sound? > > Please keep in mind that the PHCs and the system clock don't have to > be synchronized to each other. If I understand the rest of the series > correctly, there is an assumption that the PHCs are keeping time in > TAI and CLOCK_TAI can be used as a fallback. You understand correctly, that's because of whole hrtimer issue. > > -- > Miroslav Lichvar Cheers, -- Vinicius
On Tue, Jan 23, 2018 at 01:22:37PM -0800, Vinicius Costa Gomes wrote: > What I think would be the ideal scenario would be if the clockid > parameter to the TBS Qdisc would not be necessary (if offload was > enabled), but that's not quite possible right now, because there's no > support for using the hrtimer infrastructure with dynamic clocks > (/dev/ptp*). We don't need hrtimer for HW offloading. Just enqueue the packets. I thought we agreed that user space get the ordering correct. In fact, davem insisted on it, IIRC. Thanks, Richard
Hi Richard, Richard Cochran <richardcochran@gmail.com> writes: > On Tue, Jan 23, 2018 at 01:22:37PM -0800, Vinicius Costa Gomes wrote: >> What I think would be the ideal scenario would be if the clockid >> parameter to the TBS Qdisc would not be necessary (if offload was >> enabled), but that's not quite possible right now, because there's no >> support for using the hrtimer infrastructure with dynamic clocks >> (/dev/ptp*). > > We don't need hrtimer for HW offloading. Just enqueue the packets. I > thought we agreed that user space get the ordering correct. In fact, > davem insisted on it, IIRC. About the ordering of packets, From here [1], there are 3 clear points (in my understanding): 1. Re-ordering of TX descriptors on the device queue should/must not happen; 2. Out of order requests are an error; 3. Timestamps in the past are an error; The only robust way that we could think of about keeping the the packets in order for the device queue is re-ordering packets in the Qdisc. We tried to reach out for confirmation [2] of this understanding but didn't receive any word. Even if we reach a decision that the Qdisc should not re-order packets (we wouldn't have any dependency on hrtimers in the offload case, as you pointed out), we still need hrtimers for the software implementation. So, I guess, the problem remains, if it's possible for the user to express a /dev/ptp* clock, what should we do? > > Thanks, > Richard Cheers, -- Vinicius [1] https://patchwork.ozlabs.org/comment/1770302/ [2] https://patchwork.ozlabs.org/comment/1816492/q
On Fri, Jan 19, 2018 at 06:09:15PM -0800, Richard Cochran wrote: > On Fri, Jan 19, 2018 at 04:15:46PM -0500, Willem de Bruijn wrote: > > > + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) > > > + return -EINVAL; > > > > I don't see any existing reference to ktime_t in include/uapi. Just use a s64? > > Agreed. I didn't see the point of switching to ktime, either. Do I understand it correctly that no other interface is using nanoseconds since 1970? We probably don't have to worry about year 2262 yet, but wouldn't it be better to make it consistent with the timestamping API using timespec? Or is it just better to avoid the 64/32-bit mess of time_t?
On Thu, Jan 25, 2018 at 10:12:25AM +0100, Miroslav Lichvar wrote: > Do I understand it correctly that no other interface is using > nanoseconds since 1970? We probably don't have to worry about year > 2262 yet, but wouldn't it be better to make it consistent with the > timestamping API using timespec? Or is it just better to avoid the > 64/32-bit mess of time_t? I prefer a single 64 bit nanoseconds field: - Applications won't have to convert to timespec. - Avoids the time_t issue. Thanks, Richard
On Wed, Jan 24, 2018 at 02:46:24PM -0800, Vinicius Costa Gomes wrote: > The only robust way that we could think of about keeping the the packets > in order for the device queue is re-ordering packets in the Qdisc. Right, but you cannot afford the overhead of the timerqueue when using HW offload, when the HW device sits on a PCIe bus. Many serious TSN applications (like industrial controls) will want to have just one packet queued, readying the next one just in time for the next deadline. The control loops are sensitive to the feedback interval. > Even if we reach a decision that the Qdisc should not re-order packets > (we wouldn't have any dependency on hrtimers in the offload case, as you > pointed out), we still need hrtimers for the software implementation. Fine. > So, I guess, the problem remains, if it's possible for the user to > express a /dev/ptp* clock, what should we do? Thinking a bit more, it doesn't make sense to have a user choice for the HW offloading case. The clock should implicitly be the device clock, always. Using any other clock would make no sense. Thanks, Richard
Hi, On 01/18/2018 09:13 AM, Richard Cochran wrote: > On Thu, Jan 18, 2018 at 09:42:27AM +0100, Miroslav Lichvar wrote: >> In the discussion about the v1 patchset, there was a question if the >> cmsg should include a clockid_t. Without that, how can an application >> prevent the packet from being sent using an incorrect clock, e.g. >> the system clock when it expects it to be a PHC, or a different PHC >> when the socket is not bound to a specific interface? > > Right, the clockid_t should be passed in through the CMSG along with > the time. While implementing this today it crossed my mind that why don't we have the clockid_t set per socket (e.g. as an argument to SO_TXTIME) instead of per packet? The only use-case that we could think of that would be 'blocked' was using sendmmsg() to send a packet to different interfaces with a single syscall, but I'm not sure how common that is. What do you think? Thanks, Jesus > > Thanks, > Richard >
On Wed, Jan 31, 2018 at 04:49:36PM -0800, Jesus Sanchez-Palencia wrote: > While implementing this today it crossed my mind that why don't we have the > clockid_t set per socket (e.g. as an argument to SO_TXTIME) instead of per packet? Sounds good to me. Thanks, Richard
On Wed, Jan 31, 2018 at 04:49:36PM -0800, Jesus Sanchez-Palencia wrote: > On 01/18/2018 09:13 AM, Richard Cochran wrote: > > Right, the clockid_t should be passed in through the CMSG along with > > the time. > > While implementing this today it crossed my mind that why don't we have the > clockid_t set per socket (e.g. as an argument to SO_TXTIME) instead of per packet? I suspect that might have an impact on the performance. Even if the application doesn't use sendmmsg(), it would possibly have to call setsockopt() before each sendmsg() to change the clockid_t, right? If clockid_t could be set per packet, a special value could be used to allow sending on interfaces that don't support it. > The only use-case that we could think of that would be 'blocked' was using > sendmmsg() to send a packet to different interfaces with a single syscall, but > I'm not sure how common that is. The SO_TXTIME option will make sendmmsg() useful in applications where it wasn't before. For instance, an NTP server will be able to batch multiple responses as their transmit timestamps can be set accurately in advance and it's no longer necessary to send the responses as soon as they are assembled. I think it would be nice the sendmmsg() calls didn't have to be split by clockid_t.
Hi, On 02/01/2018 01:27 AM, Miroslav Lichvar wrote: > On Wed, Jan 31, 2018 at 04:49:36PM -0800, Jesus Sanchez-Palencia wrote: >> On 01/18/2018 09:13 AM, Richard Cochran wrote: >>> Right, the clockid_t should be passed in through the CMSG along with >>> the time. >> >> While implementing this today it crossed my mind that why don't we have the >> clockid_t set per socket (e.g. as an argument to SO_TXTIME) instead of per packet? > > I suspect that might have an impact on the performance. Even if the > application doesn't use sendmmsg(), it would possibly have to call > setsockopt() before each sendmsg() to change the clockid_t, right? Yes. On the other hand, for applications that will be using only 1 clockid_t, keeping it per packet will also have an impact as we'll be copying the same value from the cmsg cookie into sk_buffs over and over. > > If clockid_t could be set per packet, a special value could be used > to allow sending on interfaces that don't support it. > >> The only use-case that we could think of that would be 'blocked' was using >> sendmmsg() to send a packet to different interfaces with a single syscall, but >> I'm not sure how common that is. > > The SO_TXTIME option will make sendmmsg() useful in applications where > it wasn't before. For instance, an NTP server will be able to batch > multiple responses as their transmit timestamps can be set accurately > in advance and it's no longer necessary to send the responses as soon > as they are assembled. > > I think it would be nice the sendmmsg() calls didn't have to be split > by clockid_t. OK, fair enough. I will keep it per-packet for now as initially agreed and we can revisit this later if needed. Thanks, Jesus
Hi, On 01/18/2018 12:42 AM, Miroslav Lichvar wrote: > On Wed, Jan 17, 2018 at 03:06:12PM -0800, Jesus Sanchez-Palencia wrote: >> From: Richard Cochran <rcochran@linutronix.de> >> >> This patch introduces SO_TXTIME. User space enables this option in >> order to pass a desired future transmit time in a CMSG when calling >> sendmsg(2). >> >> A new field is added to struct sockcm_cookie, and the tstamp from >> skbuffs will be used later on. > > In the discussion about the v1 patchset, there was a question if the > cmsg should include a clockid_t. Without that, how can an application > prevent the packet from being sent using an incorrect clock, e.g. > the system clock when it expects it to be a PHC, or a different PHC > when the socket is not bound to a specific interface? > > At least in some applications it would be preferred to not sent a > packet at all instead of sending it at a wrong time. > > Please keep in mind that the PHCs and the system clock don't have to > be synchronized to each other. If I understand the rest of the series > correctly, there is an assumption that the PHCs are keeping time in > TAI and CLOCK_TAI can be used as a fallback. Just to double-check, imagine that I've configured the qdisc for SW best-effort and with clockid CLOCK_REALTIME. When it receives a packet with the clockid of a /dev/ptpX, the qdisc should just drop that packet, right? Or would this block any use-cases that I couldn't think of ? Thanks, Jesus
On Mon, Feb 12, 2018 at 02:39:06PM -0800, Jesus Sanchez-Palencia wrote: > On 01/18/2018 12:42 AM, Miroslav Lichvar wrote: > > Please keep in mind that the PHCs and the system clock don't have to > > be synchronized to each other. If I understand the rest of the series > > correctly, there is an assumption that the PHCs are keeping time in > > TAI and CLOCK_TAI can be used as a fallback. > > Just to double-check, imagine that I've configured the qdisc for > SW best-effort and with clockid CLOCK_REALTIME. When it receives a > packet with the clockid of a /dev/ptpX, the qdisc should just drop that > packet, right? Yes, I think it should drop it. The kernel does not know the offset between the two clocks (they don't even have to be synchronized), so it cannot convert a PHC-based TX time to the system time.
diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h index be14f16149d5..065fb372e355 100644 --- a/arch/alpha/include/uapi/asm/socket.h +++ b/arch/alpha/include/uapi/asm/socket.h @@ -112,4 +112,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _UAPI_ASM_SOCKET_H */ diff --git a/arch/frv/include/uapi/asm/socket.h b/arch/frv/include/uapi/asm/socket.h index 9168e78fa32a..0e95f45cd058 100644 --- a/arch/frv/include/uapi/asm/socket.h +++ b/arch/frv/include/uapi/asm/socket.h @@ -105,5 +105,8 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _ASM_SOCKET_H */ diff --git a/arch/ia64/include/uapi/asm/socket.h b/arch/ia64/include/uapi/asm/socket.h index 3efba40adc54..c872c4e6bafb 100644 --- a/arch/ia64/include/uapi/asm/socket.h +++ b/arch/ia64/include/uapi/asm/socket.h @@ -114,4 +114,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _ASM_IA64_SOCKET_H */ diff --git a/arch/m32r/include/uapi/asm/socket.h b/arch/m32r/include/uapi/asm/socket.h index cf5018e82c3d..65276c95b8df 100644 --- a/arch/m32r/include/uapi/asm/socket.h +++ b/arch/m32r/include/uapi/asm/socket.h @@ -105,4 +105,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _ASM_M32R_SOCKET_H */ diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h index 49c3d4795963..71370fb3ceef 100644 --- a/arch/mips/include/uapi/asm/socket.h +++ b/arch/mips/include/uapi/asm/socket.h @@ -123,4 +123,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _UAPI_ASM_SOCKET_H */ diff --git a/arch/mn10300/include/uapi/asm/socket.h b/arch/mn10300/include/uapi/asm/socket.h index b35eee132142..d029a40b1b55 100644 --- a/arch/mn10300/include/uapi/asm/socket.h +++ b/arch/mn10300/include/uapi/asm/socket.h @@ -105,4 +105,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _ASM_SOCKET_H */ diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h index 1d0fdc3b5d22..061b9cf2a779 100644 --- a/arch/parisc/include/uapi/asm/socket.h +++ b/arch/parisc/include/uapi/asm/socket.h @@ -104,4 +104,7 @@ #define SO_ZEROCOPY 0x4035 +#define SO_TXTIME 0x4036 +#define SCM_TXTIME SO_TXTIME + #endif /* _UAPI_ASM_SOCKET_H */ diff --git a/arch/s390/include/uapi/asm/socket.h b/arch/s390/include/uapi/asm/socket.h index 3510c0fd06f4..39d901476ee5 100644 --- a/arch/s390/include/uapi/asm/socket.h +++ b/arch/s390/include/uapi/asm/socket.h @@ -111,4 +111,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _ASM_SOCKET_H */ diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h index d58520c2e6ff..7ea35e5601b6 100644 --- a/arch/sparc/include/uapi/asm/socket.h +++ b/arch/sparc/include/uapi/asm/socket.h @@ -101,6 +101,9 @@ #define SO_ZEROCOPY 0x003e +#define SO_TXTIME 0x003f +#define SCM_TXTIME SO_TXTIME + /* Security levels - as per NRL IPv6 - don't actually do anything */ #define SO_SECURITY_AUTHENTICATION 0x5001 #define SO_SECURITY_ENCRYPTION_TRANSPORT 0x5002 diff --git a/arch/xtensa/include/uapi/asm/socket.h b/arch/xtensa/include/uapi/asm/socket.h index 75a07b8119a9..1de07a7f7680 100644 --- a/arch/xtensa/include/uapi/asm/socket.h +++ b/arch/xtensa/include/uapi/asm/socket.h @@ -116,4 +116,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* _XTENSA_SOCKET_H */ diff --git a/include/net/sock.h b/include/net/sock.h index 73b7830b0bb8..927af34f3e2c 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -777,6 +777,7 @@ enum sock_flags { SOCK_FILTER_LOCKED, /* Filter cannot be changed anymore */ SOCK_SELECT_ERR_QUEUE, /* Wake select on error queue */ SOCK_RCU_FREE, /* wait rcu grace period in sk_destruct() */ + SOCK_TXTIME, }; #define SK_FLAGS_TIMESTAMP ((1UL << SOCK_TIMESTAMP) | (1UL << SOCK_TIMESTAMPING_RX_SOFTWARE)) @@ -1567,6 +1568,7 @@ void sock_kzfree_s(struct sock *sk, void *mem, int size); void sk_send_sigurg(struct sock *sk); struct sockcm_cookie { + ktime_t transmit_time; u32 mark; u16 tsflags; }; diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h index 0ae758c90e54..a12692e5f7a8 100644 --- a/include/uapi/asm-generic/socket.h +++ b/include/uapi/asm-generic/socket.h @@ -107,4 +107,7 @@ #define SO_ZEROCOPY 60 +#define SO_TXTIME 61 +#define SCM_TXTIME SO_TXTIME + #endif /* __ASM_GENERIC_SOCKET_H */ diff --git a/net/core/sock.c b/net/core/sock.c index abf4cbff99b2..37ef4b33fd92 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1061,6 +1061,13 @@ int sock_setsockopt(struct socket *sock, int level, int optname, sock_valbool_flag(sk, SOCK_ZEROCOPY, valbool); break; + case SO_TXTIME: + if (ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) + sock_valbool_flag(sk, SOCK_TXTIME, valbool); + else + ret = -EPERM; + break; + default: ret = -ENOPROTOOPT; break; @@ -2130,6 +2137,15 @@ int __sock_cmsg_send(struct sock *sk, struct msghdr *msg, struct cmsghdr *cmsg, sockc->tsflags &= ~SOF_TIMESTAMPING_TX_RECORD_MASK; sockc->tsflags |= tsflags; break; + case SO_TXTIME: + if (!ns_capable(sock_net(sk)->user_ns, CAP_NET_ADMIN)) + return -EPERM; + if (!sock_flag(sk, SOCK_TXTIME)) + return -EINVAL; + if (cmsg->cmsg_len != CMSG_LEN(sizeof(ktime_t))) + return -EINVAL; + sockc->transmit_time = *(ktime_t *)CMSG_DATA(cmsg); + break; /* SCM_RIGHTS and SCM_CREDENTIALS are semantically in SOL_UNIX. */ case SCM_RIGHTS: case SCM_CREDENTIALS: