diff mbox

[v1,net-next,5/6] net: allow simultaneous SW and HW transmit timestamping

Message ID 20170426145035.25846-6-mlichvar@redhat.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Miroslav Lichvar April 26, 2017, 2:50 p.m. UTC
Add SOF_TIMESTAMPING_OPT_TX_SWHW option to allow an outgoing packet to
be looped to the socket's error queue with a software timestamp even
when a hardware transmit timestamp is expected to be provided by the
driver.

Applications using this option will receive two separate messages from
the error queue, one with a software timestamp and the other with a
hardware timestamp. As the hardware timestamp is saved to the shared skb
info, which may happen before the first message with software timestamp
is received by the application, the hardware timestamp is copied to the
SCM_TIMESTAMPING control message only when the skb has no software
timestamp or it is an incoming packet.

CC: Richard Cochran <richardcochran@gmail.com>
CC: Willem de Bruijn <willemb@google.com>
Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com>
---
 Documentation/networking/timestamping.txt | 14 ++++++++++++--
 include/linux/skbuff.h                    |  3 +--
 include/uapi/linux/net_tstamp.h           |  3 ++-
 net/core/skbuff.c                         |  4 ++++
 net/socket.c                              |  1 +
 5 files changed, 20 insertions(+), 5 deletions(-)

Comments

Willem de Bruijn April 27, 2017, midnight UTC | #1
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 81ef53f..42bff22 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -3300,8 +3300,7 @@ void skb_tstamp_tx(struct sk_buff *orig_skb,
>
>  static inline void sw_tx_timestamp(struct sk_buff *skb)
>  {
> -       if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
> -           !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> +       if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
>                 skb_tstamp_tx(skb, NULL);
>  }
>
> diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
> index 8fcae35..d251972 100644
> --- a/include/uapi/linux/net_tstamp.h
> +++ b/include/uapi/linux/net_tstamp.h
> @@ -27,8 +27,9 @@ enum {
>         SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
>         SOF_TIMESTAMPING_OPT_STATS = (1<<12),
>         SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
> +       SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
>
> -       SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
> +       SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_TX_SWHW,
>         SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
>                                  SOF_TIMESTAMPING_LAST
>  };
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 58604c1..db5aa19 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -3874,6 +3874,10 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
>         if (!sk)
>                 return;
>
> +       if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
> +           skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
> +               return;
> +

This check should only happen for software transmit timestamps, so simpler to
revise the check in sw_tx_timestamp above to

  if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
-        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
+      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
+      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)

> diff --git a/net/socket.c b/net/socket.c
> index 68b9304..14b7688 100644
> --- a/net/socket.c
> +++ b/net/socket.c
> @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>                 empty = 0;
>         if (shhwtstamps &&
>             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> +           (empty || !skb_is_err_queue(skb)) &&
>             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {

I find skb->tstamp == 0 easier to understand than the condition on empty.

Indeed, this is so non-obvious that I would suggest another helper function
skb_is_hwtx_tstamp with a concise comment about the race condition
between tx software and hardware timestamps (as in the last sentence of
the commit message).
Miroslav Lichvar April 27, 2017, 4:17 p.m. UTC | #2
On Wed, Apr 26, 2017 at 08:00:02PM -0400, Willem de Bruijn wrote:
> > +       if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
> > +           skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
> > +               return;
> > +
> 
> This check should only happen for software transmit timestamps, so simpler to
> revise the check in sw_tx_timestamp above to
> 
>   if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
> -        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> +      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
> +      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)

Good point. This will avoid unnecessary calls of skb_tstamp_tx() in
the common case when SOF_TIMESTAMPING_OPT_TX_SWHW will not be enabled.

> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
> >                 empty = 0;
> >         if (shhwtstamps &&
> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> > +           (empty || !skb_is_err_queue(skb)) &&
> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
> 
> I find skb->tstamp == 0 easier to understand than the condition on empty.
> 
> Indeed, this is so non-obvious that I would suggest another helper function
> skb_is_hwtx_tstamp with a concise comment about the race condition
> between tx software and hardware timestamps (as in the last sentence of
> the commit message).

Should it include also the skb_is_err_queue() check? If it returned
true for both TX and RX HW timestamps, maybe it could be called
skb_has_hw_tstamp?
Willem de Bruijn April 27, 2017, 4:21 p.m. UTC | #3
>> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>> >                 empty = 0;
>> >         if (shhwtstamps &&
>> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
>> > +           (empty || !skb_is_err_queue(skb)) &&
>> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
>>
>> I find skb->tstamp == 0 easier to understand than the condition on empty.
>>
>> Indeed, this is so non-obvious that I would suggest another helper function
>> skb_is_hwtx_tstamp with a concise comment about the race condition
>> between tx software and hardware timestamps (as in the last sentence of
>> the commit message).
>
> Should it include also the skb_is_err_queue() check? If it returned
> true for both TX and RX HW timestamps, maybe it could be called
> skb_has_hw_tstamp?

For the purpose of documenting why this complex condition exists,
I would call the skb_is_err_queue in that helper function and make
it tx + hw specific.
Miroslav Lichvar April 27, 2017, 4:39 p.m. UTC | #4
On Thu, Apr 27, 2017 at 12:21:00PM -0400, Willem de Bruijn wrote:
> >> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
> >> >                 empty = 0;
> >> >         if (shhwtstamps &&
> >> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> >> > +           (empty || !skb_is_err_queue(skb)) &&
> >> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
> >>
> >> I find skb->tstamp == 0 easier to understand than the condition on empty.
> >>
> >> Indeed, this is so non-obvious that I would suggest another helper function
> >> skb_is_hwtx_tstamp with a concise comment about the race condition
> >> between tx software and hardware timestamps (as in the last sentence of
> >> the commit message).
> >
> > Should it include also the skb_is_err_queue() check? If it returned
> > true for both TX and RX HW timestamps, maybe it could be called
> > skb_has_hw_tstamp?
> 
> For the purpose of documenting why this complex condition exists,
> I would call the skb_is_err_queue in that helper function and make
> it tx + hw specific.

Hm, like this?

        if (shhwtstamps &&
            (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
+           (skb_is_hwtx_tstamp(skb) || !skb_is_err_queue(skb)) &&
            ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {

where skb_is_hwtx_tstamp() has
	return skb->tstamp == 0 && skb_is_err_queue(skb);

I was just not sure about the unnecessary skb_is_err_queue() call.
Willem de Bruijn April 27, 2017, 4:48 p.m. UTC | #5
On Thu, Apr 27, 2017 at 12:39 PM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Thu, Apr 27, 2017 at 12:21:00PM -0400, Willem de Bruijn wrote:
>> >> > @@ -720,6 +720,7 @@ void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
>> >> >                 empty = 0;
>> >> >         if (shhwtstamps &&
>> >> >             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
>> >> > +           (empty || !skb_is_err_queue(skb)) &&
>> >> >             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
>> >>
>> >> I find skb->tstamp == 0 easier to understand than the condition on empty.
>> >>
>> >> Indeed, this is so non-obvious that I would suggest another helper function
>> >> skb_is_hwtx_tstamp with a concise comment about the race condition
>> >> between tx software and hardware timestamps (as in the last sentence of
>> >> the commit message).
>> >
>> > Should it include also the skb_is_err_queue() check? If it returned
>> > true for both TX and RX HW timestamps, maybe it could be called
>> > skb_has_hw_tstamp?
>>
>> For the purpose of documenting why this complex condition exists,
>> I would call the skb_is_err_queue in that helper function and make
>> it tx + hw specific.
>
> Hm, like this?
>
>         if (shhwtstamps &&
>             (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
> +           (skb_is_hwtx_tstamp(skb) || !skb_is_err_queue(skb)) &&
>             ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
>
> where skb_is_hwtx_tstamp() has
>         return skb->tstamp == 0 && skb_is_err_queue(skb);
>
> I was just not sure about the unnecessary skb_is_err_queue() call.

Oh, good point. If the condition is

  (skb_is_err_queue(skb) && !skb->tstamp) || !skb_is_err_queue(skb)

then it makes more sense to just use the simpler expression

  (!skb_is_err_queue(skb)) || (!skb->tstamp)

This cannot be called skb_is_hwtx_tstamp, as this does not imply
skb_hwtstamps(skb). Perhaps instead define

  /* On transmit, software and hardware timestamps are returned independently.
   * Do not return hardware timestamps even if skb_hwtstamps(skb) is true if
   * skb->tstamp is set
   */
  static bool skb_is_swtx_tstamp(skb) {
    return skb_is_err_queue(skb) && skb->tstamp;
  }

and use !skb_is_swtx_tstamp(skb) in this condition. The comment is
just a quick first try, can probably be improved.
Miroslav Lichvar April 28, 2017, 8:54 a.m. UTC | #6
On Wed, Apr 26, 2017 at 08:00:02PM -0400, Willem de Bruijn wrote:
> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> > index 81ef53f..42bff22 100644
> > --- a/include/linux/skbuff.h
> > +++ b/include/linux/skbuff.h
> > @@ -3300,8 +3300,7 @@ void skb_tstamp_tx(struct sk_buff *orig_skb,
> >
> >  static inline void sw_tx_timestamp(struct sk_buff *skb)
> >  {
> > -       if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
> > -           !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> > +       if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
> >                 skb_tstamp_tx(skb, NULL);
> >  }

> > +++ b/net/core/skbuff.c
> > @@ -3874,6 +3874,10 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
> >         if (!sk)
> >                 return;
> >
> > +       if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
> > +           skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
> > +               return;
> > +
> 
> This check should only happen for software transmit timestamps, so simpler to
> revise the check in sw_tx_timestamp above to
> 
>   if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
> -        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> +      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
> +      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)

I'm not sure if this can work. sk_buff.h would need to include sock.h
in order to get the definition of struct sock. Any suggestions?
Willem de Bruijn April 28, 2017, 3:50 p.m. UTC | #7
On Fri, Apr 28, 2017 at 4:54 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Wed, Apr 26, 2017 at 08:00:02PM -0400, Willem de Bruijn wrote:
>> > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
>> > index 81ef53f..42bff22 100644
>> > --- a/include/linux/skbuff.h
>> > +++ b/include/linux/skbuff.h
>> > @@ -3300,8 +3300,7 @@ void skb_tstamp_tx(struct sk_buff *orig_skb,
>> >
>> >  static inline void sw_tx_timestamp(struct sk_buff *skb)
>> >  {
>> > -       if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>> > -           !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>> > +       if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
>> >                 skb_tstamp_tx(skb, NULL);
>> >  }
>
>> > +++ b/net/core/skbuff.c
>> > @@ -3874,6 +3874,10 @@ void __skb_tstamp_tx(struct sk_buff *orig_skb,
>> >         if (!sk)
>> >                 return;
>> >
>> > +       if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
>> > +           skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
>> > +               return;
>> > +
>>
>> This check should only happen for software transmit timestamps, so simpler to
>> revise the check in sw_tx_timestamp above to
>>
>>   if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>> -        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>> +      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
>> +      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)
>
> I'm not sure if this can work. sk_buff.h would need to include sock.h
> in order to get the definition of struct sock. Any suggestions?

A more elegant solution would be to not set SKBTX_IN_PROGRESS
at all if SOF_TIMESTAMPING_OPT_TX_SWHW is set on the socket.
But the patch to do so is not elegant, having to update callsites in many
device drivers.

Otherwise you may indeed have to call skb_tstamp_tx for every packet
that has SKBTX_SW_TSTAMP set, as you do. We can at least move
the skb->sk != NULL check into skb_tx_timestamp in skbuff.h.

By the way, if changing this code, I think that it's time to get rid of
sw_tx_timestamp. It is only called from skb_tx_timestamp. Let's
just move the condition in there.
Miroslav Lichvar April 28, 2017, 4:23 p.m. UTC | #8
On Fri, Apr 28, 2017 at 11:50:28AM -0400, Willem de Bruijn wrote:
> On Fri, Apr 28, 2017 at 4:54 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> >>   if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
> >> -        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
> >> +      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
> >> +      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)
> >
> > I'm not sure if this can work. sk_buff.h would need to include sock.h
> > in order to get the definition of struct sock. Any suggestions?
> 
> A more elegant solution would be to not set SKBTX_IN_PROGRESS
> at all if SOF_TIMESTAMPING_OPT_TX_SWHW is set on the socket.
> But the patch to do so is not elegant, having to update callsites in many
> device drivers.

Also, it would change the meaning of the flag as it seems some drivers
actually use the SKBTX_IN_PROGRESS flag to check if they expect a
timestamp.

How about allocating the last bit of tx_flags for SKBTX_SWHW_TSTAMP?

> Otherwise you may indeed have to call skb_tstamp_tx for every packet
> that has SKBTX_SW_TSTAMP set, as you do. We can at least move
> the skb->sk != NULL check into skb_tx_timestamp in skbuff.h.
> 
> By the way, if changing this code, I think that it's time to get rid of
> sw_tx_timestamp. It is only called from skb_tx_timestamp. Let's
> just move the condition in there.

Ok. I assume that should be a separate patch.
Willem de Bruijn April 28, 2017, 8:07 p.m. UTC | #9
On Fri, Apr 28, 2017 at 12:23 PM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> On Fri, Apr 28, 2017 at 11:50:28AM -0400, Willem de Bruijn wrote:
>> On Fri, Apr 28, 2017 at 4:54 AM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
>> >>   if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
>> >> -        !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
>> >> +      (!(skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)) ||
>> >> +      (skb->sk && skb->sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW)
>> >
>> > I'm not sure if this can work. sk_buff.h would need to include sock.h
>> > in order to get the definition of struct sock. Any suggestions?
>>
>> A more elegant solution would be to not set SKBTX_IN_PROGRESS
>> at all if SOF_TIMESTAMPING_OPT_TX_SWHW is set on the socket.
>> But the patch to do so is not elegant, having to update callsites in many
>> device drivers.
>
> Also, it would change the meaning of the flag as it seems some drivers
> actually use the SKBTX_IN_PROGRESS flag to check if they expect a
> timestamp.
>
> How about allocating the last bit of tx_flags for SKBTX_SWHW_TSTAMP?

That is such a scarce resource that I really would prefer to avoid using
that if we can.

>> Otherwise you may indeed have to call skb_tstamp_tx for every packet
>> that has SKBTX_SW_TSTAMP set, as you do. We can at least move
>> the skb->sk != NULL check into skb_tx_timestamp in skbuff.h.
>>
>> By the way, if changing this code, I think that it's time to get rid of
>> sw_tx_timestamp. It is only called from skb_tx_timestamp. Let's
>> just move the condition in there.
>
> Ok. I assume that should be a separate patch.

It's a single statement, I think we can do it as part of this one. Thanks.
Miroslav Lichvar May 2, 2017, 9:56 a.m. UTC | #10
On Fri, Apr 28, 2017 at 04:07:29PM -0400, Willem de Bruijn wrote:
> On Fri, Apr 28, 2017 at 12:23 PM, Miroslav Lichvar <mlichvar@redhat.com> wrote:
> > On Fri, Apr 28, 2017 at 11:50:28AM -0400, Willem de Bruijn wrote:
> >> A more elegant solution would be to not set SKBTX_IN_PROGRESS
> >> at all if SOF_TIMESTAMPING_OPT_TX_SWHW is set on the socket.
> >> But the patch to do so is not elegant, having to update callsites in many
> >> device drivers.
> >
> > Also, it would change the meaning of the flag as it seems some drivers
> > actually use the SKBTX_IN_PROGRESS flag to check if they expect a
> > timestamp.
> >
> > How about allocating the last bit of tx_flags for SKBTX_SWHW_TSTAMP?
> 
> That is such a scarce resource that I really would prefer to avoid using
> that if we can.

Ok. I think it won't really matter. We should keep in mind that the
reason for adding the OPT_TX_SWHW option was to not break old
applications which enabled both SW and HW TX timestamping, even though
they could get only one timestamp. I think most applications in future
will either enable only SW or HW TX timestamping, or enable both
together with the OPT_TX_SWHW option in order to get a SW timestamp
when HW timestamp was requested but missing.

> >> Otherwise you may indeed have to call skb_tstamp_tx for every packet
> >> that has SKBTX_SW_TSTAMP set, as you do. We can at least move
> >> the skb->sk != NULL check into skb_tx_timestamp in skbuff.h.

There are other callers of skb_tx_timestamp() and it's not obvious to
me they are all safe (i.e. cannot pass skb will sk==NULL), so I think
this should rather be a separate patch if necessary.

I'll resend the series with the other changes you have suggested.
diff mbox

Patch

diff --git a/Documentation/networking/timestamping.txt b/Documentation/networking/timestamping.txt
index 6c07e7c..ab29a6e 100644
--- a/Documentation/networking/timestamping.txt
+++ b/Documentation/networking/timestamping.txt
@@ -201,6 +201,14 @@  SOF_TIMESTAMPING_OPT_PKTINFO:
   which received the packet and its length at layer 2. This option
   works only if CONFIG_NET_RX_BUSY_POLL is enabled.
 
+SOF_TIMESTAMPING_OPT_TX_SWHW:
+
+  Request both hardware and software timestamps for outgoing packets
+  when SOF_TIMESTAMPING_TX_HARDWARE and SOF_TIMESTAMPING_TX_SOFTWARE
+  are enabled at the same time. If both timestamps are generated,
+  two separate messages will be looped to the socket's error queue,
+  each containing just one timestamp.
+
 New applications are encouraged to pass SOF_TIMESTAMPING_OPT_ID to
 disambiguate timestamps and SOF_TIMESTAMPING_OPT_TSONLY to operate
 regardless of the setting of sysctl net.core.tstamp_allow_data.
@@ -320,8 +328,10 @@  struct scm_timestamping {
 };
 
 The structure can return up to three timestamps. This is a legacy
-feature. Only one field is non-zero at any time. Most timestamps
-are passed in ts[0]. Hardware timestamps are passed in ts[2].
+feature. Most timestamps are passed in ts[0]. Hardware timestamps
+are passed in ts[2]. Incoming packets may have timestamps in both
+ts[0] and ts[2], but for outgoing packets only one field is non-zero
+at any time.
 
 ts[1] used to hold hardware timestamps converted to system time.
 Instead, expose the hardware clock device on the NIC directly as
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 81ef53f..42bff22 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -3300,8 +3300,7 @@  void skb_tstamp_tx(struct sk_buff *orig_skb,
 
 static inline void sw_tx_timestamp(struct sk_buff *skb)
 {
-	if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP &&
-	    !(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS))
+	if (skb_shinfo(skb)->tx_flags & SKBTX_SW_TSTAMP)
 		skb_tstamp_tx(skb, NULL);
 }
 
diff --git a/include/uapi/linux/net_tstamp.h b/include/uapi/linux/net_tstamp.h
index 8fcae35..d251972 100644
--- a/include/uapi/linux/net_tstamp.h
+++ b/include/uapi/linux/net_tstamp.h
@@ -27,8 +27,9 @@  enum {
 	SOF_TIMESTAMPING_OPT_TSONLY = (1<<11),
 	SOF_TIMESTAMPING_OPT_STATS = (1<<12),
 	SOF_TIMESTAMPING_OPT_PKTINFO = (1<<13),
+	SOF_TIMESTAMPING_OPT_TX_SWHW = (1<<14),
 
-	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_PKTINFO,
+	SOF_TIMESTAMPING_LAST = SOF_TIMESTAMPING_OPT_TX_SWHW,
 	SOF_TIMESTAMPING_MASK = (SOF_TIMESTAMPING_LAST - 1) |
 				 SOF_TIMESTAMPING_LAST
 };
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 58604c1..db5aa19 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3874,6 +3874,10 @@  void __skb_tstamp_tx(struct sk_buff *orig_skb,
 	if (!sk)
 		return;
 
+	if (!hwtstamps && !(sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TX_SWHW) &&
+	    skb_shinfo(orig_skb)->tx_flags & SKBTX_IN_PROGRESS)
+		return;
+
 	tsonly = sk->sk_tsflags & SOF_TIMESTAMPING_OPT_TSONLY;
 	if (!skb_may_tx_timestamp(sk, tsonly))
 		return;
diff --git a/net/socket.c b/net/socket.c
index 68b9304..14b7688 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -720,6 +720,7 @@  void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
 		empty = 0;
 	if (shhwtstamps &&
 	    (sk->sk_tsflags & SOF_TIMESTAMPING_RAW_HARDWARE) &&
+	    (empty || !skb_is_err_queue(skb)) &&
 	    ktime_to_timespec_cond(shhwtstamps->hwtstamp, tss.ts + 2)) {
 		empty = 0;
 		if ((sk->sk_tsflags & SOF_TIMESTAMPING_OPT_PKTINFO) &&