diff mbox

gianfar: Fix invalid TX frames returned on error queue when time stamping.

Message ID 1325775021-23049-1-git-send-email-manfred.rudigier@omicron.at
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Manfred Rudigier Jan. 5, 2012, 2:50 p.m. UTC
When TX time stamping for PTP messages is enabled on a socket, a time
stamp is returned on the socket error queue to the user space application
after the frame was transmitted. The transmitted frame is also returned on
the error queue so that an application knows to which frame the time stamp
belongs.

In the current implementation the TxFCB is immediately followed by the
frame. Since the eTSEC inserts the TX time stamp 8 bytes after the TxFCB,
parts of the frame have been overwritten and an invalid frame was returned
on the socket error queue.

This patch fixes the described problem by adding additional 16 padding
bytes between the TxFCB and the frame for all messages sent from a time
stamping enabled socket (other sockets are not affected).

Signed-off-by: Manfred Rudigier <manfred.rudigier@omicron.at>
---
 drivers/net/ethernet/freescale/gianfar.c |   33 ++++++++++++++++++++---------
 drivers/net/ethernet/freescale/gianfar.h |    3 ++
 2 files changed, 26 insertions(+), 10 deletions(-)

Comments

David Miller Jan. 5, 2012, 6:26 p.m. UTC | #1
From: Manfred Rudigier <manfred.rudigier@omicron.at>
Date: Thu, 5 Jan 2012 15:50:21 +0100

> +
> +		/* Keep sock if we must return a time stamp on the err queue */
> +		skb_new->sk = skb->sk;

When I see something like this without any kind of reference counting or
similar, I am gravely concerned.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Manfred Rudigier Jan. 9, 2012, 7:16 a.m. UTC | #2
From: David Miller [mailto:davem@davemloft.net]
Sent: Thursday, January 05, 2012 19:27

>From: Manfred Rudigier <manfred.rudigier@omicron.at>
>Date: Thu, 5 Jan 2012 15:50:21 +0100
>
>> +
>> +		/* Keep sock if we must return a time stamp on the err queue */
>> +		skb_new->sk = skb->sk;
>
>When I see something like this without any kind of reference counting or
>similar, I am gravely concerned.

The skb_tstamp_tx function called during gfar_clean_tx_ring requires the skb->sk pointer to be set. Otherwise no time stamp can be queued on the socket error queue. 
What would be the correct way for doing this?

Regards,
Manfred
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Jan. 9, 2012, 7:43 a.m. UTC | #3
Le lundi 09 janvier 2012 à 08:16 +0100, Manfred Rudigier a écrit :
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Thursday, January 05, 2012 19:27
> 
> >From: Manfred Rudigier <manfred.rudigier@omicron.at>
> >Date: Thu, 5 Jan 2012 15:50:21 +0100
> >
> >> +
> >> +		/* Keep sock if we must return a time stamp on the err queue */
> >> +		skb_new->sk = skb->sk;
> >
> >When I see something like this without any kind of reference counting or
> >similar, I am gravely concerned.
> 
> The skb_tstamp_tx function called during gfar_clean_tx_ring requires
> the skb->sk pointer to be set. Otherwise no time stamp can be queued
> on the socket error queue. 
> What would be the correct way for doing this?

I really wonder how your code was possibly working...

A correct steal sequence maybe 

	skb_new->sk = skb->sk;
	skb->sk = NULL;
	kfree_skb(skb);

Or else, kfree_skb() releases sk reference and eventually socket is
freed before skb_tstamp_tx()



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Eric Dumazet Jan. 9, 2012, 9:58 a.m. UTC | #4
Le lundi 09 janvier 2012 à 10:36 +0100, Manfred Rudigier a écrit :

> I have tried your suggested steal sequence, but it crashed. 
> I see that there is also a destructor - when I use the skb_orphan function (which calls the destructor) like this it works:
> 
> 	skb_new->sk = skb->sk;
> 	skb_orphan(skb);
> 	kfree_skb(skb);
> 
> Is this the correct solution?

I dont think so.

destructor and sk are coupled, you cant steal one and let the other
untouched.



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Manfred Rudigier Jan. 9, 2012, 10:43 a.m. UTC | #5
From: Eric Dumazet [mailto:eric.dumazet@gmail.com]

Sent: Monday, January 09, 2012 10:59

>Le lundi 09 janvier 2012 à 10:36 +0100, Manfred Rudigier a écrit :

>

>> I have tried your suggested steal sequence, but it crashed.

>> I see that there is also a destructor - when I use the skb_orphan function

>(which calls the destructor) like this it works:

>>

>> 	skb_new->sk = skb->sk;

>> 	skb_orphan(skb);

>> 	kfree_skb(skb);

>>

>> Is this the correct solution?

>

>I dont think so.

>

>destructor and sk are coupled, you cant steal one and let the other

>untouched.


If I understand you correctly, we must also steal the destructor like this:

	skb_new->sk = skb->sk;
	skb_new->destructor = skb->destructor;
	skb->sk = NULL;
	skb->destructor = NULL;
	kfree_skb(skb);

Then the destructor will be called at the end of the skb_tstamp_tx function, which should be fine, right?
Richard Cochran Jan. 9, 2012, 12:41 p.m. UTC | #6
On Mon, Jan 09, 2012 at 10:36:44AM +0100, Manfred Rudigier wrote:
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> Sent: Monday, January 09, 2012 08:44
> 
> >Le lundi 09 janvier 2012 à 08:16 +0100, Manfred Rudigier a écrit :
> >> From: David Miller [mailto:davem@davemloft.net]
> >> Sent: Thursday, January 05, 2012 19:27
> >>
> >> >From: Manfred Rudigier <manfred.rudigier@omicron.at>
> >> >Date: Thu, 5 Jan 2012 15:50:21 +0100
> >> >
> >> >> +
> >> >> +		/* Keep sock if we must return a time stamp on the err queue */
> >> >> +		skb_new->sk = skb->sk;
> >> >
> >> >When I see something like this without any kind of reference counting
> >> >or similar, I am gravely concerned.
> >>
> >> The skb_tstamp_tx function called during gfar_clean_tx_ring requires
> >> the skb->sk pointer to be set. Otherwise no time stamp can be queued
> >> on the socket error queue.
> >> What would be the correct way for doing this?
> >
> >I really wonder how your code was possibly working...
> 
> Well, it was working :-)

It was only working because the conditional was always false:

	if (((skb->ip_summed == CHECKSUM_PARTIAL) ||
      	   		vlan_tx_tag_present(skb) ||
			unlikely(do_tstamp)) &&
			(skb_headroom(skb) < GMAC_FCB_LEN)) {
		struct sk_buff *skb_new;
		...
	}

IOW, there was always at least GMAC_FCB_LEN bytes in skb_headroom.

Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Jan. 9, 2012, 12:49 p.m. UTC | #7
On Mon, Jan 09, 2012 at 08:16:01AM +0100, Manfred Rudigier wrote:
> From: David Miller [mailto:davem@davemloft.net]
> Sent: Thursday, January 05, 2012 19:27
> 
> >From: Manfred Rudigier <manfred.rudigier@omicron.at>
> >Date: Thu, 5 Jan 2012 15:50:21 +0100
> >
> >> +
> >> +		/* Keep sock if we must return a time stamp on the err queue */
> >> +		skb_new->sk = skb->sk;
> >
> >When I see something like this without any kind of reference counting or
> >similar, I am gravely concerned.
> 
> The skb_tstamp_tx function called during gfar_clean_tx_ring requires
> the skb->sk pointer to be set. Otherwise no time stamp can be queued
> on the socket error queue.
> What would be the correct way for doing this?

Hi Manfred,

You have to make sure the new clone holds a reference on skb->sk.
A recent, similar problem was discussed:

    http://comments.gmane.org/gmane.linux.network/208143

And the solution came in:

    commit da92b194cc36b5dc1fbd85206aeeffd80bee0c39
    Author: Richard Cochran <richardcochran@gmail.com>
    Date:   Fri Oct 21 00:49:15 2011 +0000

    net: hold sock reference while processing tx timestamps

HTH,

Richard

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Richard Cochran Jan. 9, 2012, 12:50 p.m. UTC | #8
On Mon, Jan 09, 2012 at 01:41:08PM +0100, Richard Cochran wrote:
> 
> 	if (((skb->ip_summed == CHECKSUM_PARTIAL) ||
>       	   		vlan_tx_tag_present(skb) ||
> 			unlikely(do_tstamp)) &&
> 			(skb_headroom(skb) < GMAC_FCB_LEN)) {
> 		struct sk_buff *skb_new;
> 		...
> 	}
> 
> IOW, there was always at least GMAC_FCB_LEN bytes in skb_headroom.

And now, you are asking for more space.

Richard
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Manfred Rudigier Jan. 9, 2012, 2:45 p.m. UTC | #9
From: netdev-owner@vger.kernel.org [mailto:netdev-
owner@vger.kernel.org] On Behalf Of Richard Cochran
Sent: Monday, January 09, 2012 13:49

>Hi Manfred,
>
>You have to make sure the new clone holds a reference on skb->sk.
>A recent, similar problem was discussed:
>
>    http://comments.gmane.org/gmane.linux.network/208143
>
>And the solution came in:
>
>    commit da92b194cc36b5dc1fbd85206aeeffd80bee0c39
>    Author: Richard Cochran <richardcochran@gmail.com>
>    Date:   Fri Oct 21 00:49:15 2011 +0000
>
>    net: hold sock reference while processing tx timestamps

Thanks Richard, I will come up with a new version.

Regard,
Manfred

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/freescale/gianfar.c b/drivers/net/ethernet/freescale/gianfar.c
index 83199fd..3a14eab 100644
--- a/drivers/net/ethernet/freescale/gianfar.c
+++ b/drivers/net/ethernet/freescale/gianfar.c
@@ -1984,7 +1984,8 @@  static inline struct txfcb *gfar_add_fcb(struct sk_buff *skb)
 	return fcb;
 }
 
-static inline void gfar_tx_checksum(struct sk_buff *skb, struct txfcb *fcb)
+static inline void gfar_tx_checksum(struct sk_buff *skb, struct txfcb *fcb,
+		int fcb_length)
 {
 	u8 flags = 0;
 
@@ -2006,7 +2007,7 @@  static inline void gfar_tx_checksum(struct sk_buff *skb, struct txfcb *fcb)
 	 * frame (skb->data) and the start of the IP hdr.
 	 * l4os is the distance between the start of the
 	 * l3 hdr and the l4 hdr */
-	fcb->l3os = (u16)(skb_network_offset(skb) - GMAC_FCB_LEN);
+	fcb->l3os = (u16)(skb_network_offset(skb) - fcb_length);
 	fcb->l4os = skb_network_header_len(skb);
 
 	fcb->flags = flags;
@@ -2046,7 +2047,7 @@  static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	int i, rq = 0, do_tstamp = 0;
 	u32 bufaddr;
 	unsigned long flags;
-	unsigned int nr_frags, nr_txbds, length;
+	unsigned int nr_frags, nr_txbds, length, fcb_length = GMAC_FCB_LEN;
 
 	/*
 	 * TOE=1 frames larger than 2500 bytes may see excess delays
@@ -2070,22 +2071,27 @@  static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 
 	/* check if time stamp should be generated */
 	if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_HW_TSTAMP &&
-		     priv->hwts_tx_en))
+			priv->hwts_tx_en)) {
 		do_tstamp = 1;
+		fcb_length = GMAC_FCB_LEN + GMAC_TXPAL_LEN;
+	}
 
 	/* make space for additional header when fcb is needed */
 	if (((skb->ip_summed == CHECKSUM_PARTIAL) ||
 			vlan_tx_tag_present(skb) ||
 			unlikely(do_tstamp)) &&
-			(skb_headroom(skb) < GMAC_FCB_LEN)) {
+			(skb_headroom(skb) < fcb_length)) {
 		struct sk_buff *skb_new;
 
-		skb_new = skb_realloc_headroom(skb, GMAC_FCB_LEN);
+		skb_new = skb_realloc_headroom(skb, fcb_length);
 		if (!skb_new) {
 			dev->stats.tx_errors++;
 			kfree_skb(skb);
 			return NETDEV_TX_OK;
 		}
+
+		/* Keep sock if we must return a time stamp on the err queue */
+		skb_new->sk = skb->sk;
 		kfree_skb(skb);
 		skb = skb_new;
 	}
@@ -2154,6 +2160,12 @@  static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		lstatus = txbdp_start->lstatus;
 	}
 
+	/* Add TxPAL between FCB and frame if required */
+	if (unlikely(do_tstamp)) {
+		skb_push(skb, GMAC_TXPAL_LEN);
+		memset(skb->data, 0, GMAC_TXPAL_LEN);
+	}
+
 	/* Set up checksumming */
 	if (CHECKSUM_PARTIAL == skb->ip_summed) {
 		fcb = gfar_add_fcb(skb);
@@ -2164,7 +2176,7 @@  static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			skb_checksum_help(skb);
 		} else {
 			lstatus |= BD_LFLAG(TXBD_TOE);
-			gfar_tx_checksum(skb, fcb);
+			gfar_tx_checksum(skb, fcb, fcb_length);
 		}
 	}
 
@@ -2196,9 +2208,9 @@  static int gfar_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	 * the full frame length.
 	 */
 	if (unlikely(do_tstamp)) {
-		txbdp_tstamp->bufPtr = txbdp_start->bufPtr + GMAC_FCB_LEN;
+		txbdp_tstamp->bufPtr = txbdp_start->bufPtr + fcb_length;
 		txbdp_tstamp->lstatus |= BD_LFLAG(TXBD_READY) |
-				(skb_headlen(skb) - GMAC_FCB_LEN);
+				(skb_headlen(skb) - fcb_length);
 		lstatus |= BD_LFLAG(TXBD_CRC | TXBD_READY) | GMAC_FCB_LEN;
 	} else {
 		lstatus |= BD_LFLAG(TXBD_CRC | TXBD_READY) | skb_headlen(skb);
@@ -2490,7 +2502,7 @@  static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 
 		if (unlikely(skb_shinfo(skb)->tx_flags & SKBTX_IN_PROGRESS)) {
 			next = next_txbd(bdp, base, tx_ring_size);
-			buflen = next->length + GMAC_FCB_LEN;
+			buflen = next->length + GMAC_FCB_LEN + GMAC_TXPAL_LEN;
 		} else
 			buflen = bdp->length;
 
@@ -2502,6 +2514,7 @@  static int gfar_clean_tx_ring(struct gfar_priv_tx_q *tx_queue)
 			u64 *ns = (u64*) (((u32)skb->data + 0x10) & ~0x7);
 			memset(&shhwtstamps, 0, sizeof(shhwtstamps));
 			shhwtstamps.hwtstamp = ns_to_ktime(*ns);
+			skb_pull(skb, GMAC_FCB_LEN + GMAC_TXPAL_LEN);
 			skb_tstamp_tx(skb, &shhwtstamps);
 			bdp->lstatus &= BD_LFLAG(TXBD_WRAP);
 			bdp = next;
diff --git a/drivers/net/ethernet/freescale/gianfar.h b/drivers/net/ethernet/freescale/gianfar.h
index 9aa4377..9fd5999 100644
--- a/drivers/net/ethernet/freescale/gianfar.h
+++ b/drivers/net/ethernet/freescale/gianfar.h
@@ -63,6 +63,9 @@  struct ethtool_rx_list {
 /* Length for FCB */
 #define GMAC_FCB_LEN 8
 
+/* Length for TxPAL */
+#define GMAC_TXPAL_LEN 16
+
 /* Default padding amount */
 #define DEFAULT_PADDING 2