Message ID | 1540417883-8476-1-git-send-email-Tristram.Ha@microchip.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Series | [net] net: ethernet: cadence: fix socket buffer corruption problem | expand |
From: <Tristram.Ha@microchip.com> Date: Wed, 24 Oct 2018 14:51:23 -0700 > From: Tristram Ha <Tristram.Ha@microchip.com> > > Socket buffer is not re-created when headroom is 2 and tailroom is 1. > > Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> Applied.
On 10/25/18 11:32 AM, David Miller wrote: > From: <Tristram.Ha@microchip.com> > Date: Wed, 24 Oct 2018 14:51:23 -0700 > >> From: Tristram Ha <Tristram.Ha@microchip.com> >> >> Socket buffer is not re-created when headroom is 2 and tailroom is 1. >> >> Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> > > Applied. No fixes tag?
Hi Tristram, Could you, please, tell me if the above variable was false in your case? bool cloned = skb_cloned(*skb) || skb_header_cloned(*skb); If yes, then, the proper fix would be as follows: diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 8f5bf9166c11..492a8e1a34cd 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -1690,7 +1690,7 @@ static int macb_pad_and_fcs(struct sk_buff **skb, struct net_device *ndev) padlen += ETH_FCS_LEN; } - if (!cloned && headroom + tailroom >= padlen) { + if (!cloned && headroom + tailroom >= ETH_FCS_LEN) { (*skb)->data = memmove((*skb)->head, (*skb)->data, (*skb)->len); skb_set_tail_pointer(*skb, (*skb)->len); } else { Could you please check if it works in your case (and without your patch)? Thank you, Claudiu Beznea On 25.10.2018 00:51, Tristram.Ha@microchip.com wrote: > From: Tristram Ha <Tristram.Ha@microchip.com> > > Socket buffer is not re-created when headroom is 2 and tailroom is 1. > > Signed-off-by: Tristram Ha <Tristram.Ha@microchip.com> > --- > drivers/net/ethernet/cadence/macb_main.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c > index 8f5bf91..1d86b4d 100644 > --- a/drivers/net/ethernet/cadence/macb_main.c > +++ b/drivers/net/ethernet/cadence/macb_main.c > @@ -1684,7 +1684,7 @@ static int macb_pad_and_fcs(struct sk_buff **skb, struct net_device *ndev) > padlen = 0; > /* No room for FCS, need to reallocate skb. */ > else > - padlen = ETH_FCS_LEN - tailroom; > + padlen = ETH_FCS_LEN; > } else { > /* Add room for FCS. */ > padlen += ETH_FCS_LEN; >
> Could you, please, tell me if the above variable was false in your case? > > bool cloned = skb_cloned(*skb) || skb_header_cloned(*skb); > > If yes, then, the proper fix would be as follows: > > diff --git a/drivers/net/ethernet/cadence/macb_main.c > b/drivers/net/ethernet/cadence/macb_main.c > index 8f5bf9166c11..492a8e1a34cd 100644 > --- a/drivers/net/ethernet/cadence/macb_main.c > +++ b/drivers/net/ethernet/cadence/macb_main.c > @@ -1690,7 +1690,7 @@ static int macb_pad_and_fcs(struct sk_buff **skb, > struct net_device *ndev) > padlen += ETH_FCS_LEN; > } > > - if (!cloned && headroom + tailroom >= padlen) { > + if (!cloned && headroom + tailroom >= ETH_FCS_LEN) { > (*skb)->data = memmove((*skb)->head, (*skb)->data, (*skb)->len); > skb_set_tail_pointer(*skb, (*skb)->len); > } else { > > Could you please check if it works in your case (and without your patch)? > Actually doing that reveals another bug: if (padlen) { if (padlen >= ETH_FCS_LEN) skb_put_zero(*skb, padlen - ETH_FCS_LEN); else skb_trim(*skb, ETH_FCS_LEN - padlen); } My fix calls skb_put_zero with zero length. Your change calls skb_trim which actually sets the socket buffer length to 1! When this problem happens headroom is 2, tailroom is 1, skb->len is 61, and padlen is 3. DSA driver is being used. That is why the length is already padded to 60 bytes and 1-byte tail tag is added. BTW, I am not sure while this macb_pad_and_fcs function was added. Is it to workaround some hardware bugs? The code is executed only when NETIF_IF_HW_CSUM is used. But if hardware tx checksumming is enabled why not also use the hardware to calculate CRC? NETIF_F_SG is not enabled in the MAC I used, so enabling NETIF_IF_HW_CSUM is rather pointless. With the padding code the transmit throughput cannot get higher than 100Mbps in a gigabit connection. I would recommend to add this option to disable manual padding in one of those macb_config structures.
On 29.10.2018 23:40, Tristram Ha - C24268 wrote: >> Could you, please, tell me if the above variable was false in your case? >> >> bool cloned = skb_cloned(*skb) || skb_header_cloned(*skb); >> >> If yes, then, the proper fix would be as follows: >> >> diff --git a/drivers/net/ethernet/cadence/macb_main.c >> b/drivers/net/ethernet/cadence/macb_main.c >> index 8f5bf9166c11..492a8e1a34cd 100644 >> --- a/drivers/net/ethernet/cadence/macb_main.c >> +++ b/drivers/net/ethernet/cadence/macb_main.c >> @@ -1690,7 +1690,7 @@ static int macb_pad_and_fcs(struct sk_buff **skb, >> struct net_device *ndev) >> padlen += ETH_FCS_LEN; >> } >> >> - if (!cloned && headroom + tailroom >= padlen) { >> + if (!cloned && headroom + tailroom >= ETH_FCS_LEN) { >> (*skb)->data = memmove((*skb)->head, (*skb)->data, (*skb)->len); >> skb_set_tail_pointer(*skb, (*skb)->len); >> } else { >> >> Could you please check if it works in your case (and without your patch)? >> > > Actually doing that reveals another bug: > > if (padlen) { > if (padlen >= ETH_FCS_LEN) > skb_put_zero(*skb, padlen - ETH_FCS_LEN); > else > skb_trim(*skb, ETH_FCS_LEN - padlen); > } > > My fix calls skb_put_zero with zero length. Your change calls skb_trim which > actually sets the socket buffer length to 1! > > When this problem happens headroom is 2, tailroom is 1, skb->len is 61, and > padlen is 3. > Ok, I see now. Looking again, your fix is good. But, as you said, there is the skb_trim() in this function that is wrong from the beginning (my bad). I propose to remove it since, with your fix is not even reached anymore. Could you check on your side that applying this on top of your patch, your scenario is still working? diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 1d86b4d5645a..e1347d6d1b50 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -1702,12 +1702,8 @@ static int macb_pad_and_fcs(struct sk_buff **skb, struct net_device *ndev) *skb = nskb; } - if (padlen) { - if (padlen >= ETH_FCS_LEN) - skb_put_zero(*skb, padlen - ETH_FCS_LEN); - else - skb_trim(*skb, ETH_FCS_LEN - padlen); - } + if (padlen > ETH_FCS_LEN) + skb_put_zero(*skb, padlen - ETH_FCS_LEN); > DSA driver is being used. That is why the length is already padded to 60 bytes > and 1-byte tail tag is added. Ok, I see, I didn't test with such configurations. > > BTW, I am not sure while this macb_pad_and_fcs function was added. Is it to > workaround some hardware bugs? The code is executed only when > NETIF_IF_HW_CSUM is used. But if hardware tx checksumming is enabled why > not also use the hardware to calculate CRC? It was reported in [1] that UDP checksum is offloaded to hardware no matter the application previously computed it. The code should be executed only for packets that has checksum computed by applications ((*skb)->ip_summed != CHECKSUM_PARTIAL). The idea was to not recompute checksum for packets with checksum already computed. To do so, while hardware checksum is enabled (NETIF_F_HW_CSUM), TX_NOCRC bit should be set on buffer descriptor. But to do so, packets must have a minimum size of 64 and FCS to be computed. The NETIF_F_HW_CSUM check was placed there because the issue described in [1] is reproducible because hardware checksum is enabled and overrides the checksum provided by applications. [1] https://www.spinics.net/lists/netdev/msg505065.html > > NETIF_F_SG is not enabled in the MAC I used, so enabling NETIF_IF_HW_CSUM > is rather pointless. With the padding code the transmit throughput cannot get > higher than 100Mbps in a gigabit connection. > > I would recommend to add this option to disable manual padding in one of those > macb_config structures. In this way the user would have to know from the beginning what kind of packets are used. Thank you, Claudiu Beznea >
> Could you check on your side that applying this on top of your patch, your > scenario is still working? > > diff --git a/drivers/net/ethernet/cadence/macb_main.c > b/drivers/net/ethernet/cadence/macb_main.c > index 1d86b4d5645a..e1347d6d1b50 100644 > --- a/drivers/net/ethernet/cadence/macb_main.c > +++ b/drivers/net/ethernet/cadence/macb_main.c > @@ -1702,12 +1702,8 @@ static int macb_pad_and_fcs(struct sk_buff **skb, > struct net_device *ndev) > *skb = nskb; > } > > - if (padlen) { > - if (padlen >= ETH_FCS_LEN) > - skb_put_zero(*skb, padlen - ETH_FCS_LEN); > - else > - skb_trim(*skb, ETH_FCS_LEN - padlen); > - } > + if (padlen > ETH_FCS_LEN) > + skb_put_zero(*skb, padlen - ETH_FCS_LEN); I think it is okay but I need to check all paths are covered. > It was reported in [1] that UDP checksum is offloaded to hardware no matter > the application previously computed it. > > The code should be executed only for packets that has checksum computed > by > applications ((*skb)->ip_summed != CHECKSUM_PARTIAL). The idea was to > not > recompute checksum for packets with checksum already computed. To do > so, > while hardware checksum is enabled (NETIF_F_HW_CSUM), TX_NOCRC bit > should > be set on buffer descriptor. But to do so, packets must have a minimum size > of 64 and FCS to be computed. > > The NETIF_F_HW_CSUM check was placed there because the issue > described in > [1] is reproducible because hardware checksum is enabled and overrides the > checksum provided by applications. > > [1] https://www.spinics.net/lists/netdev/msg505065.html I understand the issue now. It is weird that the transmit descriptor does not have direct control over turning on checksum generation or not, but it wastes 3 bits returning the error codes of such generation. What can the driver do with such information? In my opinion then hardware transmit checksumming cannot be supported In Linux. > > NETIF_F_SG is not enabled in the MAC I used, so enabling > NETIF_IF_HW_CSUM > > is rather pointless. With the padding code the transmit throughput cannot > get > > higher than 100Mbps in a gigabit connection. > > > > I would recommend to add this option to disable manual padding in one of > those > > macb_config structures. > > In this way the user would have to know from the beginning what kind of > packets are used. > The kernel already does a good job of calculating checksum. Using hardware to do that does not improve performance much. Alternative is to use ethtool to disable hardware tx checksum so that software can intentionally send wrong checksums.
On 30.10.2018 21:36, Tristram Ha - C24268 wrote: >> Could you check on your side that applying this on top of your patch, your >> scenario is still working? >> >> diff --git a/drivers/net/ethernet/cadence/macb_main.c >> b/drivers/net/ethernet/cadence/macb_main.c >> index 1d86b4d5645a..e1347d6d1b50 100644 >> --- a/drivers/net/ethernet/cadence/macb_main.c >> +++ b/drivers/net/ethernet/cadence/macb_main.c >> @@ -1702,12 +1702,8 @@ static int macb_pad_and_fcs(struct sk_buff **skb, >> struct net_device *ndev) >> *skb = nskb; >> } >> >> - if (padlen) { >> - if (padlen >= ETH_FCS_LEN) >> - skb_put_zero(*skb, padlen - ETH_FCS_LEN); >> - else >> - skb_trim(*skb, ETH_FCS_LEN - padlen); >> - } >> + if (padlen > ETH_FCS_LEN) >> + skb_put_zero(*skb, padlen - ETH_FCS_LEN); > > I think it is okay but I need to check all paths are covered. On my side I checked with pktgen generating packets with sizes starting from 1-MTU. Same scripts I also used when I first implemented this but it seems that your scenario wasn't covered. Please have a look and let me know how it works on your side. > >> It was reported in [1] that UDP checksum is offloaded to hardware no matter >> the application previously computed it. >> >> The code should be executed only for packets that has checksum computed >> by >> applications ((*skb)->ip_summed != CHECKSUM_PARTIAL). The idea was to >> not >> recompute checksum for packets with checksum already computed. To do >> so, >> while hardware checksum is enabled (NETIF_F_HW_CSUM), TX_NOCRC bit >> should >> be set on buffer descriptor. But to do so, packets must have a minimum size >> of 64 and FCS to be computed. >> >> The NETIF_F_HW_CSUM check was placed there because the issue >> described in >> [1] is reproducible because hardware checksum is enabled and overrides the >> checksum provided by applications. >> >> [1] https://www.spinics.net/lists/netdev/msg505065.html > > I understand the issue now. It is weird that the transmit descriptor does not > have direct control over turning on checksum generation or not, but it wastes > 3 bits returning the error codes of such generation. What can the driver do > with such information? Yep, from my POV it would have been nice if it could have been able to do the pad an FCS even when hardware checksum is enabled and TX_NOCRC bit is set in descriptor. For cases when hardware checksum is disable it seems that it could handle it. > > In my opinion then hardware transmit checksumming cannot be supported > In Linux. > >>> NETIF_F_SG is not enabled in the MAC I used, so enabling >> NETIF_IF_HW_CSUM >>> is rather pointless. With the padding code the transmit throughput cannot >> get >>> higher than 100Mbps in a gigabit connection. >>> >>> I would recommend to add this option to disable manual padding in one of >> those >>> macb_config structures. >> >> In this way the user would have to know from the beginning what kind of >> packets are used. >> > > The kernel already does a good job of calculating checksum. Using hardware to > do that does not improve performance much. > > Alternative is to use ethtool to disable hardware tx checksum so that software can > intentionally send wrong checksums. >
diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index 8f5bf91..1d86b4d 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -1684,7 +1684,7 @@ static int macb_pad_and_fcs(struct sk_buff **skb, struct net_device *ndev) padlen = 0; /* No room for FCS, need to reallocate skb. */ else - padlen = ETH_FCS_LEN - tailroom; + padlen = ETH_FCS_LEN; } else { /* Add room for FCS. */ padlen += ETH_FCS_LEN;