Message ID | 1543880097-7106-6-git-send-email-Tristram.Ha@microchip.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Series | net: dsa: microchip: Modify KSZ9477 DSA driver to support different tail tag formats |
On Mon, Dec 03, 2018 at 03:34:56PM -0800, Tristram.Ha@microchip.com wrote:
> From: Tristram Ha <Tristram.Ha@microchip.com>
>
> Update tag_ksz.c to access switch driver's tail tagging operations.

Hi Tristram

Humm, i'm not sure we want this, the tagging split into two places. I
need to take a closer look at the previous patch, to see why it cannot
be done here.

Andrew
On Wed, Dec 05, 2018 at 07:00:38PM +0100, Andrew Lunn wrote:
> On Mon, Dec 03, 2018 at 03:34:56PM -0800, Tristram.Ha@microchip.com wrote:
> > From: Tristram Ha <Tristram.Ha@microchip.com>
> >
> > Update tag_ksz.c to access switch driver's tail tagging operations.
>
> Hi Tristram
>
> Humm, i'm not sure we want this, the tagging split into two places. I
> need to take a closer look at the previous patch, to see why it cannot
> be done here.

O.K, i think i get what is going on.

I would however implement it differently.

One net/dsa/tag_X.c file can export two dsa_device_ops structures,
allowing you to share common code for the two taggers. You could call
these DSA_TAG_PROTO_KSZ_1_BYTE, and DSA_TAG_PROTO_KSZ_2_BYTE, and the
.get_tag_protocol call would then return the correct one for the
switch.

It might also be possible to merge in tag_trailer, or at least share
some code.

What i don't yet understand is how you are passing PTP information
around. The commit messages need to explain that, since it is not
obvious, and it is the first time we have needed PTP info in a tag
driver.

Andrew
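The suggestion above (one tag file exporting two ops structures that share a common helper) can be sketched roughly as follows. This is a user-space model, not kernel code: `struct tagger_ops`, `ksz_common_xmit()`, `ksz_get_tagger()` and the `enum ksz_tag_proto` values are hypothetical stand-ins for `struct dsa_device_ops`, the shared code in net/dsa/tag_ksz.c, `.get_tag_protocol` and the proposed DSA_TAG_PROTO_KSZ_* constants. The 2-byte layout follows the comment removed by the patch (tag0 unused, tag1 a port bitmap); the 1-byte layout is only an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical user-space stand-in for struct dsa_device_ops. */
struct tagger_ops {
	const char *name;
	size_t overhead;	/* tail tag bytes added before the FCS */
	size_t (*xmit)(uint8_t *frame, size_t len, size_t room,
		       unsigned int port);
};

/*
 * Shared helper: append a zeroed tail tag of tag_len bytes whose last
 * byte is a one-hot destination port bitmap. Returns the new frame
 * length, or 0 if the buffer has no room for the tag.
 */
static size_t ksz_common_xmit(uint8_t *frame, size_t len, size_t room,
			      unsigned int port, size_t tag_len)
{
	assert(tag_len >= 1 && port < 8);
	if (len + tag_len > room)
		return 0;
	memset(frame + len, 0, tag_len);
	frame[len + tag_len - 1] = (uint8_t)(1u << port);
	return len + tag_len;
}

static size_t ksz_1_byte_xmit(uint8_t *frame, size_t len, size_t room,
			      unsigned int port)
{
	return ksz_common_xmit(frame, len, room, port, 1);
}

static size_t ksz_2_byte_xmit(uint8_t *frame, size_t len, size_t room,
			      unsigned int port)
{
	return ksz_common_xmit(frame, len, room, port, 2);
}

static const struct tagger_ops ksz_1_byte_ops = {
	.name = "ksz-1-byte", .overhead = 1, .xmit = ksz_1_byte_xmit,
};

static const struct tagger_ops ksz_2_byte_ops = {
	.name = "ksz-2-byte", .overhead = 2, .xmit = ksz_2_byte_xmit,
};

/* Hypothetical protocol identifiers, one per tag format. */
enum ksz_tag_proto { KSZ_TAG_1_BYTE, KSZ_TAG_2_BYTE };

/* Models .get_tag_protocol: the switch reports which format it uses. */
static const struct tagger_ops *ksz_get_tagger(enum ksz_tag_proto proto)
{
	return proto == KSZ_TAG_2_BYTE ? &ksz_2_byte_ops : &ksz_1_byte_ops;
}
```

The point of the split is that the format is chosen once, when the switch is probed, so the per-packet path calls straight into the right xmit function without asking the switch driver anything.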
On 12/5/18 10:18 AM, Andrew Lunn wrote:
> On Wed, Dec 05, 2018 at 07:00:38PM +0100, Andrew Lunn wrote:
>> On Mon, Dec 03, 2018 at 03:34:56PM -0800, Tristram.Ha@microchip.com wrote:
>>> From: Tristram Ha <Tristram.Ha@microchip.com>
>>>
>>> Update tag_ksz.c to access switch driver's tail tagging operations.
>>
>> Hi Tristram
>>
>> Humm, i'm not sure we want this, the tagging split into two places. I
>> need to take a closer look at the previous patch, to see why it cannot
>> be done here.
>
> O.K, i think i get what is going on.
>
> I would however implement it differently.
>
> One net/dsa/tag_X.c file can export two dsa_device_ops structures,
> allowing you to share common code for the two taggers. You could call
> these DSA_TAG_PROTO_KSZ_1_BYTE, and DSA_TAG_PROTO_KSZ_2_BYTE, and the
> .get_tag_protocol call would then return the correct one for the
> switch.

Agreed, that is what is done by net/dsa/tag_brcm.c because there are two
formats for the Broadcom tag:

- TAG_BRCM: the 4-byte Broadcom tag is between MAC SA and Ethertype
- TAG_BRCM_PREPEND: the 4-byte Broadcom tag is before the MAC DA

And the code to process them is basically using relative offsets from
the start of the frame to access correct data.

This is done largely for performance reasons because we have 1/2
Gigabit/sec capable CPU ports and so we want to avoid as much cache
thrashing as possible and immediately get the right rcv() function to
process the packets.

>
> It might also be possible to merge in tag_trailer, or at least share
> some code.
>
> What i don't yet understand is how you are passing PTP information
> around. The commit messages need to explain that, since it is not
> obvious, and it is the first time we have needed PTP info in a tag
> driver.
>
> Andrew
>
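The "relative offsets" idea described for the two Broadcom formats can be illustrated with a small user-space sketch. It is not the code from net/dsa/tag_brcm.c: `tag_insert_at()`, `tag_xmit()` and `tag_prepend_xmit()` are hypothetical names, and the frame is a plain byte buffer rather than an sk_buff. The only facts taken from the thread are that the tag is 4 bytes long and sits either between MAC SA and Ethertype (offset 12) or before MAC DA (offset 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN	6	/* length of one MAC address */
#define TAG_LEN		4	/* 4-byte tag, as described in the thread */

/*
 * Shared code: open a TAG_LEN gap at 'offset' bytes from the start of
 * the frame and copy the tag into it. The caller guarantees the buffer
 * holds at least len + TAG_LEN bytes. Returns the new frame length.
 */
static size_t tag_insert_at(uint8_t *frame, size_t len, size_t offset,
			    const uint8_t tag[TAG_LEN])
{
	assert(offset <= len);
	memmove(frame + offset + TAG_LEN, frame + offset, len - offset);
	memcpy(frame + offset, tag, TAG_LEN);
	return len + TAG_LEN;
}

/* Format 1: tag between MAC SA and Ethertype, i.e. after DA + SA. */
static size_t tag_xmit(uint8_t *frame, size_t len,
		       const uint8_t tag[TAG_LEN])
{
	return tag_insert_at(frame, len, 2 * ETH_ALEN, tag);
}

/* Format 2: tag before MAC DA, i.e. at the very start of the frame. */
static size_t tag_prepend_xmit(uint8_t *frame, size_t len,
			       const uint8_t tag[TAG_LEN])
{
	return tag_insert_at(frame, len, 0, tag);
}
```

Each format gets its own thin entry point with the offset fixed at compile time, so no per-packet decision about the format is needed.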
> >>> Update tag_ksz.c to access switch driver's tail tagging operations.
> >>
> >> Hi Tristram
> >>
> >> Humm, i'm not sure we want this, the tagging split into two places. I
> >> need to take a closer look at the previous patch, to see why it cannot
> >> be done here.
> >
> > O.K, i think i get what is going on.
> >
> > I would however implement it differently.
> >
> > One net/dsa/tag_X.c file can export two dsa_device_ops structures,
> > allowing you to share common code for the two taggers. You could call
> > these DSA_TAG_PROTO_KSZ_1_BYTE, and DSA_TAG_PROTO_KSZ_2_BYTE, and the
> > .get_tag_protocol call would then return the correct one for the
> > switch.
>
> Agreed, that is what is done by net/dsa/tag_brcm.c because there are two
> formats for the Broadcom tag:
>
> - TAG_BRCM: the 4-bytes Broadcom tag is between MAC SA and Ethertype
> - TAG_BRCM_PREPEND: the 4-bytes Broadcom tag is before the MAC DA
>

I did try to implement it this way, but the other switches do not have the
same format even though the length is the same. I would then need to change
the following files for any new KSZ switch: include/linux/dsa.h,
net/dsa/dsa.c, net/dsa/dsa_priv.h, and finally net/dsa/tag_ksz.c.

Even then it will not work if Microchip wants to add 1588 PTP capability to
the switches.

For KSZ9477 the length of the tail tag changes when the PTP function is
enabled. Typically this function is either enabled or disabled all the time,
but if users want to change that during normal operation to see how the
switch behaves, the transmit function completely stops working correctly.

The older driver implementation monitors that register change and adjusts
the length dynamically.

Another problem is that the tail tag needs to include the timestamp for the
1-step Pdelay_Resp to have an accurate turnaround time when that message is
sent out by the switch. This will require access to the main switch driver,
which will keep track of those PTP messages.
PTP handles the transmit timestamp in skb_tx_timestamp, which is typically
called after the frame is sent, so it is too late. DSA calls
dsa_skb_tx_timestamp before sending, but it only provides a clone to the
driver that supports port_txtstamp and so the switch driver may not be able
to do anything.

> And the code to process them is basically using relative offsets from
> the start of the frame to access correct data.
>
> This is done largely for performance reasons because we have 1/2
> Gigabit/secs capable CPU ports and so we want to avoid as much cache
> thrashing as possible and immediately get the right rcv() function to
> process the packets.
>

The SoC I used for this driver development actually has a problem sending
Gigabit traffic, so I do not see the effect of any slowdown, and the updated
MAC driver change for a hardware problem does not help and greatly degrades
the transmit performance.

> >
> > It might also be possible to merge in tag_trailer, or at least share
> > some code.
> >

Actually, in the previous, older DSA implementation I just hijacked this file
to add the tail tag operations without creating a new file like tag_ksz.c.

> > What i don't yet understand is how you are passing PTP information
> > around. The commit messages need to explain that, since it is not
> > obvious, and it is the first time we have needed PTP info in a tag
> > driver.

It seems the official 1588 PTP timestamp API for a PHY driver is implemented
in only one PHY driver, net/phy/dp83640.c, in the whole kernel. DSA uses a
similar mechanism to support 1588 PTP.

In dsa_switch_rcv() the CPU receive function is called first, before
dsa_skb_defer_rx_timestamp(). That means the receive tail tag operation has
to be done first to retrieve the receive timestamp so that it can be passed
on later.

It is probably not good to change the socket buffer length inside the
port_rxtstamp function, and I do not see any other way to insert that
transmit timestamp.
A customer has already inquired about implementing 1588 PTP in the DSA driver. I hope this mechanism is approved so that I can start doing that.
> I did try to implement this way. But the other switches do not have the same
> format even though the length is the same. Then I need to change the following
> files for any new KSZ switch: include/linux/dsa.h, net/dsa/dsa.c, net/dsa/dsa_priv.h,
> and finally net/dsa/tag_ksz.c.

You can always add two different tag drivers. They don't have to share
code if it does not make sense.

> Even then it will not work if Microchip wants to add 1588 PTP
> capability to the switches.
>
> For KSZ9477 the length of the tail tag changes when the PTP function
> is enabled. Typically this function is either enabled or disabled
> all the time, but if users want to change that during normal
> operation to see how the switch behaves, the transmit function
> completely stops working correctly.

We should figure out how to support PTP. I think that is the main
issue here.

> Older driver implementation is to monitor that register change and adjust the length
> dynamically.
>
> Another problem is the tail tag needs to include the timestamp for the 1-step
> Pdelay_Resp to have accurate turnaround time when that message is sent out by the
> switch. This will require access to the main switch driver which will keep track of those
> PTP messages.
>
> PTP handles transmit timestamp in skb_tx_timestamp, which is typically called after the
> frame is sent, so it is too late. DSA calls dsa_skb_tx_timestamp before sending, but it
> only provides a clone to the driver that supports port_txtstamp and so the switch driver
> may not be able to do anything.

The current design assumes the hardware will insert the PTP timestamp
into the frame using the clock inside the hardware. You then ask it
what timestamp it actually used.

If i understand you correctly, in your case, software has to provide
the timestamp which then gets inserted into the frame. So you want to
provide this timestamp as late as possible, when the frame reaches the
head of the queue and is about to be sent out the master interface?
> In dsa_switch_rcv() the CPU receive function is called first before
> dsa_skb_defer_rx_timestamp(). That means the receive tail tag
> operation has to be done first to retrieve the receive timestamp so
> that it can be passed later.

What i think you can do is in your tag rx function you can directly
add the timestamp info to the skbuf. The dsa driver function
.port_txtstamp can then always return false.

Your tag function is going to need access to some driver state, but
you should be able to get at that, following pointers, and placing
some of the structures in global headers.

Andrew
On Thu, Dec 06, 2018 at 08:00:26PM +0000, Tristram.Ha@microchip.com wrote:
> A customer has already inquired about implementing 1588 PTP in the DSA driver. I hope
> this mechanism is approved so that I can start doing that.

If you need changes to the PTP core, you had better discuss this with
the PTP maintainer.

Thanks,
Richard
On Thu 2018-12-06 20:00:26, Tristram.Ha@microchip.com wrote:
> > >>> Update tag_ksz.c to access switch driver's tail tagging operations.
> > >>
> > >> Hi Tristram
> > >>
> > >> Humm, i'm not sure we want this, the tagging split into two places. I
> > >> need to take a closer look at the previous patch, to see why it cannot
> > >> be done here.
> > >
> > > O.K, i think i get what is going on.
> > >
> > > I would however implement it differently.
> > >
> > > One net/dsa/tag_X.c file can export two dsa_device_ops structures,
> > > allowing you to share common code for the two taggers. You could call
> > > these DSA_TAG_PROTO_KSZ_1_BYTE, and DSA_TAG_PROTO_KSZ_2_BYTE, and the
> > > .get_tag_protocol call would then return the correct one for the
> > > switch.
> >
> > Agreed, that is what is done by net/dsa/tag_brcm.c because there are two
> > formats for the Broadcom tag:
> >
> > - TAG_BRCM: the 4-bytes Broadcom tag is between MAC SA and Ethertype
> > - TAG_BRCM_PREPEND: the 4-bytes Broadcom tag is before the MAC DA
> >
>
> I did try to implement this way. But the other switches do not have the same
> format even though the length is the same. Then I need to change the following
> files for any new KSZ switch: include/linux/dsa.h, net/dsa/dsa.c, net/dsa/dsa_priv.h,
> and finally net/dsa/tag_ksz.c.
>
> Even then it will not work if Microchip wants to add 1588 PTP capability to the switches.
>
> For KSZ9477 the length of the tail tag changes when the PTP function is enabled.
> Typically this function is either enabled or disabled all the time, but if users want to
> change that during normal operation to see how the switch behaves, the transmit
> function completely stops working correctly.

I'd be careful about locking. Seems like dsa was designed with "tag
format is static", and you want to change it dynamically...

Pavel
> I'd be careful about locking. Seems like dsa was designed with "tag
> format is static", and you want to change it dynamically...

I see there is now a new overhead parameter in the dsa_device_ops structure
and dev_set_mtu is called in master.c. It does not prevent the tag size from
changing dynamically though. A bigger size can be used instead to make sure
the MAC controller can support it.

In practice I do not think it does anything meaningful. Most MAC controllers
can transmit and receive more than 1518 bytes but still only advertise a 1500
MTU. It is only when they support jumbo frames that the drivers allow
increasing the MTU. In the case of the Atmel MAC controller I only see that
the MTU size is 1502, but nothing is changed inside the driver.

I did find another bug in the Atmel MAC driver concerning this max_mtu
implementation. It does not affect the DSA operation as the child devices
still have the cap of 1500 MTU, but the main device will have a problem
running by itself when the MTU is increased.
On Tue, Dec 11, 2018 at 11:59:34PM +0000, Tristram.Ha@microchip.com wrote:
> > I'd be careful about locking. Seems like dsa was designed with "tag
> > format is static", and you want to change it dynamically...
>
> I see there is now a new overhead parameter in the dsa_device_ops structure
> and dev_set_mtu is called in master.c. It does not prevent the tag size to
> change dynamically though. A bigger size can be used instead to make sure the
> MAC controller can support it.
>
> In practice I do not think it does anything meaningful. Most MAC controllers
> can transmit and receive more than 1518 bytes but still only advertise 1500
> MTU.

Hi Tristram

There are a few MAC devices which do enforce 1518. e1000e is one
example. You have to increase the MTU before it will receive DSA
tagged frames. I initially had similar problems with the FEC driver
when i started using that a few years ago. At that time i did not
realise it was a wide scale problem and just changed the FEC. This
should be a more generic solution.

Andrew
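The arithmetic behind the 1500 / 1502 / 1518 figures in this exchange can be written out as a short sketch. The helper names (`eth_frame_len()`, `dsa_master_mtu()`, `mac_accepts_tagged()`) are hypothetical; the constants are the standard Ethernet header, FCS and payload sizes, and the 2-byte overhead is the KSZ ingress tag length from the patch below.

```c
#include <assert.h>
#include <limits.h>

#define ETH_HLEN	14	/* DA + SA + Ethertype */
#define ETH_FCS_LEN	4	/* frame check sequence */
#define ETH_DATA_LEN	1500	/* default MTU */

/* On-wire frame length (including FCS) for a given MTU. */
static unsigned int eth_frame_len(unsigned int mtu)
{
	assert(mtu <= UINT_MAX - ETH_HLEN - ETH_FCS_LEN);
	return mtu + ETH_HLEN + ETH_FCS_LEN;
}

/* MTU the master needs so a full-size slave frame plus tag still fits. */
static unsigned int dsa_master_mtu(unsigned int slave_mtu,
				   unsigned int tag_overhead)
{
	return slave_mtu + tag_overhead;
}

/*
 * Would a MAC that strictly enforces mac_max_frame bytes accept a
 * full-size tagged frame? Returns 1 if so, 0 if the frame is too long.
 */
static int mac_accepts_tagged(unsigned int mac_max_frame,
			      unsigned int slave_mtu,
			      unsigned int tag_overhead)
{
	unsigned int need = eth_frame_len(dsa_master_mtu(slave_mtu,
							 tag_overhead));

	return need <= mac_max_frame;
}
```

With a 2-byte tag the master MTU comes out as 1502, matching the value reported for the Atmel controller, and a MAC that enforces 1518 would reject the resulting 1520-byte frame unless its MTU is raised first.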
diff --git a/net/dsa/tag_ksz.c b/net/dsa/tag_ksz.c
index 0f62eff..307e58b 100644
--- a/net/dsa/tag_ksz.c
+++ b/net/dsa/tag_ksz.c
@@ -11,37 +11,25 @@
 #include <linux/etherdevice.h>
 #include <linux/list.h>
 #include <linux/slab.h>
+#include <linux/dsa/ksz_dsa.h>
 #include <net/dsa.h>
 #include "dsa_priv.h"
 
-/* For Ingress (Host -> KSZ), 2 bytes are added before FCS.
- * ---------------------------------------------------------------------------
- * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|tag1(1byte)|FCS(4bytes)
- * ---------------------------------------------------------------------------
- * tag0 : Prioritization (not used now)
- * tag1 : each bit represents port (eg, 0x01=port1, 0x02=port2, 0x10=port5)
- *
- * For Egress (KSZ -> Host), 1 byte is added before FCS.
- * ---------------------------------------------------------------------------
- * DA(6bytes)|SA(6bytes)|....|Data(nbytes)|tag0(1byte)|FCS(4bytes)
- * ---------------------------------------------------------------------------
- * tag0 : zero-based value represents port
- *	  (eg, 0x00=port1, 0x02=port3, 0x06=port7)
- */
-
-#define	KSZ_INGRESS_TAG_LEN	2
 #define	KSZ_EGRESS_TAG_LEN	1
 
 static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 {
 	struct dsa_port *dp = dsa_slave_to_port(dev);
+	struct ksz_device *sw = dp->ds->priv;
 	struct sk_buff *nskb;
+	int len;
 	int padlen;
-	u8 *tag;
+
+	len = sw->tag_ops->get_len(sw);
 
 	padlen = (skb->len >= ETH_ZLEN) ? 0 : ETH_ZLEN - skb->len;
 
-	if (skb_tailroom(skb) >= padlen + KSZ_INGRESS_TAG_LEN) {
+	if (skb_tailroom(skb) >= padlen + len) {
 		/* Let dsa_slave_xmit() free skb */
 		if (__skb_put_padto(skb, skb->len + padlen, false))
 			return NULL;
@@ -49,7 +37,7 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 		nskb = skb;
 	} else {
 		nskb = alloc_skb(NET_IP_ALIGN + skb->len +
-				 padlen + KSZ_INGRESS_TAG_LEN, GFP_ATOMIC);
+				 padlen + len, GFP_ATOMIC);
 		if (!nskb)
 			return NULL;
 		skb_reserve(nskb, NET_IP_ALIGN);
@@ -70,9 +58,8 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 		consume_skb(skb);
 	}
 
-	tag = skb_put(nskb, KSZ_INGRESS_TAG_LEN);
-	tag[0] = 0;
-	tag[1] = 1 << dp->index; /* destination port */
+	sw->tag_ops->set_tag(sw, skb_put(nskb, len), skb_mac_header(nskb),
+			     dp->index);
 
 	return nskb;
 }
@@ -80,18 +67,27 @@ static struct sk_buff *ksz_xmit(struct sk_buff *skb, struct net_device *dev)
 static struct sk_buff *ksz_rcv(struct sk_buff *skb, struct net_device *dev,
 			       struct packet_type *pt)
 {
+	struct dsa_port *cpu_dp = dev->dsa_ptr;
+	struct dsa_switch_tree *dst = cpu_dp->dst;
+	struct dsa_switch *ds = dst->ds[0];
+	struct ksz_device *sw;
 	u8 *tag;
+	int len;
 	int source_port;
 
+	if (!ds)
+		return NULL;
+	sw = ds->priv;
+
 	tag = skb_tail_pointer(skb) - KSZ_EGRESS_TAG_LEN;
 
-	source_port = tag[0] & 7;
+	len = sw->tag_ops->get_tag(sw, tag, &source_port);
 
 	skb->dev = dsa_master_find_slave(dev, 0, source_port);
 	if (!skb->dev)
 		return NULL;
 
-	pskb_trim_rcsum(skb, skb->len - KSZ_EGRESS_TAG_LEN);
+	pskb_trim_rcsum(skb, skb->len - len);
 
 	return skb;
 }
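The patch calls `sw->tag_ops->get_len()`, `set_tag()` and `get_tag()`, but the header that declares them (linux/dsa/ksz_dsa.h) is not part of this message. The following user-space sketch shows one way such an ops table could behave for a switch whose tail tag grows when PTP is enabled. Everything in it is an assumption made for illustration: the structure layouts, the `ksz9477_*` function bodies, the simplified `set_tag()` signature (the patch also passes the MAC header pointer), and in particular the 4-byte timestamp size, which the thread does not state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ksz_sw;	/* hypothetical stand-in for struct ksz_device */

/* Hypothetical shape of the tail tag operations used by the patch. */
struct ksz_tag_ops {
	int (*get_len)(const struct ksz_sw *sw);
	void (*set_tag)(const struct ksz_sw *sw, uint8_t *tag,
			unsigned int port);
	int (*get_tag)(const struct ksz_sw *sw, const uint8_t *last,
		       int *port);
};

struct ksz_sw {
	bool ptp_enabled;
	const struct ksz_tag_ops *tag_ops;
};

#define KSZ_PTP_TS_LEN	4	/* assumed timestamp size, not from the thread */

/* Host -> switch tag: 2 bytes, plus a timestamp field when PTP is on. */
static int ksz9477_get_len(const struct ksz_sw *sw)
{
	return 2 + (sw->ptp_enabled ? KSZ_PTP_TS_LEN : 0);
}

/* Zero the whole tag, then put the one-hot port bitmap in the last byte. */
static void ksz9477_set_tag(const struct ksz_sw *sw, uint8_t *tag,
			    unsigned int port)
{
	int len = ksz9477_get_len(sw);
	int i;

	assert(port < 8);
	for (i = 0; i < len; i++)
		tag[i] = 0;
	tag[len - 1] = (uint8_t)(1u << port);
}

/*
 * Switch -> host tag: 'last' points at the final tag byte, whose low
 * three bits hold the source port. Returns the number of bytes to trim.
 */
static int ksz9477_get_tag(const struct ksz_sw *sw, const uint8_t *last,
			   int *port)
{
	*port = last[0] & 7;
	return 1 + (sw->ptp_enabled ? KSZ_PTP_TS_LEN : 0);
}

static const struct ksz_tag_ops ksz9477_tag_ops = {
	.get_len = ksz9477_get_len,
	.set_tag = ksz9477_set_tag,
	.get_tag = ksz9477_get_tag,
};
```

Because the length is read from switch state on every packet, toggling `ptp_enabled` while frames are in flight is exactly the situation the locking concern raised earlier in the thread applies to; this sketch does not attempt to address it.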