Message ID | 20220422180305.301882-1-maxime.chevallier@bootlin.com |
---|---|
Headers | show |
Series | net: ipqess: introduce Qualcomm IPQESS driver | expand |
On 4/22/22 11:03, Maxime Chevallier wrote: > This tagging protocol is designed for the situation where the link > between the MAC and the Switch is designed such that the Destination > Port, which is usually embedded in some part of the Ethernet Header, is > sent out-of-band, and isn't present at all in the Ethernet frame. > > This can happen when the MAC and Switch are tightly integrated on an > SoC, as is the case with the Qualcomm IPQ4019 for example, where the DSA > tag is inserted directly into the DMA descriptors. In that case, > the MAC driver is responsible for sending the tag to the switch using > the out-of-band medium. To do so, the MAC driver needs to have the > information of the destination port for that skb. > > This tagging protocol relies on a new set of fields in skb->shinfo to > transmit the dsa tagging information to and from the MAC driver. > > Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> First off, I am not a big fan of expanding skb::shared_info because it is sensitive to cache line sizes and is critical for performance at much higher speeds, I would expect Eric and Jakub to not be terribly happy about it. The Broadcom systemport (bcmsysport.c) has a mode where it can extract the Broadcom tag and put it in front of the actual packet contents which appears to be very similar here. From there on, you can have two strategies: - have the Ethernet controller mangle the packet contents such that the QCA tag is located in front of the actual Ethernet frame and create a new tagging protocol variant for QCA, similar to the TAG_BRCM versus TAG_BRCM_PREPEND - provide the necessary information for the tagger to work using an out of band mechanism, which is what you have done, in which case, maybe you can use skb->cb[] instead of using skb::shared_info?
> +static int ipqess_axi_probe(struct platform_device *pdev) > +{ > + struct device_node *np = pdev->dev.of_node; > + struct net_device *netdev; > + phy_interface_t phy_mode; > + struct resource *res; > + struct ipqess *ess; > + int i, err = 0; > + > + netdev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(struct ipqess), > + IPQESS_NETDEV_QUEUES, > + IPQESS_NETDEV_QUEUES); > + if (!netdev) > + return -ENOMEM; > + > + ess = netdev_priv(netdev); > + ess->netdev = netdev; > + ess->pdev = pdev; > + spin_lock_init(&ess->stats_lock); > + SET_NETDEV_DEV(netdev, &pdev->dev); > + platform_set_drvdata(pdev, netdev); .... > + > + ipqess_set_ethtool_ops(netdev); > + > + err = register_netdev(netdev); > + if (err) > + goto err_out; Before register_netdev() even returns, your devices can be in use, the open callback called and packets sent. This is particularly true for NFS root. Which means any setup done after this is probably wrong. > + > + err = ipqess_hw_init(ess); > + if (err) > + goto err_out; > + > + for (i = 0; i < IPQESS_NETDEV_QUEUES; i++) { > + int qid; > + > + netif_tx_napi_add(netdev, &ess->tx_ring[i].napi_tx, > + ipqess_tx_napi, 64); > + netif_napi_add(netdev, > + &ess->rx_ring[i].napi_rx, > + ipqess_rx_napi, 64); > + > + qid = ess->tx_ring[i].idx; > + err = devm_request_irq(&ess->netdev->dev, ess->tx_irq[qid], > + ipqess_interrupt_tx, 0, > + ess->tx_irq_names[qid], > + &ess->tx_ring[i]); > + if (err) > + goto err_out; > + > + qid = ess->rx_ring[i].idx; > + err = devm_request_irq(&ess->netdev->dev, ess->rx_irq[qid], > + ipqess_interrupt_rx, 0, > + ess->rx_irq_names[qid], > + &ess->rx_ring[i]); > + if (err) > + goto err_out; > + } All this should probably go before netdev_register(). > +static int ipqess_get_strset_count(struct net_device *netdev, int sset) > +{ > + switch (sset) { > + case ETH_SS_STATS: > + return ARRAY_SIZE(ipqess_stats); > + default: > + netdev_dbg(netdev, "%s: Invalid string set", __func__); Unsupported would be better than invalid. > + return -EOPNOTSUPP; > + } > +} Andrew
On Fri, Apr 22, 2022 at 08:03:00PM +0200, Maxime Chevallier wrote: > Hello everyone, > > This series introduces a new driver, for the Qualcomm IPQESS Ethernet > Controller, found on the IPQ4019. > > The driver itself is pretty straightforward, but has lived out-of-tree > for a while. I've done my best to clean-up some outdated API calls, but > some might remain. > > This controller is somewhat special, since it's part of the IPQ4019 SoC > which also includes an QCA8K switch, and uses the IPQESS controller for > the CPU port. Does it exist in a form where it is not connected to a switch? As Florian has suggested, if we assume frames are always going to/coming from a switch, we can play around with the frame format a little. A dummy tag could be added to the head or tail of the frame, which the MAC driver then uses. That gives us a more normal structure. Andrew
On Fri, Apr 22, 2022 at 10:29 PM Andrew Lunn <andrew@lunn.ch> wrote: > > On Fri, Apr 22, 2022 at 08:03:00PM +0200, Maxime Chevallier wrote: > > Hello everyone, > > > > This series introduces a new driver, for the Qualcomm IPQESS Ethernet > > Controller, found on the IPQ4019. > > > > The driver itself is pretty straightforward, but has lived out-of-tree > > for a while. I've done my best to clean-up some outdated API calls, but > > some might remain. > > > > This controller is somewhat special, since it's part of the IPQ4019 SoC > > which also includes an QCA8K switch, and uses the IPQESS controller for > > the CPU port. > > Does it exist in a form where it is not connected to a switch? Hi Andrew, both the ethernet controller and the QCA8337N based switch are part of the SoC silicon and are always present. The ethernet controller is always connected to the switch port 0. Regards, Robert > > As Florian has suggested, if we assume frames are always going > to/coming from a switch, we can play around with the frame format a > little. A dummy tag could be added to the head or tail of the frame, > which the MAC driver then uses. That gives us a more normal structure. > > Andrew
Hello Florian, On Fri, 22 Apr 2022 11:28:30 -0700 Florian Fainelli <f.fainelli@gmail.com> wrote: Thanks for the review :) > On 4/22/22 11:03, Maxime Chevallier wrote: > > This tagging protocol is designed for the situation where the link > > between the MAC and the Switch is designed such that the Destination > > Port, which is usually embedded in some part of the Ethernet > > Header, is sent out-of-band, and isn't present at all in the > > Ethernet frame. > > > > This can happen when the MAC and Switch are tightly integrated on an > > SoC, as is the case with the Qualcomm IPQ4019 for example, where > > the DSA tag is inserted directly into the DMA descriptors. In that > > case, the MAC driver is responsible for sending the tag to the > > switch using the out-of-band medium. To do so, the MAC driver needs > > to have the information of the destination port for that skb. > > > > This tagging protocol relies on a new set of fields in skb->shinfo > > to transmit the dsa tagging information to and from the MAC driver. > > > > Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com> > > First off, I am not a big fan of expanding skb::shared_info because > it is sensitive to cache line sizes and is critical for performance > at much higher speeds, I would expect Eric and Jakub to not be > terribly happy about it. No problem, I'm testing with the skb->cb approach as you suggested and see how it goes. > The Broadcom systemport (bcmsysport.c) has a mode where it can > extract the Broadcom tag and put it in front of the actual packet > contents which appears to be very similar here. From there on, you > can have two strategies: > > - have the Ethernet controller mangle the packet contents such that > the QCA tag is located in front of the actual Ethernet frame and > create a new tagging protocol variant for QCA, similar to the > TAG_BRCM versus TAG_BRCM_PREPEND > > - provide the necessary information for the tagger to work using an > out of band mechanism, which is what you have done, in which case, > maybe you can use skb->cb[] instead of using skb::shared_info? One of the reason why I chose the second is to support possible future cases where another controller would face a similar situation, and also make use of the out-of-band tagger. I understand that it's not very elegant in the sense that this breaks the nice tagging model we have, but adding/removing data before the payload also seems convoluted to achieve the same thing :) It seems that this approach comes with a bit of an overhead since it implies mangling the skb a bit, but I've yet to test this myself. That's actually what I wanted your opinion on, it also seems like Andrew likes the idea of putting the tag ahead of the frame to stick with the actual model. I don't have strong feelings myself on the way of doing this, I'm looking for an approach that is efficient but yet easily maintainable. Thanks, Maxime
Hello Andrew, On Fri, 22 Apr 2022 22:19:43 +0200 Andrew Lunn <andrew@lunn.ch> wrote: Thanks for the review :) > > +static int ipqess_axi_probe(struct platform_device *pdev) > > +{ > > + struct device_node *np = pdev->dev.of_node; > > + struct net_device *netdev; > > + phy_interface_t phy_mode; > > + struct resource *res; > > + struct ipqess *ess; > > + int i, err = 0; > > + > > + netdev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(struct > > ipqess), > > + IPQESS_NETDEV_QUEUES, > > + IPQESS_NETDEV_QUEUES); > > + if (!netdev) > > + return -ENOMEM; > > + > > + ess = netdev_priv(netdev); > > + ess->netdev = netdev; > > + ess->pdev = pdev; > > + spin_lock_init(&ess->stats_lock); > > + SET_NETDEV_DEV(netdev, &pdev->dev); > > + platform_set_drvdata(pdev, netdev); > > .... > > > + > > + ipqess_set_ethtool_ops(netdev); > > + > > + err = register_netdev(netdev); > > + if (err) > > + goto err_out; > > Before register_netdev() even returns, your devices can be in use, the > open callback called and packets sent. This is particularly true for > NFS root. Which means any setup done after this is probably wrong. Nice catch, thank you ! > > + > > + err = ipqess_hw_init(ess); > > + if (err) > > + goto err_out; > > + > > + for (i = 0; i < IPQESS_NETDEV_QUEUES; i++) { > > + int qid; > > + > > + netif_tx_napi_add(netdev, &ess->tx_ring[i].napi_tx, > > + ipqess_tx_napi, 64); > > + netif_napi_add(netdev, > > + &ess->rx_ring[i].napi_rx, > > + ipqess_rx_napi, 64); > > + > > + qid = ess->tx_ring[i].idx; > > + err = devm_request_irq(&ess->netdev->dev, > > ess->tx_irq[qid], > > + ipqess_interrupt_tx, 0, > > + ess->tx_irq_names[qid], > > + &ess->tx_ring[i]); > > + if (err) > > + goto err_out; > > + > > + qid = ess->rx_ring[i].idx; > > + err = devm_request_irq(&ess->netdev->dev, > > ess->rx_irq[qid], > > + ipqess_interrupt_rx, 0, > > + ess->rx_irq_names[qid], > > + &ess->rx_ring[i]); > > + if (err) > > + goto err_out; > > + } > > All this should probably go before netdev_register(). I'll fix this for V2. > > +static int ipqess_get_strset_count(struct net_device *netdev, int > > sset) +{ > > + switch (sset) { > > + case ETH_SS_STATS: > > + return ARRAY_SIZE(ipqess_stats); > > + default: > > + netdev_dbg(netdev, "%s: Invalid string set", > > __func__); > > Unsupported would be better than invalid. That's right, thanks > > + return -EOPNOTSUPP; > > + } > > +} > > Andrew Best Regards, Maxime
Hi Maxime, On Tue, Apr 26, 2022 at 03:57:32PM +0200, Maxime Chevallier wrote: > > First off, I am not a big fan of expanding skb::shared_info because > > it is sensitive to cache line sizes and is critical for performance > > at much higher speeds, I would expect Eric and Jakub to not be > > terribly happy about it. > > No problem, I'm testing with the skb->cb approach as you suggested and > see how it goes. > > > The Broadcom systemport (bcmsysport.c) has a mode where it can > > extract the Broadcom tag and put it in front of the actual packet > > contents which appears to be very similar here. From there on, you > > can have two strategies: > > > > - have the Ethernet controller mangle the packet contents such that > > the QCA tag is located in front of the actual Ethernet frame and > > create a new tagging protocol variant for QCA, similar to the > > TAG_BRCM versus TAG_BRCM_PREPEND > > > > - provide the necessary information for the tagger to work using an > > out of band mechanism, which is what you have done, in which case, > > maybe you can use skb->cb[] instead of using skb::shared_info? > > One of the reason why I chose the second is to support possible future > cases where another controller would face a similar situation, and also > make use of the out-of-band tagger. > > I understand that it's not very elegant in the sense that this breaks > the nice tagging model we have, but adding/removing data before the > payload also seems convoluted to achieve the same thing :) It seems > that this approach comes with a bit of an overhead since it implies > mangling the skb a bit, but I've yet to test this myself. > > That's actually what I wanted your opinion on, it also seems like > Andrew likes the idea of putting the tag ahead of the frame to stick > with the actual model. > > I don't have strong feelings myself on the way of doing this, I'm > looking for an approach that is efficient but yet easily maintainable. The skb->cb is not free to use for passing data between the DSA master and the switch driver. There's the qdisc layer on TX, GRO on RX, maybe others, and these mangle that memory region. So that would make it a -1 for skb->cb, and a +1 from my side for making the DSA master driver push a fake prepended header, and consuming it "as usual" from the DSA tagging protocol driver. That, plus the hope that I'll be long dead by the time we'd need to find a solution that scales to more switches designed like this :) I'll take a closer look at your patches a bit later too, probably not today, don't wait for my feedback if you think you're otherwise ready to repost.