mbox series

[net-next,0/5] net: ipqess: introduce Qualcomm IPQESS driver

Message ID 20220422180305.301882-1-maxime.chevallier@bootlin.com
Headers show
Series net: ipqess: introduce Qualcomm IPQESS driver | expand

Message

Maxime Chevallier April 22, 2022, 6:03 p.m. UTC
Hello everyone,

This series introduces a new driver, for the Qualcomm IPQESS Ethernet
Controller, found on the IPQ4019.

The driver itself is pretty straightforward, but has lived out-of-tree
for a while. I've done my best to clean-up some outdated API calls, but
some might remain.

This controller is somewhat special, since it's part of the IPQ4019 SoC
which also includes an QCA8K switch, and uses the IPQESS controller for
the CPU port. The switch is so tightly intergrated with the MAC that it
is connected to the MAC using an internal link (hence the fact that we
only support PHY_INTERFACE_MODE_INTERNAL), and this has some
consequences on the DSA side.

The tagging for the switch isn't done inband as most switch do, but
out-of-band, the DSA tag being included in the DMA descriptor.

So, this series also includes a new DSA tagging protocol, that sets the
DSA port index into skb->shinfo, so that the MAC driver can use it to
build the descriptor. This is definitely unusual, so I'l very openned to
suggestions, comments and reviews on the tagging side of this series.

Thanks to the Sartura folks who worked on a base version of this driver,
and provided test hardware.

Best regards,

Maxime Chevallier

Maxime Chevallier (5):
  net: ipqess: introduce the Qualcomm IPQESS driver
  net: dsa: add out-of-band tagging protocol
  net: ipqess: Add out-of-band DSA tagging support
  net: dt-bindings: Introduce the Qualcomm IPQESS Ethernet controller
  ARM: dts: qcom: ipq4019: Add description for the IPQESS Ethernet
    controller

 .../devicetree/bindings/net/qcom,ipqess.yaml  |   94 ++
 MAINTAINERS                                   |    6 +
 arch/arm/boot/dts/qcom-ipq4019.dtsi           |   42 +
 drivers/net/ethernet/qualcomm/Kconfig         |   11 +
 drivers/net/ethernet/qualcomm/Makefile        |    2 +
 drivers/net/ethernet/qualcomm/ipqess/Makefile |    8 +
 drivers/net/ethernet/qualcomm/ipqess/ipqess.c | 1258 +++++++++++++++++
 drivers/net/ethernet/qualcomm/ipqess/ipqess.h |  515 +++++++
 .../ethernet/qualcomm/ipqess/ipqess_ethtool.c |  168 +++
 include/linux/skbuff.h                        |    7 +
 include/net/dsa.h                             |    2 +
 net/dsa/Kconfig                               |    7 +
 net/dsa/Makefile                              |    1 +
 net/dsa/tag_oob.c                             |   45 +
 14 files changed, 2166 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/net/qcom,ipqess.yaml
 create mode 100644 drivers/net/ethernet/qualcomm/ipqess/Makefile
 create mode 100644 drivers/net/ethernet/qualcomm/ipqess/ipqess.c
 create mode 100644 drivers/net/ethernet/qualcomm/ipqess/ipqess.h
 create mode 100644 drivers/net/ethernet/qualcomm/ipqess/ipqess_ethtool.c
 create mode 100644 net/dsa/tag_oob.c

Comments

Florian Fainelli April 22, 2022, 6:28 p.m. UTC | #1
On 4/22/22 11:03, Maxime Chevallier wrote:
> This tagging protocol is designed for the situation where the link
> between the MAC and the Switch is designed such that the Destination
> Port, which is usually embedded in some part of the Ethernet Header, is
> sent out-of-band, and isn't present at all in the Ethernet frame.
> 
> This can happen when the MAC and Switch are tightly integrated on an
> SoC, as is the case with the Qualcomm IPQ4019 for example, where the DSA
> tag is inserted directly into the DMA descriptors. In that case,
> the MAC driver is responsible for sending the tag to the switch using
> the out-of-band medium. To do so, the MAC driver needs to have the
> information of the destination port for that skb.
> 
> This tagging protocol relies on a new set of fields in skb->shinfo to
> transmit the dsa tagging information to and from the MAC driver.
> 
> Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>

First off, I am not a big fan of expanding skb::shared_info because it 
is sensitive to cache line sizes and is critical for performance at much 
higher speeds, I would expect Eric and Jakub to not be terribly happy 
about it.

The Broadcom systemport (bcmsysport.c) has a mode where it can extract 
the Broadcom tag and put it in front of the actual packet contents which 
appears to be very similar here. From there on, you can have two strategies:

- have the Ethernet controller mangle the packet contents such that the 
QCA tag is located in front of the actual Ethernet frame and create a 
new tagging protocol variant for QCA, similar to the TAG_BRCM versus 
TAG_BRCM_PREPEND

- provide the necessary information for the tagger to work using an out 
of band mechanism, which is what you have done, in which case, maybe you 
can use skb->cb[] instead of using skb::shared_info?
Andrew Lunn April 22, 2022, 8:19 p.m. UTC | #2
> +static int ipqess_axi_probe(struct platform_device *pdev)
> +{
> +	struct device_node *np = pdev->dev.of_node;
> +	struct net_device *netdev;
> +	phy_interface_t phy_mode;
> +	struct resource *res;
> +	struct ipqess *ess;
> +	int i, err = 0;
> +
> +	netdev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(struct ipqess),
> +					 IPQESS_NETDEV_QUEUES,
> +					 IPQESS_NETDEV_QUEUES);
> +	if (!netdev)
> +		return -ENOMEM;
> +
> +	ess = netdev_priv(netdev);
> +	ess->netdev = netdev;
> +	ess->pdev = pdev;
> +	spin_lock_init(&ess->stats_lock);
> +	SET_NETDEV_DEV(netdev, &pdev->dev);
> +	platform_set_drvdata(pdev, netdev);

....

> +
> +	ipqess_set_ethtool_ops(netdev);
> +
> +	err = register_netdev(netdev);
> +	if (err)
> +		goto err_out;

Before register_netdev() even returns, your devices can be in use, the
open callback called and packets sent. This is particularly true for
NFS root. Which means any setup done after this is probably wrong.

> +
> +	err = ipqess_hw_init(ess);
> +	if (err)
> +		goto err_out;
> +
> +	for (i = 0; i < IPQESS_NETDEV_QUEUES; i++) {
> +		int qid;
> +
> +		netif_tx_napi_add(netdev, &ess->tx_ring[i].napi_tx,
> +				  ipqess_tx_napi, 64);
> +		netif_napi_add(netdev,
> +			       &ess->rx_ring[i].napi_rx,
> +			       ipqess_rx_napi, 64);
> +
> +		qid = ess->tx_ring[i].idx;
> +		err = devm_request_irq(&ess->netdev->dev, ess->tx_irq[qid],
> +				       ipqess_interrupt_tx, 0,
> +				       ess->tx_irq_names[qid],
> +				       &ess->tx_ring[i]);
> +		if (err)
> +			goto err_out;
> +
> +		qid = ess->rx_ring[i].idx;
> +		err = devm_request_irq(&ess->netdev->dev, ess->rx_irq[qid],
> +				       ipqess_interrupt_rx, 0,
> +				       ess->rx_irq_names[qid],
> +				       &ess->rx_ring[i]);
> +		if (err)
> +			goto err_out;
> +	}

All this should probably go before netdev_register().

> +static int ipqess_get_strset_count(struct net_device *netdev, int sset)
> +{
> +	switch (sset) {
> +	case ETH_SS_STATS:
> +		return ARRAY_SIZE(ipqess_stats);
> +	default:
> +		netdev_dbg(netdev, "%s: Invalid string set", __func__);

Unsupported would be better than invalid.

> +		return -EOPNOTSUPP;
> +	}
> +}

  Andrew
Andrew Lunn April 22, 2022, 8:29 p.m. UTC | #3
On Fri, Apr 22, 2022 at 08:03:00PM +0200, Maxime Chevallier wrote:
> Hello everyone,
> 
> This series introduces a new driver, for the Qualcomm IPQESS Ethernet
> Controller, found on the IPQ4019.
> 
> The driver itself is pretty straightforward, but has lived out-of-tree
> for a while. I've done my best to clean-up some outdated API calls, but
> some might remain.
> 
> This controller is somewhat special, since it's part of the IPQ4019 SoC
> which also includes an QCA8K switch, and uses the IPQESS controller for
> the CPU port.

Does it exist in a form where it is not connected to a switch?

As Florian has suggested, if we assume frames are always going
to/coming from a switch, we can play around with the frame format a
little. A dummy tag could be added to the head or tail of the frame,
which the MAC driver then uses. That gives us a more normal structure.

      Andrew
Robert Marko April 26, 2022, 12:12 p.m. UTC | #4
On Fri, Apr 22, 2022 at 10:29 PM Andrew Lunn <andrew@lunn.ch> wrote:
>
> On Fri, Apr 22, 2022 at 08:03:00PM +0200, Maxime Chevallier wrote:
> > Hello everyone,
> >
> > This series introduces a new driver, for the Qualcomm IPQESS Ethernet
> > Controller, found on the IPQ4019.
> >
> > The driver itself is pretty straightforward, but has lived out-of-tree
> > for a while. I've done my best to clean-up some outdated API calls, but
> > some might remain.
> >
> > This controller is somewhat special, since it's part of the IPQ4019 SoC
> > which also includes an QCA8K switch, and uses the IPQESS controller for
> > the CPU port.
>
> Does it exist in a form where it is not connected to a switch?

Hi Andrew, both the ethernet controller and the QCA8337N based switch are
part of the SoC silicon and are always present.
The ethernet controller is always connected to the switch port 0.

Regards,
Robert
>
> As Florian has suggested, if we assume frames are always going
> to/coming from a switch, we can play around with the frame format a
> little. A dummy tag could be added to the head or tail of the frame,
> which the MAC driver then uses. That gives us a more normal structure.
>
>       Andrew
Maxime Chevallier April 26, 2022, 1:57 p.m. UTC | #5
Hello Florian,

On Fri, 22 Apr 2022 11:28:30 -0700
Florian Fainelli <f.fainelli@gmail.com> wrote:

Thanks for the review :)

> On 4/22/22 11:03, Maxime Chevallier wrote:
> > This tagging protocol is designed for the situation where the link
> > between the MAC and the Switch is designed such that the Destination
> > Port, which is usually embedded in some part of the Ethernet
> > Header, is sent out-of-band, and isn't present at all in the
> > Ethernet frame.
> > 
> > This can happen when the MAC and Switch are tightly integrated on an
> > SoC, as is the case with the Qualcomm IPQ4019 for example, where
> > the DSA tag is inserted directly into the DMA descriptors. In that
> > case, the MAC driver is responsible for sending the tag to the
> > switch using the out-of-band medium. To do so, the MAC driver needs
> > to have the information of the destination port for that skb.
> > 
> > This tagging protocol relies on a new set of fields in skb->shinfo
> > to transmit the dsa tagging information to and from the MAC driver.
> > 
> > Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>  
> 
> First off, I am not a big fan of expanding skb::shared_info because
> it is sensitive to cache line sizes and is critical for performance
> at much higher speeds, I would expect Eric and Jakub to not be
> terribly happy about it.

No problem, I'm testing with the skb->cb approach as you suggested and
see how it goes.

> The Broadcom systemport (bcmsysport.c) has a mode where it can
> extract the Broadcom tag and put it in front of the actual packet
> contents which appears to be very similar here. From there on, you
> can have two strategies:
> 
> - have the Ethernet controller mangle the packet contents such that
> the QCA tag is located in front of the actual Ethernet frame and
> create a new tagging protocol variant for QCA, similar to the
> TAG_BRCM versus TAG_BRCM_PREPEND
> 
> - provide the necessary information for the tagger to work using an
> out of band mechanism, which is what you have done, in which case,
> maybe you can use skb->cb[] instead of using skb::shared_info?

One of the reason why I chose the second is to support possible future
cases where another controller would face a similar situation, and also
make use of the out-of-band tagger.

I understand that it's not very elegant in the sense that this breaks
the nice tagging model we have, but adding/removing data before the
payload also seems convoluted to achieve the same thing :) It seems
that this approach comes with a bit of an overhead since it implies
mangling the skb a bit, but I've yet to test this myself.

That's actually what I wanted your opinion on, it also seems like
Andrew likes the idea of putting the tag ahead of the frame to stick
with the actual model.

I don't have strong feelings myself on the way of doing this, I'm
looking for an approach that is efficient but yet easily maintainable.

Thanks,

Maxime
Maxime Chevallier April 26, 2022, 1:59 p.m. UTC | #6
Hello Andrew,

On Fri, 22 Apr 2022 22:19:43 +0200
Andrew Lunn <andrew@lunn.ch> wrote:

Thanks for the review :)

> > +static int ipqess_axi_probe(struct platform_device *pdev)
> > +{
> > +	struct device_node *np = pdev->dev.of_node;
> > +	struct net_device *netdev;
> > +	phy_interface_t phy_mode;
> > +	struct resource *res;
> > +	struct ipqess *ess;
> > +	int i, err = 0;
> > +
> > +	netdev = devm_alloc_etherdev_mqs(&pdev->dev, sizeof(struct
> > ipqess),
> > +					 IPQESS_NETDEV_QUEUES,
> > +					 IPQESS_NETDEV_QUEUES);
> > +	if (!netdev)
> > +		return -ENOMEM;
> > +
> > +	ess = netdev_priv(netdev);
> > +	ess->netdev = netdev;
> > +	ess->pdev = pdev;
> > +	spin_lock_init(&ess->stats_lock);
> > +	SET_NETDEV_DEV(netdev, &pdev->dev);
> > +	platform_set_drvdata(pdev, netdev);  
> 
> ....
> 
> > +
> > +	ipqess_set_ethtool_ops(netdev);
> > +
> > +	err = register_netdev(netdev);
> > +	if (err)
> > +		goto err_out;  
> 
> Before register_netdev() even returns, your devices can be in use, the
> open callback called and packets sent. This is particularly true for
> NFS root. Which means any setup done after this is probably wrong.

Nice catch, thank you !

> > +
> > +	err = ipqess_hw_init(ess);
> > +	if (err)
> > +		goto err_out;
> > +
> > +	for (i = 0; i < IPQESS_NETDEV_QUEUES; i++) {
> > +		int qid;
> > +
> > +		netif_tx_napi_add(netdev, &ess->tx_ring[i].napi_tx,
> > +				  ipqess_tx_napi, 64);
> > +		netif_napi_add(netdev,
> > +			       &ess->rx_ring[i].napi_rx,
> > +			       ipqess_rx_napi, 64);
> > +
> > +		qid = ess->tx_ring[i].idx;
> > +		err = devm_request_irq(&ess->netdev->dev,
> > ess->tx_irq[qid],
> > +				       ipqess_interrupt_tx, 0,
> > +				       ess->tx_irq_names[qid],
> > +				       &ess->tx_ring[i]);
> > +		if (err)
> > +			goto err_out;
> > +
> > +		qid = ess->rx_ring[i].idx;
> > +		err = devm_request_irq(&ess->netdev->dev,
> > ess->rx_irq[qid],
> > +				       ipqess_interrupt_rx, 0,
> > +				       ess->rx_irq_names[qid],
> > +				       &ess->rx_ring[i]);
> > +		if (err)
> > +			goto err_out;
> > +	}  
> 
> All this should probably go before netdev_register().

I'll fix this for V2.

> > +static int ipqess_get_strset_count(struct net_device *netdev, int
> > sset) +{
> > +	switch (sset) {
> > +	case ETH_SS_STATS:
> > +		return ARRAY_SIZE(ipqess_stats);
> > +	default:
> > +		netdev_dbg(netdev, "%s: Invalid string set",
> > __func__);  
> 
> Unsupported would be better than invalid.

That's right, thanks

> > +		return -EOPNOTSUPP;
> > +	}
> > +}  
> 
>   Andrew

Best Regards,

Maxime
Vladimir Oltean April 26, 2022, 6:52 p.m. UTC | #7
Hi Maxime,

On Tue, Apr 26, 2022 at 03:57:32PM +0200, Maxime Chevallier wrote:
> > First off, I am not a big fan of expanding skb::shared_info because
> > it is sensitive to cache line sizes and is critical for performance
> > at much higher speeds, I would expect Eric and Jakub to not be
> > terribly happy about it.
> 
> No problem, I'm testing with the skb->cb approach as you suggested and
> see how it goes.
> 
> > The Broadcom systemport (bcmsysport.c) has a mode where it can
> > extract the Broadcom tag and put it in front of the actual packet
> > contents which appears to be very similar here. From there on, you
> > can have two strategies:
> > 
> > - have the Ethernet controller mangle the packet contents such that
> > the QCA tag is located in front of the actual Ethernet frame and
> > create a new tagging protocol variant for QCA, similar to the
> > TAG_BRCM versus TAG_BRCM_PREPEND
> > 
> > - provide the necessary information for the tagger to work using an
> > out of band mechanism, which is what you have done, in which case,
> > maybe you can use skb->cb[] instead of using skb::shared_info?
> 
> One of the reason why I chose the second is to support possible future
> cases where another controller would face a similar situation, and also
> make use of the out-of-band tagger.
> 
> I understand that it's not very elegant in the sense that this breaks
> the nice tagging model we have, but adding/removing data before the
> payload also seems convoluted to achieve the same thing :) It seems
> that this approach comes with a bit of an overhead since it implies
> mangling the skb a bit, but I've yet to test this myself.
> 
> That's actually what I wanted your opinion on, it also seems like
> Andrew likes the idea of putting the tag ahead of the frame to stick
> with the actual model.
> 
> I don't have strong feelings myself on the way of doing this, I'm
> looking for an approach that is efficient but yet easily maintainable.

The skb->cb is not free to use for passing data between the DSA master
and the switch driver. There's the qdisc layer on TX, GRO on RX, maybe
others, and these mangle that memory region. So that would make it a -1
for skb->cb, and a +1 from my side for making the DSA master driver push
a fake prepended header, and consuming it "as usual" from the DSA
tagging protocol driver. That, plus the hope that I'll be long dead by
the time we'd need to find a solution that scales to more switches
designed like this :)

I'll take a closer look at your patches a bit later too, probably not today,
don't wait for my feedback if you think you're otherwise ready to repost.