Message ID | 20190808140600.21477-7-antoine.tenart@bootlin.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | net: macsec: initial support for hardware offloading | expand |
On 08.08.2019 17:05, Antoine Tenart wrote: > The Rx and TX handlers are modified to take in account the special case > were the MACsec transformation happens in the hardware, whether in a PHY > or in a MAC, as the packets seen by the networking stack on both the Don't you think we may eventually may need xmit / handle_frame ops to be a part of macsec_ops? That way software macsec could be extract to just another type of offload. The drawback of current code is it doesn't show explicitly the path of offloaded packets. It is hidden in `handle_not_macsec` and in `macsec_start_xmit` branch. This makes incorrect counters to tick (see my below comment) Another thing is that both xmit / macsec_handle_frame can't now be customized by device driver. But this may be required. We for example have usecases and HW features to allow specific flows to bypass macsec encryption. This is normally used for macsec key control protocols, identified by ethertype. Your phy is also capable on that as I see. > @@ -2546,11 +2814,15 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, > { > struct macsec_dev *macsec = netdev_priv(dev); > struct macsec_secy *secy = &macsec->secy; > + struct macsec_tx_sc *tx_sc = &secy->tx_sc; > struct pcpu_secy_stats *secy_stats; > + struct macsec_tx_sa *tx_sa; > int ret, len; > > + tx_sa = macsec_txsa_get(tx_sc->sa[tx_sc->encoding_sa]); Declared, but not used? > /* 10.5 */ > - if (!secy->protect_frames) { > + if (!secy->protect_frames || macsec_get_ops(netdev_priv(dev), NULL)) { > secy_stats = this_cpu_ptr(macsec->stats); > u64_stats_update_begin(&secy_stats->syncp); > secy_stats->stats.OutPktsUntagged++; Here you use same `if` for sw and hw flows, this making `OutPktsUntagged` counter invalid. > struct macsec_dev *macsec = macsec_priv(dev); > - struct net_device *real_dev; > + struct net_device *real_dev, *loop_dev; > + struct macsec_context ctx; > + const struct macsec_ops *ops; > + struct net *loop_net; Reverse Christmas tree is normally a formatting requirement where possible. > + for_each_net(loop_net) { > + for_each_netdev(loop_net, loop_dev) { > + struct macsec_dev *priv; > + > + if (!netif_is_macsec(loop_dev)) > + continue; > + > + priv = macsec_priv(loop_dev); > + > + /* A limitation of the MACsec h/w offloading is only a > + * single MACsec interface can be created for a given > + * real interface. > + */ > + if (macsec_get_ops(netdev_priv(dev), NULL) && > + priv->real_dev == real_dev) > + return -EBUSY; > + } > + } > + There is no need to do this search loop if `macsec_get_ops(..) == NULL` ? So you can extract this check before `for_each_net` for SW macsec... Regards, Igor
On Thu, Aug 08, 2019 at 04:05:57PM +0200, Antoine Tenart wrote: > This patch introduces the MACsec hardware offloading infrastructure. > > The main idea here is to re-use the logic and data structures of the > software MACsec implementation. This allows not to duplicate definitions > and structure storing the same kind of information. It also allows to > use a unified genlink interface for both MACsec implementations (so that > the same userspace tool, `ip macsec`, is used with the same arguments). > The MACsec offloading support cannot be disabled if an interface > supports it at the moment. > > The MACsec configuration is passed to device drivers supporting it > through macsec_hw_offload() which is called from the MACsec genl > helpers. This function calls the macsec ops of PHY and Ethernet > drivers in two steps Hi Antoine, Igor It is great that you are thinking how a MAC driver would make use of this. But on the flip side, we don't usual add an API unless there is a user. And as far as i see, you only add a PHY level implementation, not a MAC level. Igor, what is your interest here? I know the Aquantia PHY can do MACsec, but i guess you are more interested in the atlantic and AQC111 MAC drivers which hide the PHY behind firmware rather than make use of the Linux aquantia PHY driver. Are you likely to be contributing a MAC driver level implementation of MACsec soon? Thanks Andrew
Hi Andrew, On Sat, Aug 10, 2019 at 06:34:23PM +0200, Andrew Lunn wrote: > On Thu, Aug 08, 2019 at 04:05:57PM +0200, Antoine Tenart wrote: > > This patch introduces the MACsec hardware offloading infrastructure. > > > > The main idea here is to re-use the logic and data structures of the > > software MACsec implementation. This allows not to duplicate definitions > > and structure storing the same kind of information. It also allows to > > use a unified genlink interface for both MACsec implementations (so that > > the same userspace tool, `ip macsec`, is used with the same arguments). > > The MACsec offloading support cannot be disabled if an interface > > supports it at the moment. > > > > The MACsec configuration is passed to device drivers supporting it > > through macsec_hw_offload() which is called from the MACsec genl > > helpers. This function calls the macsec ops of PHY and Ethernet > > drivers in two steps > > It is great that you are thinking how a MAC driver would make use of > this. But on the flip side, we don't usual add an API unless there is > a user. And as far as i see, you only add a PHY level implementation, > not a MAC level. That's right, and the only modification here is a simple patch adding the MACsec ops within struct net_device. I can remove it as we do not have providers as of now and it can be added easily later on. Thanks, Antoine
Hi Igor, On Sat, Aug 10, 2019 at 01:20:32PM +0000, Igor Russkikh wrote: > On 08.08.2019 17:05, Antoine Tenart wrote: > > > The Rx and TX handlers are modified to take in account the special case > > were the MACsec transformation happens in the hardware, whether in a PHY > > or in a MAC, as the packets seen by the networking stack on both the > > Don't you think we may eventually may need xmit / handle_frame ops to be > a part of macsec_ops? > > That way software macsec could be extract to just another type of offload. > The drawback of current code is it doesn't show explicitly the path of > offloaded packets. It is hidden in `handle_not_macsec` and in > `macsec_start_xmit` branch. This makes incorrect counters to tick (see my below > comment) > > Another thing is that both xmit / macsec_handle_frame can't now be customized > by device driver. But this may be required. > We for example have usecases and HW features to allow specific flows to bypass > macsec encryption. This is normally used for macsec key control protocols, > identified by ethertype. Your phy is also capable on that as I see. I think this question is linked to the use of a MACsec virtual interface when using h/w offloading. The starting point for me was that I wanted to reuse the data structures and the API exposed to the userspace by the s/w implementation of MACsec. I then had two choices: keeping the exact same interface for the user (having a virtual MACsec interface), or registering the MACsec genl ops onto the real net devices (and making the s/w implementation a virtual net dev and a provider of the MACsec "offloading" ops). The advantages of the first option were that nearly all the logic of the s/w implementation could be kept and especially that it would be transparent for the user to use both implementations of MACsec. But this raised an issue as I had to modify the xmit / handle_frame ops to let all the traffic pass. This is because we have no way of knowing if a frame was handled by the MACsec h/w or not in ingress. So the virtual interface here only serve as the entrypoint for the API... The second option would have the advantage to better represent the actual flow, but the way of configuring MACsec would be a bit different for the user, whether he wants to use s/w or h/w MACsec. If we were to do this I think we could extract the genl functions from the MACsec s/w implementation, and let it implement the MACsec ops (exactly as the offloading drivers). I'm open to discussing this :) As for the need for xmit / handle_frame ops (for a MAC w/ MACsec offloading), I'd say the xmit / handle_frame ops of the real net device driver could be used as the one of the MACsec virtual interface do not do much (regardless of the implementation choice discussed above). > > @@ -2546,11 +2814,15 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, > > { > > struct macsec_dev *macsec = netdev_priv(dev); > > struct macsec_secy *secy = &macsec->secy; > > + struct macsec_tx_sc *tx_sc = &secy->tx_sc; > > struct pcpu_secy_stats *secy_stats; > > + struct macsec_tx_sa *tx_sa; > > int ret, len; > > > > + tx_sa = macsec_txsa_get(tx_sc->sa[tx_sc->encoding_sa]); > > Declared, but not used? I'll remove it then. > > /* 10.5 */ > > - if (!secy->protect_frames) { > > + if (!secy->protect_frames || macsec_get_ops(netdev_priv(dev), NULL)) { > > secy_stats = this_cpu_ptr(macsec->stats); > > u64_stats_update_begin(&secy_stats->syncp); > > secy_stats->stats.OutPktsUntagged++; > > Here you use same `if` for sw and hw flows, this making `OutPktsUntagged` > counter invalid. Right, I'll try to fix that. > > struct macsec_dev *macsec = macsec_priv(dev); > > - struct net_device *real_dev; > > + struct net_device *real_dev, *loop_dev; > > + struct macsec_context ctx; > > + const struct macsec_ops *ops; > > + struct net *loop_net; > > Reverse Christmas tree is normally a formatting requirement where possible. Sure. > > + for_each_net(loop_net) { > > + for_each_netdev(loop_net, loop_dev) { > > + struct macsec_dev *priv; > > + > > + if (!netif_is_macsec(loop_dev)) > > + continue; > > + > > + priv = macsec_priv(loop_dev); > > + > > + /* A limitation of the MACsec h/w offloading is only a > > + * single MACsec interface can be created for a given > > + * real interface. > > + */ > > + if (macsec_get_ops(netdev_priv(dev), NULL) && > > + priv->real_dev == real_dev) > > + return -EBUSY; > > + } > > + } > > + > > There is no need to do this search loop if `macsec_get_ops(..) == NULL` ? > So you can extract this check before `for_each_net` for SW macsec... Right, I'll fix it! Thanks! Antoine
On 10.08.2019 19:34, Andrew Lunn wrote: > On Thu, Aug 08, 2019 at 04:05:57PM +0200, Antoine Tenart wrote: >> The MACsec configuration is passed to device drivers supporting it >> through macsec_hw_offload() which is called from the MACsec genl >> helpers. This function calls the macsec ops of PHY and Ethernet >> drivers in two steps > > Hi Antoine, Igor > > It is great that you are thinking how a MAC driver would make use of > this. But on the flip side, we don't usual add an API unless there is > a user. And as far as i see, you only add a PHY level implementation, > not a MAC level. > > Igor, what is your interest here? I know the Aquantia PHY can do > MACsec, but i guess you are more interested in the atlantic and AQC111 > MAC drivers which hide the PHY behind firmware rather than make use of > the Linux aquantia PHY driver. Are you likely to be contributing a MAC > driver level implementation of MACsec soon? Hi Andrew, Yes, we are interested in MAC level MACSec offload implementation. Although in our solution macsec engine itself is in Phy, we do actively use firmware support in areas of configuration, interrupt management. So from SW perspective that'll be MAC driver level macsec. Regards, Igor
On Tue, Aug 13, 2019 at 10:58:17AM +0200, Antoine Tenart wrote: > I think this question is linked to the use of a MACsec virtual interface > when using h/w offloading. The starting point for me was that I wanted > to reuse the data structures and the API exposed to the userspace by the > s/w implementation of MACsec. I then had two choices: keeping the exact > same interface for the user (having a virtual MACsec interface), or > registering the MACsec genl ops onto the real net devices (and making > the s/w implementation a virtual net dev and a provider of the MACsec > "offloading" ops). > > The advantages of the first option were that nearly all the logic of the > s/w implementation could be kept and especially that it would be > transparent for the user to use both implementations of MACsec. Hi Antoine We have always talked about offloading operations to the hardware, accelerating what the linux stack can do by making use of hardware accelerators. The basic user API should not change because of acceleration. Those are the general guidelines. It would however be interesting to get comments from those who did the software implementation and what they think of this architecture. I've no personal experience with MACSec, so it is hard for me to say if the current architecture makes sense when using accelerators. Andrew
On 13.08.2019 16:17, Andrew Lunn wrote: > On Tue, Aug 13, 2019 at 10:58:17AM +0200, Antoine Tenart wrote: >> I think this question is linked to the use of a MACsec virtual interface >> when using h/w offloading. The starting point for me was that I wanted >> to reuse the data structures and the API exposed to the userspace by the >> s/w implementation of MACsec. I then had two choices: keeping the exact >> same interface for the user (having a virtual MACsec interface), or >> registering the MACsec genl ops onto the real net devices (and making >> the s/w implementation a virtual net dev and a provider of the MACsec >> "offloading" ops). >> >> The advantages of the first option were that nearly all the logic of the >> s/w implementation could be kept and especially that it would be >> transparent for the user to use both implementations of MACsec. > > Hi Antoine > > We have always talked about offloading operations to the hardware, > accelerating what the linux stack can do by making use of hardware > accelerators. The basic user API should not change because of > acceleration. Those are the general guidelines. > > It would however be interesting to get comments from those who did the > software implementation and what they think of this architecture. I've > no personal experience with MACSec, so it is hard for me to say if the > current architecture makes sense when using accelerators. In terms of overall concepts, I'd add the following: 1) With current implementation it's impossible to install SW macsec engine onto the device which supports HW offload. That could be a strong limitation in cases when user sees HW macsec offload is broken or work differently, and he/she wants to replace it with SW one. MACSec is a complex feature, and it may happen something is missing in HW. Trivial example is 256bit encryption, which is not always a musthave in HW implementations. 2) I think, Antoine, its not totally true that otherwise the user macsec API will be broken/changed. netlink api is the same, the only thing we may want to add is an optional parameter to force selection of SW macsec engine. I'm also eager to hear from sw macsec users/devs on whats better here. Regards, Igor
> 1) With current implementation it's impossible to install SW macsec engine onto > the device which supports HW offload. That could be a strong limitation in > cases when user sees HW macsec offload is broken or work differently, and he/she > wants to replace it with SW one. > MACSec is a complex feature, and it may happen something is missing in HW. > Trivial example is 256bit encryption, which is not always a musthave in HW > implementations. Ideally, we want the driver to return EOPNOTSUPP if it does not support something and the software implement should be used. If the offload is broken, we want a bug report! And if it works differently, it suggests there is also a bug we need to fix, or the standard is ambiguous. It would also be nice to add extra information to the netlink API to indicate if HW or SW is being used. In other places where we offload to accelerators we have such additional information. Andrew
Hi Igor, On Tue, Aug 13, 2019 at 04:18:40PM +0000, Igor Russkikh wrote: > On 13.08.2019 16:17, Andrew Lunn wrote: > > On Tue, Aug 13, 2019 at 10:58:17AM +0200, Antoine Tenart wrote: > >> I think this question is linked to the use of a MACsec virtual interface > >> when using h/w offloading. The starting point for me was that I wanted > >> to reuse the data structures and the API exposed to the userspace by the > >> s/w implementation of MACsec. I then had two choices: keeping the exact > >> same interface for the user (having a virtual MACsec interface), or > >> registering the MACsec genl ops onto the real net devices (and making > >> the s/w implementation a virtual net dev and a provider of the MACsec > >> "offloading" ops). > >> > >> The advantages of the first option were that nearly all the logic of the > >> s/w implementation could be kept and especially that it would be > >> transparent for the user to use both implementations of MACsec. > > > > We have always talked about offloading operations to the hardware, > > accelerating what the linux stack can do by making use of hardware > > accelerators. The basic user API should not change because of > > acceleration. Those are the general guidelines. > > > > It would however be interesting to get comments from those who did the > > software implementation and what they think of this architecture. I've > > no personal experience with MACSec, so it is hard for me to say if the > > current architecture makes sense when using accelerators. > > In terms of overall concepts, I'd add the following: > > 1) With current implementation it's impossible to install SW macsec engine onto > the device which supports HW offload. That could be a strong limitation in > cases when user sees HW macsec offload is broken or work differently, and he/she > wants to replace it with SW one. > MACSec is a complex feature, and it may happen something is missing in HW. > Trivial example is 256bit encryption, which is not always a musthave in HW > implementations. Agreed. I'm not sure it would be possible to have both used at the same time but there should be a way to switch between the two implementations. That is not supported for now, but I think that would be a good thing, and can probably come later on. > 2) I think, Antoine, its not totally true that otherwise the user macsec API > will be broken/changed. netlink api is the same, the only thing we may want to > add is an optional parameter to force selection of SW macsec engine. I meant that we can either have a virtual net device representing the MACsec feature and being the iface used to configure it, or we could have it only when s/w MACsec is used. That to me is part of the "API", or at least part of what's exposed to the user. > I'm also eager to hear from sw macsec users/devs on whats better here. I'd like more comments as well :) Thanks! Antoine
Hi Andrew, On Tue, Aug 13, 2019 at 06:28:23PM +0200, Andrew Lunn wrote: > > It would also be nice to add extra information to the netlink API to > indicate if HW or SW is being used. In other places where we offload > to accelerators we have such additional information. Yes, that would be very nice to have. Thanks! Antoine
On 8/13/19 9:28 AM, Andrew Lunn wrote: >> 1) With current implementation it's impossible to install SW macsec engine onto >> the device which supports HW offload. That could be a strong limitation in >> cases when user sees HW macsec offload is broken or work differently, and he/she >> wants to replace it with SW one. >> MACSec is a complex feature, and it may happen something is missing in HW. >> Trivial example is 256bit encryption, which is not always a musthave in HW >> implementations. > > Ideally, we want the driver to return EOPNOTSUPP if it does not > support something and the software implement should be used. > > If the offload is broken, we want a bug report! And if it works > differently, it suggests there is also a bug we need to fix, or the > standard is ambiguous. > > It would also be nice to add extra information to the netlink API to > indicate if HW or SW is being used. In other places where we offload > to accelerators we have such additional information. Igor's point is entirely valid in that you should allow both offload to HW for what is possible and offload to a software implementation for what is not supported in HW. Let's an example that is hopefully more familiar to the people in this thread. Consider a NIC that can do single VLAN tag offload on xmit (or receive, does not matter), and you find yourself using a double VLAN tag configuration. You would create a first VLAN stacked network device which is fully offloaded onto the underlying NIC, and a second VLAN stacked network device on top of the first once which is not offloaded. The way I would imagine a MACsec offload is kind of similar here, in that it would be a macsec virtual network device on top of an underlying physical device. If no offload is possible, the virtual network device's xmit/receive path is used. If the NIC driver can offload it, it does not. How it does it, whether at the MAC/DMA level, or at the PHY level can be a check added at the appropriate level.
2019-08-13, 10:58:17 +0200, Antoine Tenart wrote: > Hi Igor, > > On Sat, Aug 10, 2019 at 01:20:32PM +0000, Igor Russkikh wrote: > > On 08.08.2019 17:05, Antoine Tenart wrote: > > > > > The Rx and TX handlers are modified to take in account the special case > > > were the MACsec transformation happens in the hardware, whether in a PHY > > > or in a MAC, as the packets seen by the networking stack on both the > > > > Don't you think we may eventually may need xmit / handle_frame ops to be > > a part of macsec_ops? > > > > That way software macsec could be extract to just another type of offload. > > The drawback of current code is it doesn't show explicitly the path of > > offloaded packets. It is hidden in `handle_not_macsec` and in > > `macsec_start_xmit` branch. This makes incorrect counters to tick (see my below > > comment) > > > > Another thing is that both xmit / macsec_handle_frame can't now be customized > > by device driver. But this may be required. > > We for example have usecases and HW features to allow specific flows to bypass > > macsec encryption. This is normally used for macsec key control protocols, > > identified by ethertype. Your phy is also capable on that as I see. > > I think this question is linked to the use of a MACsec virtual interface > when using h/w offloading. The starting point for me was that I wanted > to reuse the data structures and the API exposed to the userspace by the > s/w implementation of MACsec. I then had two choices: keeping the exact > same interface for the user (having a virtual MACsec interface), or Unless it's really infeasible, yes, that's how things should be done IMO. > registering the MACsec genl ops onto the real net devices (and making > the s/w implementation a virtual net dev and a provider of the MACsec > "offloading" ops). Please, no :( Let's keep it as close as possible to the software implementation, unless there's a really good reason not to. It's not just "ip macsec" btw, wpa_supplicant can also configure MACsec and would also need some logic to pick the device on which to do the genl operations in that case. > The advantages of the first option were that nearly all the logic of the > s/w implementation could be kept and especially that it would be > transparent for the user to use both implementations of MACsec. But this > raised an issue as I had to modify the xmit / handle_frame ops to let > all the traffic pass. This is because we have no way of knowing if a > frame was handled by the MACsec h/w or not in ingress. So the virtual > interface here only serve as the entrypoint for the API... It's also the interface on which you'll run DHCP or install IP addresses. > The second option would have the advantage to better represent the actual > flow, but the way of configuring MACsec would be a bit different for the > user, whether he wants to use s/w or h/w MACsec. If we were to do this I > think we could extract the genl functions from the MACsec s/w > implementation, and let it implement the MACsec ops (exactly as the > offloading drivers). > > I'm open to discussing this :) > > As for the need for xmit / handle_frame ops (for a MAC w/ MACsec > offloading), I'd say the xmit / handle_frame ops of the real net device > driver could be used as the one of the MACsec virtual interface do not > do much (regardless of the implementation choice discussed above). There's no "handle_frame" op on a real device. macsec_handle_frame is an rx_handler specificity that grabs packets from a real device and sends them into a virtual device stacked on top of it. A real device just hands packets over to the stack via NAPI. > > > @@ -2546,11 +2814,15 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, > > > { > > > struct macsec_dev *macsec = netdev_priv(dev); > > > struct macsec_secy *secy = &macsec->secy; > > > + struct macsec_tx_sc *tx_sc = &secy->tx_sc; > > > struct pcpu_secy_stats *secy_stats; > > > + struct macsec_tx_sa *tx_sa; > > > int ret, len; > > > > > > + tx_sa = macsec_txsa_get(tx_sc->sa[tx_sc->encoding_sa]); > > > > Declared, but not used? > > I'll remove it then. That's also a refcount leak, so, yes, please get rid of it. [I'll answer the rest of the patch separately]
2019-08-13, 18:28:23 +0200, Andrew Lunn wrote: > > 1) With current implementation it's impossible to install SW macsec engine onto > > the device which supports HW offload. That could be a strong limitation in > > cases when user sees HW macsec offload is broken or work differently, and he/she > > wants to replace it with SW one. > > MACSec is a complex feature, and it may happen something is missing in HW. > > Trivial example is 256bit encryption, which is not always a musthave in HW > > implementations. > > Ideally, we want the driver to return EOPNOTSUPP if it does not > support something and the software implement should be used. > > If the offload is broken, we want a bug report! And if it works > differently, it suggests there is also a bug we need to fix, or the > standard is ambiguous. Yes. But in the meantime, we want the user to be able to disable the offload. It's helpful for debugging purposes, and it can provide some level of functionality until the bug is fixed or non-buggy hardware becomes available. > It would also be nice to add extra information to the netlink API to > indicate if HW or SW is being used. In other places where we offload > to accelerators we have such additional information. +1
2019-08-13, 16:18:40 +0000, Igor Russkikh wrote: > On 13.08.2019 16:17, Andrew Lunn wrote: > > On Tue, Aug 13, 2019 at 10:58:17AM +0200, Antoine Tenart wrote: > >> I think this question is linked to the use of a MACsec virtual interface > >> when using h/w offloading. The starting point for me was that I wanted > >> to reuse the data structures and the API exposed to the userspace by the > >> s/w implementation of MACsec. I then had two choices: keeping the exact > >> same interface for the user (having a virtual MACsec interface), or > >> registering the MACsec genl ops onto the real net devices (and making > >> the s/w implementation a virtual net dev and a provider of the MACsec > >> "offloading" ops). > >> > >> The advantages of the first option were that nearly all the logic of the > >> s/w implementation could be kept and especially that it would be > >> transparent for the user to use both implementations of MACsec. > > > > Hi Antoine > > > > We have always talked about offloading operations to the hardware, > > accelerating what the linux stack can do by making use of hardware > > accelerators. The basic user API should not change because of > > acceleration. Those are the general guidelines. > > > > It would however be interesting to get comments from those who did the > > software implementation and what they think of this architecture. I've > > no personal experience with MACSec, so it is hard for me to say if the > > current architecture makes sense when using accelerators. > > In terms of overall concepts, I'd add the following: > > 1) With current implementation it's impossible to install SW macsec engine onto > the device which supports HW offload. You mean how it's implemented in this patchset? > That could be a strong limitation in > cases when user sees HW macsec offload is broken or work differently, and he/she > wants to replace it with SW one. Agreed, I think an offload that cannot be disabled is quite problematic. > MACSec is a complex feature, and it may happen something is missing in HW. > Trivial example is 256bit encryption, which is not always a musthave in HW > implementations. +1 > 2) I think, Antoine, its not totally true that otherwise the user macsec API > will be broken/changed. netlink api is the same, the only thing we may want to > add is an optional parameter to force selection of SW macsec engine. Yes, I think we need an offload on/off parameter (and IMO it should probably be off by default). Then, if offloading is requested but cannot be satisfied (unsupported key length, too many SAs, etc), or if incompatible settings are requested (mixing offloaded and non-offloaded SCs on a device that cannot do it), return an error. If we also export that offload parameter during netlink dumps, we can inspect the state of the system, which helps for debugging. > I'm also eager to hear from sw macsec users/devs on whats better here. I don't do much development on MACsec these days, and I don't personally use it outside of testing and development.
Hi Sabrina, On Fri, Aug 16, 2019 at 03:29:59PM +0200, Sabrina Dubroca wrote: > 2019-08-13, 16:18:40 +0000, Igor Russkikh wrote: > > On 13.08.2019 16:17, Andrew Lunn wrote: > > > That could be a strong limitation in > > cases when user sees HW macsec offload is broken or work differently, and he/she > > wants to replace it with SW one. > > Agreed, I think an offload that cannot be disabled is quite problematic. > > > MACSec is a complex feature, and it may happen something is missing in HW. > > Trivial example is 256bit encryption, which is not always a musthave in HW > > implementations. > > +1 > > > 2) I think, Antoine, its not totally true that otherwise the user macsec API > > will be broken/changed. netlink api is the same, the only thing we may want to > > add is an optional parameter to force selection of SW macsec engine. > > Yes, I think we need an offload on/off parameter (and IMO it should > probably be off by default). Then, if offloading is requested but > cannot be satisfied (unsupported key length, too many SAs, etc), or if > incompatible settings are requested (mixing offloaded and > non-offloaded SCs on a device that cannot do it), return an error. > > If we also export that offload parameter during netlink dumps, we can > inspect the state of the system, which helps for debugging. So it seems the ability to enable or disable the offloading on a given interface is the main missing feature. I'll add that, however I'll probably (at least at first): - Have the interface to be fully offloaded or fully handled in s/w (with errors being thrown if a given configuration isn't supported). Having both at the same time on a given interface would be tricky because of the MACsec validation parameter. - Won't allow to enable/disable the offloading of there are rules in place, as we're not sure the same rules would be accepted by the other implementation. I'm not sure if we should allow to mix the implementations on a given physical interface (by having two MACsec interfaces attached) as the validation would be impossible to do (we would have no idea if a packet was correctly handled by the offloading part or just not being a MACsec packet in the first place, in Rx). I agree the offloading should be disabled by default, and only enabled by an user explicitly. Thanks, Antoine
Hi Andrew, On Tue, Aug 13, 2019 at 06:28:23PM +0200, Andrew Lunn wrote: > > 1) With current implementation it's impossible to install SW macsec engine onto > > the device which supports HW offload. That could be a strong limitation in > > cases when user sees HW macsec offload is broken or work differently, and he/she > > wants to replace it with SW one. > > MACSec is a complex feature, and it may happen something is missing in HW. > > Trivial example is 256bit encryption, which is not always a musthave in HW > > implementations. > > It would also be nice to add extra information to the netlink API to > indicate if HW or SW is being used. In other places where we offload > to accelerators we have such additional information. Agreed, in addition to being able to enable/disable the offloading we should have a way to know if a MACsec interface is being offloaded or not. Thanks! Antoine
Hi Sabrina, On Fri, Aug 16, 2019 at 03:25:00PM +0200, Sabrina Dubroca wrote: > 2019-08-13, 10:58:17 +0200, Antoine Tenart wrote: > > > > As for the need for xmit / handle_frame ops (for a MAC w/ MACsec > > offloading), I'd say the xmit / handle_frame ops of the real net device > > driver could be used as the one of the MACsec virtual interface do not > > do much (regardless of the implementation choice discussed above). > > There's no "handle_frame" op on a real device. macsec_handle_frame is > an rx_handler specificity that grabs packets from a real device and > sends them into a virtual device stacked on top of it. A real device > just hands packets over to the stack via NAPI. Right, so if there is a need for a custom implementation of xmit / hamdle_frame we could let the offloading driver provide it. I'll probably let the first MAC implementation add it, as we agreed not to add an API with no user (and mine is done in the PHY). Thanks! Antoine
2019-08-20, 12:01:40 +0200, Antoine Tenart wrote: > So it seems the ability to enable or disable the offloading on a given > interface is the main missing feature. I'll add that, however I'll > probably (at least at first): > > - Have the interface to be fully offloaded or fully handled in s/w (with > errors being thrown if a given configuration isn't supported). Having > both at the same time on a given interface would be tricky because of > the MACsec validation parameter. > > - Won't allow to enable/disable the offloading of there are rules in > place, as we're not sure the same rules would be accepted by the other > implementation. That's probably quite problematic actually, because to do that you need to be able to resync the state between software and hardware, particularly packet numbers. So, yeah, we're better off having it completely blocked until we have a working implementation (if that ever happens). However, that means in case the user wants to set up something that's not offloadable, they'll have to: - configure the offloaded version until EOPNOTSUPP - tear everything down - restart from scratch without offloading That's inconvenient. Talking about packet numbers, can you describe how PN exhaustion is handled? I couldn't find much about packet numbers at all in the driver patches (I hope the hw doesn't wrap around from 2^32-1 to 0 on the same SA). At some point userspace needs to know that we're getting close to 2^32 and that it's time to re-key. Since the whole TX path of the software implementation is bypassed, it looks like the PN (as far as drivers/net/macsec.c is concerned) never increases, so userspace can't know when to negotiate a new SA. > I'm not sure if we should allow to mix the implementations on a given > physical interface (by having two MACsec interfaces attached) as the > validation would be impossible to do (we would have no idea if a > packet was correctly handled by the offloading part or just not being > a MACsec packet in the first place, in Rx). That's something that really bothers me with this proposal. Quoting from the commit message: > the packets seen by the networking stack on both the physical and > MACsec virtual interface are exactly the same If the HW/driver is expected to strip the sectag, I don't see how we could ever have multiple secy's at the same time and demultiplex properly between them. That's part of the reason why I chose to have virtual interfaces: without them, picking the right secy on TX gets weird. AFAICT, it means that any filters (tc, tcpdump) need to be different between offloaded and non-offloaded cases. And it's going to be confusing to the administrator when they look at tcpdumps expecting to see MACsec frames. I don't see how this implementation handles non-macsec traffic (on TX, that would be packets sent directly through the real interface, for example by wpa_supplicant - on RX, incoming MKA traffic for wpa_supplicant). Unless I missed something, incoming MKA traffic will end up on the macsec interface as well as the lower interface (not entirely critical, as long as wpa_supplicant can grab it on the lower device, but not consistent with the software implementation). How does the driver distinguish traffic that should pass through unmodified from traffic that the HW needs to encapsulate and encrypt? If you look at IPsec offloading, the networking stack builds up the ESP header, and passes the unencrypted data down to the driver. I'm wondering if the same would be possible with MACsec offloading: the macsec virtual interface adds the header (and maybe a dummy ICV), and then the HW does the encryption. In case of HW that needs to add the sectag itself, the driver would first strip the headers that the stack created. On receive, the driver would recreate a sectag and the macsec interface would just skip all verification (decrypt, PN).
> If you look at IPsec offloading, the networking stack builds up the > ESP header, and passes the unencrypted data down to the driver. I'm > wondering if the same would be possible with MACsec offloading: the > macsec virtual interface adds the header (and maybe a dummy ICV), and > then the HW does the encryption. In case of HW that needs to add the > sectag itself, the driver would first strip the headers that the stack > created. On receive, the driver would recreate a sectag and the macsec > interface would just skip all verification (decrypt, PN). Hi Sabrina I assume the software implementation cannot make use of TSO or GSO, letting the hardware segment a big buffer up into Ethernet frames? When using hardware MACSEC, is it possible to enable these? By the time the frames have reach the PHY GSO has been done. So it sees a stream of frames it needs to encode/decode. But if you are suggesting the extra headers are added by the virtual interface, i don't think GSO can be used? My guess would be, we get a performance boost from using hardware MAC sec, but there will also be a performance boost if GSO can be enabled when it was disabled before. Andrew
> > Talking about packet numbers, can you describe how PN exhaustion is > handled? I couldn't find much about packet numbers at all in the > driver patches (I hope the hw doesn't wrap around from 2^32-1 to 0 on > the same SA). At some point userspace needs to know that we're > getting close to 2^32 and that it's time to re-key. Since the whole > TX path of the software implementation is bypassed, it looks like the > PN (as far as drivers/net/macsec.c is concerned) never increases, so > userspace can't know when to negotiate a new SA. I think there should be driver specific implementation of this functionality. As an example, our macsec HW issues an interrupt towards the host to indicate PN threshold has reached and it's time for userspace to change the keys. In contrast, current SW macsec implementation just stops this SA/secy. > I don't see how this implementation handles non-macsec traffic (on TX, > that would be packets sent directly through the real interface, for > example by wpa_supplicant - on RX, incoming MKA traffic for > wpa_supplicant). Unless I missed something, incoming MKA traffic will > end up on the macsec interface as well as the lower interface (not > entirely critical, as long as wpa_supplicant can grab it on the lower > device, but not consistent with the software implementation). How does > the driver distinguish traffic that should pass through unmodified > from traffic that the HW needs to encapsulate and encrypt? I can comment on our HW engine - where it has special bypass rules for configured ethertypes. This way macsec engine skips encryption on TX and passes in RX unencrypted for the selected control packets. But thats true, realdev driver is hard to distinguish encrypted/unencrypted packets. In case realdev should make a decision where to put RX packet, it only may do some heuristic (since after macsec decription all the macsec related info is dropped. Thats true at least for our HW implementation). > If you look at IPsec offloading, the networking stack builds up the > ESP header, and passes the unencrypted data down to the driver. I'm > wondering if the same would be possible with MACsec offloading: the > macsec virtual interface adds the header (and maybe a dummy ICV), and > then the HW does the encryption. In case of HW that needs to add the > sectag itself, the driver would first strip the headers that the stack > created. On receive, the driver would recreate a sectag and the macsec > interface would just skip all verification (decrypt, PN). I don't think this way is good, as driver have to do per packet header mangling. That'll harm linerate performance heavily. Regards, Igor
Hi, I can add some information to the HW Antoine is working on, general design of it and the thoughts behind it. See below. The 08/20/2019 16:41, Sabrina Dubroca wrote: > 2019-08-20, 12:01:40 +0200, Antoine Tenart wrote: > > So it seems the ability to enable or disable the offloading on a given > > interface is the main missing feature. I'll add that, however I'll > > probably (at least at first): > > > > - Have the interface to be fully offloaded or fully handled in s/w (with > > errors being thrown if a given configuration isn't supported). Having > > both at the same time on a given interface would be tricky because of > > the MACsec validation parameter. > > > > - Won't allow to enable/disable the offloading of there are rules in > > place, as we're not sure the same rules would be accepted by the other > > implementation. > > That's probably quite problematic actually, because to do that you > need to be able to resync the state between software and hardware, > particularly packet numbers. So, yeah, we're better off having it > completely blocked until we have a working implementation (if that > ever happens). > > However, that means in case the user wants to set up something that's > not offloadable, they'll have to: > - configure the offloaded version until EOPNOTSUPP > - tear everything down > - restart from scratch without offloading > > That's inconvenient. > > Talking about packet numbers, can you describe how PN exhaustion is > handled? I couldn't find much about packet numbers at all in the > driver patches (I hope the hw doesn't wrap around from 2^32-1 to 0 on > the same SA). New SA's are suppose to be installed ahead of time. The HW will automatic move to the next SA and reset the PN. > At some point userspace needs to know that we're > getting close to 2^32 and that it's time to re-key. Since the whole > TX path of the software implementation is bypassed, it looks like the > PN (as far as drivers/net/macsec.c is concerned) never increases, so > userspace can't know when to negotiate a new SA. > > > I'm not sure if we should allow to mix the implementations on a given > > physical interface (by having two MACsec interfaces attached) as the > > validation would be impossible to do (we would have no idea if a > > packet was correctly handled by the offloading part or just not being > > a MACsec packet in the first place, in Rx). > > That's something that really bothers me with this proposal. Quoting > from the commit message: > > > the packets seen by the networking stack on both the physical and > > MACsec virtual interface are exactly the same > > If the HW/driver is expected to strip the sectag, I don't see how we > could ever have multiple secy's at the same time and demultiplex > properly between them. That's part of the reason why I chose to have > virtual interfaces: without them, picking the right secy on TX gets > weird. The HW does frame clasification, and use the claisfication to associate frames to a given secy. We we in SW have eth0, with 2 vlan-sub interfaces, and enable macsec on those, then we have: eth0 eth0.10 eth0.10.macsec eth0.20 eth0.20.macsec In this case the HW needs to be configured to match vlan 10 to one secy, and vlan 20 to an other one. This is nor supported in the current patch, but is something we can add later. We just wanted to get the basic functionallity done right before moving on to this. But in the current design, there is nothing that prevent us from adding this. If anyone is interested in the details of this then it is described in section 3.6.3 in http://ww1.microchip.com/downloads/en/DeviceDoc/VMDS-10455.pdf It is possible to construct an encapsulation that the HW can not classify correctly. If that is the case, then we should reject the HW offload, and use the SW. But it is a good point, which I think is missing. We should properly reject MACsec HW offload (with this driver, in this state) on virtual interfaces, if the encapsulation can not be handled (could start by reject all virtual interfaces) > AFAICT, it means that any filters (tc, tcpdump) need to be different > between offloaded and non-offloaded cases. It will see the result of the offloaded operation. But I guess that this is no different from when 'tc' operations are offloaded to HW. > How does the driver distinguish traffic that should pass through unmodified > from traffic that the HW needs to encapsulate and encrypt? It relay on frame classification (I think Antoine is missing a flow configuration to bypass all MKA traffic). > If you look at IPsec offloading, the networking stack builds up the > ESP header, and passes the unencrypted data down to the driver. I'm > wondering if the same would be possible with MACsec offloading: the > macsec virtual interface adds the header (and maybe a dummy ICV), and > then the HW does the encryption. In case of HW that needs to add the > sectag itself, the driver would first strip the headers that the stack > created. On receive, the driver would recreate a sectag and the macsec > interface would just skip all verification (decrypt, PN). I do not think this is possible with this HW, nor do I think this is desirable. One of the big differences between MACsec and IPsec, is the fact that it is a L2 protocol, designed to be running on switches (this is how MACsec claims to limit key-distributions issues, as all frames are decrypted when entering a switch, and encrypted with a new key when leaving). If a SW stack needs to add a tag to the frame, then we cannot offload traffic being bridged in HW, and never goes to the CPU. /Allan
The 08/21/2019 09:20, Igor Russkikh wrote: > > Talking about packet numbers, can you describe how PN exhaustion is > > handled? I couldn't find much about packet numbers at all in the > > driver patches (I hope the hw doesn't wrap around from 2^32-1 to 0 on > > the same SA). At some point userspace needs to know that we're > > getting close to 2^32 and that it's time to re-key. Since the whole > > TX path of the software implementation is bypassed, it looks like the > > PN (as far as drivers/net/macsec.c is concerned) never increases, so > > userspace can't know when to negotiate a new SA. > > I think there should be driver specific implementation of this functionality. > As an example, our macsec HW issues an interrupt towards the host to indicate > PN threshold has reached and it's time for userspace to change the keys. > > In contrast, current SW macsec implementation just stops this SA/secy. > > > I don't see how this implementation handles non-macsec traffic (on TX, > > that would be packets sent directly through the real interface, for > > example by wpa_supplicant - on RX, incoming MKA traffic for > > wpa_supplicant). Unless I missed something, incoming MKA traffic will > > end up on the macsec interface as well as the lower interface (not > > entirely critical, as long as wpa_supplicant can grab it on the lower > > device, but not consistent with the software implementation). How does > > the driver distinguish traffic that should pass through unmodified > > from traffic that the HW needs to encapsulate and encrypt? > > I can comment on our HW engine - where it has special bypass rules > for configured ethertypes. This way macsec engine skips encryption on TX and > passes in RX unencrypted for the selected control packets. In our case it is a TCAM which can look at various fields (including ethertype), but is sounds like we have a vary similar design. > But thats true, realdev driver is hard to distinguish encrypted/unencrypted > packets. In case realdev should make a decision where to put RX packet, > it only may do some heuristic (since after macsec decription all the > macsec related info is dropped. Thats true at least for our HW implementation). Same for ours. > > If you look at IPsec offloading, the networking stack builds up the > > ESP header, and passes the unencrypted data down to the driver. I'm > > wondering if the same would be possible with MACsec offloading: the > > macsec virtual interface adds the header (and maybe a dummy ICV), and > > then the HW does the encryption. In case of HW that needs to add the > > sectag itself, the driver would first strip the headers that the stack > > created. On receive, the driver would recreate a sectag and the macsec > > interface would just skip all verification (decrypt, PN). > > I don't think this way is good, as driver have to do per packet header mangling. > That'll harm linerate performance heavily. Agree, and it will also prevent MACsec offload in offloaded bridge devices. /Allan
Hi Sabrina, On Tue, Aug 20, 2019 at 04:41:19PM +0200, Sabrina Dubroca wrote: > 2019-08-20, 12:01:40 +0200, Antoine Tenart wrote: > > So it seems the ability to enable or disable the offloading on a given > > interface is the main missing feature. I'll add that, however I'll > > probably (at least at first): > > > > - Have the interface to be fully offloaded or fully handled in s/w (with > > errors being thrown if a given configuration isn't supported). Having > > both at the same time on a given interface would be tricky because of > > the MACsec validation parameter. > > > > - Won't allow to enable/disable the offloading of there are rules in > > place, as we're not sure the same rules would be accepted by the other > > implementation. > > That's probably quite problematic actually, because to do that you > need to be able to resync the state between software and hardware, > particularly packet numbers. So, yeah, we're better off having it > completely blocked until we have a working implementation (if that > ever happens). > > However, that means in case the user wants to set up something that's > not offloadable, they'll have to: > - configure the offloaded version until EOPNOTSUPP > - tear everything down > - restart from scratch without offloading > > That's inconvenient. That's right, the user might have to replay the whole configuration if on rule failed to match the h/w requirements. It's inconvenient, but I think it's better to be safe until we have (if that happens) a working implementation of synchronizing the rules' state. > Talking about packet numbers, can you describe how PN exhaustion is > handled? I couldn't find much about packet numbers at all in the > driver patches (I hope the hw doesn't wrap around from 2^32-1 to 0 on > the same SA). At some point userspace needs to know that we're > getting close to 2^32 and that it's time to re-key. Since the whole > TX path of the software implementation is bypassed, it looks like the > PN (as far as drivers/net/macsec.c is concerned) never increases, so > userspace can't know when to negotiate a new SA. That's a very good point. It actually was on my todo list for the next version (I wanted to discuss the other points first). We would also need to sync the stats at some point. > > I'm not sure if we should allow to mix the implementations on a given > > physical interface (by having two MACsec interfaces attached) as the > > validation would be impossible to do (we would have no idea if a > > packet was correctly handled by the offloading part or just not being > > a MACsec packet in the first place, in Rx). > > That's something that really bothers me with this proposal. Quoting > from the commit message: > > > the packets seen by the networking stack on both the physical and > > MACsec virtual interface are exactly the same That bothers me as well. > If the HW/driver is expected to strip the sectag, I don't see how we > could ever have multiple secy's at the same time and demultiplex > properly between them. That's part of the reason why I chose to have > virtual interfaces: without them, picking the right secy on TX gets > weird. > > AFAICT, it means that any filters (tc, tcpdump) need to be different > between offloaded and non-offloaded cases. > > And it's going to be confusing to the administrator when they look at > tcpdumps expecting to see MACsec frames. Right. I did not see how *not* to strip the sectag in the h/w back then, I'll have another look because that would improve things a lot. @all: do other MACsec offloading hardware allow not to stip the sectag? > I don't see how this implementation handles non-macsec traffic (on TX, > that would be packets sent directly through the real interface, for > example by wpa_supplicant - on RX, incoming MKA traffic for > wpa_supplicant). Unless I missed something, incoming MKA traffic will > end up on the macsec interface as well as the lower interface (not > entirely critical, as long as wpa_supplicant can grab it on the lower > device, but not consistent with the software implementation). That's right, as we have no way to tell if an Rx packet was MACsec or non-MACsec traffic, both will end up on both interfaces. Some h/w may be able to insert a custom header (or may allow not to strip the sectag), but I did not find anything related to this in mine (I'll double check). > How does the driver distinguish traffic that should pass through > unmodified from traffic that the HW needs to encapsulate and encrypt? At least in PHYs, packets go in a classification unit (that can match on multiple parts of the packet, given the hardware capabilities, eg. the MAC addresses). The result of the match is an action, which can be "bypass the MACsec block" or "go through it (which links the packet to a given configuration)". This is done in Rx and in Tx, and this is how the h/w block will know what to encapsulate and encrypt. Thanks, Antoine
> Right. I did not see how *not* to strip the sectag in the h/w back then, > I'll have another look because that would improve things a lot. > > @all: do other MACsec offloading hardware allow not to stip the sectag? I've just checked this, and see an action option in our HW classifier to keep macsec header with optional error information added. But we've never experimented configuring this honestly, I don't think we should rely in general design that such a feature is widely available in HW. Regards, Igor
diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 3815cb6e9bf2..74f0e06a9fc2 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -11,11 +11,13 @@ #include <linux/module.h> #include <crypto/aead.h> #include <linux/etherdevice.h> +#include <linux/netdevice.h> #include <linux/rtnetlink.h> #include <linux/refcount.h> #include <net/genetlink.h> #include <net/sock.h> #include <net/gro_cells.h> +#include <linux/phy.h> #include <uapi/linux/if_macsec.h> @@ -318,6 +320,44 @@ static void macsec_set_shortlen(struct macsec_eth_header *h, size_t data_len) h->short_length = data_len; } +/* Checks if underlying layers implement MACsec offloading functions + * and returns a pointer to the MACsec ops struct if any (also updates + * the MACsec context device reference if provided). + */ +static const struct macsec_ops *macsec_get_ops(struct macsec_dev *dev, + struct macsec_context *ctx) +{ + struct phy_device *phydev; + + if (!dev || !dev->real_dev) + return NULL; + + /* Check if the PHY device provides MACsec ops */ + phydev = dev->real_dev->phydev; + if (phydev && phydev->macsec_ops) { + if (ctx) { + memset(ctx, 0, sizeof(*ctx)); + ctx->phydev = phydev; + ctx->is_phy = 1; + } + + return phydev->macsec_ops; + } + + /* Check if the net device provides MACsec ops */ + if (dev->real_dev->features & NETIF_F_HW_MACSEC && + dev->real_dev->macsec_ops) { + if (ctx) { + memset(ctx, 0, sizeof(*ctx)); + ctx->netdev = dev->real_dev; + } + + return dev->real_dev->macsec_ops; + } + + return NULL; +} + /* validate MACsec packet according to IEEE 802.1AE-2006 9.12 */ static bool macsec_validate_skb(struct sk_buff *skb, u16 icv_len) { @@ -867,8 +907,10 @@ static struct macsec_rx_sc *find_rx_sc_rtnl(struct macsec_secy *secy, sci_t sci) return NULL; } -static void handle_not_macsec(struct sk_buff *skb) +static enum rx_handler_result handle_not_macsec(struct sk_buff *skb) { + /* Deliver to the uncontrolled port by default */ + enum rx_handler_result ret = RX_HANDLER_PASS; struct macsec_rxh_data *rxd; struct macsec_dev *macsec; @@ -883,7 +925,8 @@ static void handle_not_macsec(struct sk_buff *skb) struct sk_buff *nskb; struct pcpu_secy_stats *secy_stats = this_cpu_ptr(macsec->stats); - if (macsec->secy.validate_frames == MACSEC_VALIDATE_STRICT) { + if (!macsec_get_ops(macsec, NULL) && + macsec->secy.validate_frames == MACSEC_VALIDATE_STRICT) { u64_stats_update_begin(&secy_stats->syncp); secy_stats->stats.InPktsNoTag++; u64_stats_update_end(&secy_stats->syncp); @@ -902,9 +945,17 @@ static void handle_not_macsec(struct sk_buff *skb) secy_stats->stats.InPktsUntagged++; u64_stats_update_end(&secy_stats->syncp); } + + if (netif_running(macsec->secy.netdev) && + macsec_get_ops(macsec, NULL)) { + ret = RX_HANDLER_EXACT; + goto out; + } } +out: rcu_read_unlock(); + return ret; } static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) @@ -929,12 +980,8 @@ static rx_handler_result_t macsec_handle_frame(struct sk_buff **pskb) goto drop_direct; hdr = macsec_ethhdr(skb); - if (hdr->eth.h_proto != htons(ETH_P_MACSEC)) { - handle_not_macsec(skb); - - /* and deliver to the uncontrolled port */ - return RX_HANDLER_PASS; - } + if (hdr->eth.h_proto != htons(ETH_P_MACSEC)) + return handle_not_macsec(skb); skb = skb_unshare(skb, GFP_ATOMIC); *pskb = skb; @@ -1439,6 +1486,40 @@ static const struct nla_policy macsec_genl_sa_policy[NUM_MACSEC_SA_ATTR] = { .len = MACSEC_MAX_KEY_LEN, }, }; +/* Offloads an operation to a device driver */ +static int macsec_offload(int (* const func)(struct macsec_context *), + struct macsec_context *ctx) +{ + int ret; + + if (unlikely(!func)) + return 0; + + if (ctx->is_phy) + mutex_lock(&ctx->phydev->lock); + + /* Phase I: prepare. The drive should fail here if there are going to be + * issues in the commit phase. + */ + ctx->prepare = true; + ret = (*func)(ctx); + if (ret) + goto phy_unlock; + + /* Phase II: commit. This step cannot fail. */ + ctx->prepare = false; + ret = (*func)(ctx); + /* This should never happen: commit is not allowed to fail */ + if (unlikely(ret)) + WARN(1, "MACsec offloading commit failed (%d)\n", ret); + +phy_unlock: + if (ctx->is_phy) + mutex_unlock(&ctx->phydev->lock); + + return ret; +} + static int parse_sa_config(struct nlattr **attrs, struct nlattr **tb_sa) { if (!attrs[MACSEC_ATTR_SA_CONFIG]) @@ -1490,11 +1571,14 @@ static int macsec_add_rxsa(struct sk_buff *skb, struct genl_info *info) struct net_device *dev; struct nlattr **attrs = info->attrs; struct macsec_secy *secy; - struct macsec_rx_sc *rx_sc; + struct macsec_rx_sc *rx_sc, *prev_sc; struct macsec_rx_sa *rx_sa; + const struct macsec_ops *ops; + struct macsec_context ctx; unsigned char assoc_num; struct nlattr *tb_rxsc[MACSEC_RXSC_ATTR_MAX + 1]; struct nlattr *tb_sa[MACSEC_SA_ATTR_MAX + 1]; + bool was_active; int err; if (!attrs[MACSEC_ATTR_IFINDEX]) @@ -1551,11 +1635,32 @@ static int macsec_add_rxsa(struct sk_buff *skb, struct genl_info *info) spin_unlock_bh(&rx_sa->lock); } + was_active = rx_sa->active; if (tb_sa[MACSEC_SA_ATTR_ACTIVE]) rx_sa->active = !!nla_get_u8(tb_sa[MACSEC_SA_ATTR_ACTIVE]); - nla_memcpy(rx_sa->key.id, tb_sa[MACSEC_SA_ATTR_KEYID], MACSEC_KEYID_LEN); + prev_sc = rx_sa->sc; rx_sa->sc = rx_sc; + + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.sa.assoc_num = assoc_num; + ctx.sa.rx_sa = rx_sa; + memcpy(ctx.sa.key, nla_data(tb_sa[MACSEC_SA_ATTR_KEY]), + MACSEC_KEYID_LEN); + + err = macsec_offload(ops->mdo_add_rxsa, &ctx); + if (err) { + rx_sa->active = was_active; + rx_sa->sc = prev_sc; + kfree(rx_sa); + rtnl_unlock(); + return err; + } + } + + nla_memcpy(rx_sa->key.id, tb_sa[MACSEC_SA_ATTR_KEYID], MACSEC_KEYID_LEN); rcu_assign_pointer(rx_sc->sa[assoc_num], rx_sa); rtnl_unlock(); @@ -1583,6 +1688,10 @@ static int macsec_add_rxsc(struct sk_buff *skb, struct genl_info *info) struct nlattr **attrs = info->attrs; struct macsec_rx_sc *rx_sc; struct nlattr *tb_rxsc[MACSEC_RXSC_ATTR_MAX + 1]; + const struct macsec_ops *ops; + struct macsec_context ctx; + bool was_active; + int ret; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1608,9 +1717,22 @@ static int macsec_add_rxsc(struct sk_buff *skb, struct genl_info *info) return PTR_ERR(rx_sc); } + was_active = rx_sc->active; if (tb_rxsc[MACSEC_RXSC_ATTR_ACTIVE]) rx_sc->active = !!nla_get_u8(tb_rxsc[MACSEC_RXSC_ATTR_ACTIVE]); + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.rx_sc = rx_sc; + + ret = macsec_offload(ops->mdo_add_rxsc, &ctx); + if (ret) { + rx_sc->active = was_active; + rtnl_unlock(); + return ret; + } + } + rtnl_unlock(); return 0; @@ -1648,8 +1770,12 @@ static int macsec_add_txsa(struct sk_buff *skb, struct genl_info *info) struct macsec_secy *secy; struct macsec_tx_sc *tx_sc; struct macsec_tx_sa *tx_sa; + const struct macsec_ops *ops; + struct macsec_context ctx; unsigned char assoc_num; struct nlattr *tb_sa[MACSEC_SA_ATTR_MAX + 1]; + bool was_operational, was_active; + u32 prev_pn; int err; if (!attrs[MACSEC_ATTR_IFINDEX]) @@ -1700,18 +1826,42 @@ static int macsec_add_txsa(struct sk_buff *skb, struct genl_info *info) return err; } - nla_memcpy(tx_sa->key.id, tb_sa[MACSEC_SA_ATTR_KEYID], MACSEC_KEYID_LEN); - spin_lock_bh(&tx_sa->lock); + prev_pn = tx_sa->next_pn; tx_sa->next_pn = nla_get_u32(tb_sa[MACSEC_SA_ATTR_PN]); spin_unlock_bh(&tx_sa->lock); + was_active = tx_sa->active; if (tb_sa[MACSEC_SA_ATTR_ACTIVE]) tx_sa->active = !!nla_get_u8(tb_sa[MACSEC_SA_ATTR_ACTIVE]); + was_operational = secy->operational; if (assoc_num == tx_sc->encoding_sa && tx_sa->active) secy->operational = true; + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.sa.assoc_num = assoc_num; + ctx.sa.tx_sa = tx_sa; + memcpy(ctx.sa.key, nla_data(tb_sa[MACSEC_SA_ATTR_KEY]), + MACSEC_KEYID_LEN); + + err = macsec_offload(ops->mdo_add_txsa, &ctx); + if (err) { + spin_lock_bh(&tx_sa->lock); + tx_sa->next_pn = prev_pn; + spin_unlock_bh(&tx_sa->lock); + + tx_sa->active = was_active; + secy->operational = was_operational; + kfree(tx_sa); + rtnl_unlock(); + return err; + } + } + + nla_memcpy(tx_sa->key.id, tb_sa[MACSEC_SA_ATTR_KEYID], MACSEC_KEYID_LEN); rcu_assign_pointer(tx_sc->sa[assoc_num], tx_sa); rtnl_unlock(); @@ -1726,9 +1876,12 @@ static int macsec_del_rxsa(struct sk_buff *skb, struct genl_info *info) struct macsec_secy *secy; struct macsec_rx_sc *rx_sc; struct macsec_rx_sa *rx_sa; + const struct macsec_ops *ops; + struct macsec_context ctx; u8 assoc_num; struct nlattr *tb_rxsc[MACSEC_RXSC_ATTR_MAX + 1]; struct nlattr *tb_sa[MACSEC_SA_ATTR_MAX + 1]; + int ret; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1752,6 +1905,19 @@ static int macsec_del_rxsa(struct sk_buff *skb, struct genl_info *info) return -EBUSY; } + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.sa.assoc_num = assoc_num; + ctx.sa.rx_sa = rx_sa; + + ret = macsec_offload(ops->mdo_del_rxsa, &ctx); + if (ret) { + rtnl_unlock(); + return ret; + } + } + RCU_INIT_POINTER(rx_sc->sa[assoc_num], NULL); clear_rx_sa(rx_sa); @@ -1766,8 +1932,11 @@ static int macsec_del_rxsc(struct sk_buff *skb, struct genl_info *info) struct net_device *dev; struct macsec_secy *secy; struct macsec_rx_sc *rx_sc; + const struct macsec_ops *ops; + struct macsec_context ctx; sci_t sci; struct nlattr *tb_rxsc[MACSEC_RXSC_ATTR_MAX + 1]; + int ret; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1794,6 +1963,17 @@ static int macsec_del_rxsc(struct sk_buff *skb, struct genl_info *info) return -ENODEV; } + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.rx_sc = rx_sc; + ret = macsec_offload(ops->mdo_del_rxsc, &ctx); + if (ret) { + rtnl_unlock(); + return ret; + } + } + free_rx_sc(rx_sc); rtnl_unlock(); @@ -1807,8 +1987,11 @@ static int macsec_del_txsa(struct sk_buff *skb, struct genl_info *info) struct macsec_secy *secy; struct macsec_tx_sc *tx_sc; struct macsec_tx_sa *tx_sa; + const struct macsec_ops *ops; + struct macsec_context ctx; u8 assoc_num; struct nlattr *tb_sa[MACSEC_SA_ATTR_MAX + 1]; + int ret; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1829,6 +2012,19 @@ static int macsec_del_txsa(struct sk_buff *skb, struct genl_info *info) return -EBUSY; } + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.sa.assoc_num = assoc_num; + ctx.sa.tx_sa = tx_sa; + + ret = macsec_offload(ops->mdo_del_txsa, &ctx); + if (ret) { + rtnl_unlock(); + return ret; + } + } + RCU_INIT_POINTER(tx_sc->sa[assoc_num], NULL); clear_tx_sa(tx_sa); @@ -1865,8 +2061,13 @@ static int macsec_upd_txsa(struct sk_buff *skb, struct genl_info *info) struct macsec_secy *secy; struct macsec_tx_sc *tx_sc; struct macsec_tx_sa *tx_sa; + const struct macsec_ops *ops; + struct macsec_context ctx; u8 assoc_num; struct nlattr *tb_sa[MACSEC_SA_ATTR_MAX + 1]; + bool was_operational, was_active; + u32 prev_pn = 0; + int ret = 0; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1887,19 +2088,41 @@ static int macsec_upd_txsa(struct sk_buff *skb, struct genl_info *info) if (tb_sa[MACSEC_SA_ATTR_PN]) { spin_lock_bh(&tx_sa->lock); + prev_pn = tx_sa->next_pn; tx_sa->next_pn = nla_get_u32(tb_sa[MACSEC_SA_ATTR_PN]); spin_unlock_bh(&tx_sa->lock); } + was_active = tx_sa->active; if (tb_sa[MACSEC_SA_ATTR_ACTIVE]) tx_sa->active = nla_get_u8(tb_sa[MACSEC_SA_ATTR_ACTIVE]); + was_operational = secy->operational; if (assoc_num == tx_sc->encoding_sa) secy->operational = tx_sa->active; + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.sa.assoc_num = assoc_num; + ctx.sa.tx_sa = tx_sa; + + ret = macsec_offload(ops->mdo_upd_txsa, &ctx); + if (ret) { + if (tb_sa[MACSEC_SA_ATTR_PN]) { + spin_lock_bh(&tx_sa->lock); + tx_sa->next_pn = prev_pn; + spin_unlock_bh(&tx_sa->lock); + } + + tx_sa->active = was_active; + secy->operational = was_operational; + } + } + rtnl_unlock(); - return 0; + return ret; } static int macsec_upd_rxsa(struct sk_buff *skb, struct genl_info *info) @@ -1909,9 +2132,14 @@ static int macsec_upd_rxsa(struct sk_buff *skb, struct genl_info *info) struct macsec_secy *secy; struct macsec_rx_sc *rx_sc; struct macsec_rx_sa *rx_sa; + const struct macsec_ops *ops; + struct macsec_context ctx; u8 assoc_num; struct nlattr *tb_rxsc[MACSEC_RXSC_ATTR_MAX + 1]; struct nlattr *tb_sa[MACSEC_SA_ATTR_MAX + 1]; + bool was_active; + u32 prev_pn = 0; + int ret = 0; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1935,15 +2163,35 @@ static int macsec_upd_rxsa(struct sk_buff *skb, struct genl_info *info) if (tb_sa[MACSEC_SA_ATTR_PN]) { spin_lock_bh(&rx_sa->lock); + prev_pn = rx_sa->next_pn; rx_sa->next_pn = nla_get_u32(tb_sa[MACSEC_SA_ATTR_PN]); spin_unlock_bh(&rx_sa->lock); } + was_active = rx_sa->active; if (tb_sa[MACSEC_SA_ATTR_ACTIVE]) rx_sa->active = nla_get_u8(tb_sa[MACSEC_SA_ATTR_ACTIVE]); + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.sa.assoc_num = assoc_num; + ctx.sa.rx_sa = rx_sa; + + ret = macsec_offload(ops->mdo_upd_rxsa, &ctx); + if (ret) { + if (tb_sa[MACSEC_SA_ATTR_PN]) { + spin_lock_bh(&rx_sa->lock); + rx_sa->next_pn = prev_pn; + spin_unlock_bh(&rx_sa->lock); + } + + rx_sa->active = was_active; + } + } + rtnl_unlock(); - return 0; + return ret; } static int macsec_upd_rxsc(struct sk_buff *skb, struct genl_info *info) @@ -1953,6 +2201,11 @@ static int macsec_upd_rxsc(struct sk_buff *skb, struct genl_info *info) struct macsec_secy *secy; struct macsec_rx_sc *rx_sc; struct nlattr *tb_rxsc[MACSEC_RXSC_ATTR_MAX + 1]; + const struct macsec_ops *ops; + struct macsec_context ctx; + unsigned int prev_n_rx_sc; + bool was_active; + int ret; if (!attrs[MACSEC_ATTR_IFINDEX]) return -EINVAL; @@ -1970,6 +2223,8 @@ static int macsec_upd_rxsc(struct sk_buff *skb, struct genl_info *info) return PTR_ERR(rx_sc); } + was_active = rx_sc->active; + prev_n_rx_sc = secy->n_rx_sc; if (tb_rxsc[MACSEC_RXSC_ATTR_ACTIVE]) { bool new = !!nla_get_u8(tb_rxsc[MACSEC_RXSC_ATTR_ACTIVE]); @@ -1979,6 +2234,19 @@ static int macsec_upd_rxsc(struct sk_buff *skb, struct genl_info *info) rx_sc->active = new; } + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.rx_sc = rx_sc; + + ret = macsec_offload(ops->mdo_upd_rxsc, &ctx); + if (ret) { + secy->n_rx_sc = prev_n_rx_sc; + rx_sc->active = was_active; + rtnl_unlock(); + return 0; + } + } + rtnl_unlock(); return 0; @@ -2546,11 +2814,15 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, { struct macsec_dev *macsec = netdev_priv(dev); struct macsec_secy *secy = &macsec->secy; + struct macsec_tx_sc *tx_sc = &secy->tx_sc; struct pcpu_secy_stats *secy_stats; + struct macsec_tx_sa *tx_sa; int ret, len; + tx_sa = macsec_txsa_get(tx_sc->sa[tx_sc->encoding_sa]); + /* 10.5 */ - if (!secy->protect_frames) { + if (!secy->protect_frames || macsec_get_ops(netdev_priv(dev), NULL)) { secy_stats = this_cpu_ptr(macsec->stats); u64_stats_update_begin(&secy_stats->syncp); secy_stats->stats.OutPktsUntagged++; @@ -2645,6 +2917,8 @@ static int macsec_dev_open(struct net_device *dev) { struct macsec_dev *macsec = macsec_priv(dev); struct net_device *real_dev = macsec->real_dev; + const struct macsec_ops *ops; + struct macsec_context ctx; int err; err = dev_uc_add(real_dev, dev->dev_addr); @@ -2663,6 +2937,14 @@ static int macsec_dev_open(struct net_device *dev) goto clear_allmulti; } + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + err = macsec_offload(ops->mdo_dev_open, &ctx); + if (err) + goto clear_allmulti; + } + if (netif_carrier_ok(real_dev)) netif_carrier_on(dev); @@ -2680,9 +2962,16 @@ static int macsec_dev_stop(struct net_device *dev) { struct macsec_dev *macsec = macsec_priv(dev); struct net_device *real_dev = macsec->real_dev; + const struct macsec_ops *ops; + struct macsec_context ctx; netif_carrier_off(dev); + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) + macsec_offload(ops->mdo_dev_stop, &ctx); + dev_mc_unsync(real_dev, dev); dev_uc_unsync(real_dev, dev); @@ -2922,6 +3211,11 @@ static int macsec_changelink(struct net_device *dev, struct nlattr *tb[], struct nlattr *data[], struct netlink_ext_ack *extack) { + struct macsec_dev *macsec = macsec_priv(dev); + struct macsec_context ctx; + const struct macsec_ops *ops; + int ret; + if (!data) return 0; @@ -2931,7 +3225,18 @@ static int macsec_changelink(struct net_device *dev, struct nlattr *tb[], data[IFLA_MACSEC_PORT]) return -EINVAL; - return macsec_changelink_common(dev, data); + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.secy = &macsec->secy; + return macsec_offload(ops->mdo_upd_secy, &ctx); + } + + ret = macsec_changelink_common(dev, data); + if (ret) + return ret; + + return 0; } static void macsec_del_dev(struct macsec_dev *macsec) @@ -2973,6 +3278,15 @@ static void macsec_dellink(struct net_device *dev, struct list_head *head) struct macsec_dev *macsec = macsec_priv(dev); struct net_device *real_dev = macsec->real_dev; struct macsec_rxh_data *rxd = macsec_data_rtnl(real_dev); + struct macsec_context ctx; + const struct macsec_ops *ops; + + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.secy = &macsec->secy; + macsec_offload(ops->mdo_del_secy, &ctx); + } macsec_common_dellink(dev, head); @@ -3069,7 +3383,10 @@ static int macsec_newlink(struct net *net, struct net_device *dev, struct netlink_ext_ack *extack) { struct macsec_dev *macsec = macsec_priv(dev); - struct net_device *real_dev; + struct net_device *real_dev, *loop_dev; + struct macsec_context ctx; + const struct macsec_ops *ops; + struct net *loop_net; int err; sci_t sci; u8 icv_len = DEFAULT_ICV_LEN; @@ -3081,6 +3398,25 @@ static int macsec_newlink(struct net *net, struct net_device *dev, if (!real_dev) return -ENODEV; + for_each_net(loop_net) { + for_each_netdev(loop_net, loop_dev) { + struct macsec_dev *priv; + + if (!netif_is_macsec(loop_dev)) + continue; + + priv = macsec_priv(loop_dev); + + /* A limitation of the MACsec h/w offloading is only a + * single MACsec interface can be created for a given + * real interface. + */ + if (macsec_get_ops(netdev_priv(dev), NULL) && + priv->real_dev == real_dev) + return -EBUSY; + } + } + dev->priv_flags |= IFF_MACSEC; macsec->real_dev = real_dev; @@ -3134,6 +3470,15 @@ static int macsec_newlink(struct net *net, struct net_device *dev, goto del_dev; } + /* If h/w offloading is available, propagate to the device */ + ops = macsec_get_ops(netdev_priv(dev), &ctx); + if (ops) { + ctx.secy = &macsec->secy; + err = macsec_offload(ops->mdo_add_secy, &ctx); + if (err) + goto del_dev; + } + err = register_macsec_dev(real_dev, dev); if (err < 0) goto del_dev;
This patch introduces the MACsec hardware offloading infrastructure. The main idea here is to re-use the logic and data structures of the software MACsec implementation. This allows not to duplicate definitions and structure storing the same kind of information. It also allows to use a unified genlink interface for both MACsec implementations (so that the same userspace tool, `ip macsec`, is used with the same arguments). The MACsec offloading support cannot be disabled if an interface supports it at the moment. The MACsec configuration is passed to device drivers supporting it through macsec_hw_offload() which is called from the MACsec genl helpers. This function calls the macsec ops of PHY and Ethernet drivers in two steps: a preparation one, and a commit one. The first step is allowed to fail and should be used to check if a provided configuration is compatible with the features provided by a MACsec engine, while the second step is not allowed to fail and should only be used to enable a given MACsec configuration. Two extra calls are made: when a virtual MACsec interface is created and when it is deleted, so that the hardware driver can stay in sync. The Rx and TX handlers are modified to take in account the special case were the MACsec transformation happens in the hardware, whether in a PHY or in a MAC, as the packets seen by the networking stack on both the physical and MACsec virtual interface are exactly the same. This leads to some limitations: the hardware and software implementations can't be used on the same physical interface, as the policies would be impossible to fulfill (such as strict validation of the frames). Also only a single virtual MACsec interface can be attached to a physical port supporting hardware offloading as it would be impossible to guess onto which interface a given packet should go (for ingress traffic). Another limitation as of now is that the counters and statistics are not reported back from the hardware to the software MACsec implementation. This isn't an issue when using offloaded MACsec transformations, but it should be added in the future so that the MACsec state can be reported to the user (which would also improve the debug). Signed-off-by: Antoine Tenart <antoine.tenart@bootlin.com> --- drivers/net/macsec.c | 379 +++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 362 insertions(+), 17 deletions(-)