Message ID | 20221124053049.1894509-2-mkp@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | Enhance support for checksum offloading | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/intel-ovs-compilation | success | test: success |
Hi, Mike and Flavio. On 11/24/22 06:30, Mike Pattrick wrote: > From: Flavio Leitner <fbl@sysclose.org> > > Document the implementation of netdev hardware offloading s/netdev hardware offloading/checksum offloading/ More on that below. > in userspace datapath. > > Signed-off-by: Flavio Leitner <fbl@sysclose.org> > Co-authored-by: Mike Pattrick <mkp@redhat.com> > Reviewed-by: David Marchand <david.marchand@redhat.com> > Signed-off-by: Mike Pattrick <mkp@redhat.com> > --- > Documentation/automake.mk | 1 + > Documentation/topics/index.rst | 1 + > Documentation/topics/netdev-offloads.rst | 95 ++++++++++++++++++++++++ > 3 files changed, 97 insertions(+) > create mode 100644 Documentation/topics/netdev-offloads.rst > > diff --git a/Documentation/automake.mk b/Documentation/automake.mk > index cdf3c9926..f7990af28 100644 > --- a/Documentation/automake.mk > +++ b/Documentation/automake.mk > @@ -49,6 +49,7 @@ DOC_SOURCE = \ > Documentation/topics/integration.rst \ > Documentation/topics/language-bindings.rst \ > Documentation/topics/networking-namespaces.rst \ > + Documentation/topics/netdev-offloads.rst \ Alphabetical order. > Documentation/topics/openflow.rst \ > Documentation/topics/ovs-extensions.rst \ > Documentation/topics/ovsdb-relay.rst \ > diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst > index 90d4c66e6..55aab1c96 100644 > --- a/Documentation/topics/index.rst > +++ b/Documentation/topics/index.rst > @@ -44,6 +44,7 @@ OVS > openflow > bonding > networking-namespaces > + netdev-offloads Alphabetical order. I see that David asked to rename this document to 'netdev-offloads', but I don't think it is actually a good name. Unfortunately, netev-offload today is kind of associated with the flow offload, i.e. lib/netdev-offload-provider.h. So, having a document named very similarly may cause some confusion. In practice, this document only covers checksum offloads, so maybe we should just name it this way? One more thing is that this document is located in the common section of the documentation, but it never explicitly mentions that it applies only to the userspace datapath. We should add this information to the text and the title. See the userspace-tso document for example. > ovsdb-relay > ovsdb-replication > dpdk/index > diff --git a/Documentation/topics/netdev-offloads.rst b/Documentation/topics/netdev-offloads.rst > new file mode 100644 > index 000000000..eb02981b7 > --- /dev/null > +++ b/Documentation/topics/netdev-offloads.rst > @@ -0,0 +1,95 @@ > +.. > + Licensed under the Apache License, Version 2.0 (the "License"); you may > + not use this file except in compliance with the License. You may obtain > + a copy of the License at > + > + http://www.apache.org/licenses/LICENSE-2.0 > + > + Unless required by applicable law or agreed to in writing, software > + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT > + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the > + License for the specific language governing permissions and limitations > + under the License. > + > + Convention for heading levels in Open vSwitch documentation: > + > + ======= Heading 0 (reserved for the title in a document) > + ------- Heading 1 > + ~~~~~~~ Heading 2 > + +++++++ Heading 3 > + ''''''' Heading 4 > + > + Avoid deeper levels because they do not render well. > + > +============ > +NIC Offloads Title should generally be in line with the file name. > +============ > + > +This document explains the internals of Open vSwitch support for NIC offloads. This document only describes checksum offloads and I didn't notice any changes to it in the larger TSO patch set, so we should probably just stick to the checksum offload here and not try to generalize the purpose of the document. > + > +Design > +------ > + > +The Open vSwitch should strive to forward packets as they arrive regardless > +if the checksum is correct, for example. However, it cannot fix existing > +problems. Therefore, when the packet has the checksum verified or the packet > +is known to be good, the checksum calculation can be offloaded to the NIC, > +otherwise updates can be made as long as the previous situation doesn't > +change. For example, a packet with has corrupted IP checksum can be 'with has' ? > +accepted, and a flow rule can change the IP destination address to Extra spaces. > +another address. In that case, OVS needs to partially recompute the checksum > +instead of offloading or calculate all of it again which would fix the > +existing issue. But we just said that OVS cannot fix existing problems and the rules below say that extra care should be taken to not fix it. So, what OVS needs to do? It can't be both/uncertain. > + > +The netdev can set flags indicating if the checksum is good or bad. s/netdev/something else.../ see also below. 'can' or 'should' ? It may be not capable of any offloading. > +The checksum is considered unverified if no flag is set. > + > +When a packet ingress the data path with good checksum, OVS should > +enable checksum offload by default. This allows the data path to s/This allows the data path to/This allows to/ > +postpone checksum updates until the packet egress the data path. > + > +When a packet egress the data path, the packet flags and the egress > +port flags are verified to make sure all required NIC offload > +features to send out the packet are available. If not, the data 'to send out the packet' seems redundant. > +path will fall back to equivalent software implementation. 'data path' can, probably, be replaced with 'OVS' or 'Open vSwitch' in most of the occurrences. Lines in the section above are getting weirdly shorter and shorter. :) > + > + > +Netdev > +------ > + > +When the netdev initiates, it should set the flags to tell the data path > +which offload features are supported. For example, if the driver supports > +IP checksum offloading, then netdev->ol_flags should set the flag > +NETDEV_OFFLOAD_TX_IPV4_CSUM. I'm not sure if the word 'netdev' is a right choice for this kind of documentation. We should use something like 'interface' or 'port' instead as these are user-visible entities in OVS. Or the term should be introduced to the reader in some way, e.g. "interface (a.k.a. 'netdev')". Note that 'port' typically means OpenFlow port. > + > + > +Rules > +----- Empty line. > +1) OVS should strive to forward all packets regardless of checksum. > + > +2) OVS must not correct a bad packet/checksum. > + > +3) Packet with flag DP_PACKET_OL_RX_IP_CSUM_GOOD means that the "A packet with the DP_PACKET_OL_RX_IP_CSUM_GOOD flag" ? > + IP checksum is present in the packet and it is good. > + > +4) Packet with flag DP_PACKET_OL_RX_IP_CSUM_BAD means that the > + IP checksum is present in the packet and it is BAD. Extra care > + should be taken to not fix the packet during data path processing. > + > +5) The ingress packet parser can only set DP_PACKET_OL_TX_IP_CSUM > + if the packet has DP_PACKET_OL_RX_IP_CSUM_GOOD to not violate > + rule #2. > + > +6) Packet with flag DP_PACKET_OL_TX_IPV4 is a IPv4 packet. > + > +7) Packet with flag DP_PACKET_OL_TX_IPV6 is a IPv6 packet. s/a/an/ ? > + > +8) Packet with flag DP_PACKET_OL_TX_IP_CSUM tells the data path > + to skip updating the IP checksum if the packet is modified. The > + IP checksum will be calculated by the egress port if that > + supports IP checksum offload, otherwise the IP checksum will > + be done in software before handing over the packet to the port. > + > +9) When there are modifications to the packet that requires checksum 'that require a checksum ...' > + update, the data path needs to remove DP_PACKET_OL_RX_IP_CSUM_GOOD > + flag, otherwise the checksum is assumed to be good in the packet. Best regards, Ilya Maximets.
diff --git a/Documentation/automake.mk b/Documentation/automake.mk index cdf3c9926..f7990af28 100644 --- a/Documentation/automake.mk +++ b/Documentation/automake.mk @@ -49,6 +49,7 @@ DOC_SOURCE = \ Documentation/topics/integration.rst \ Documentation/topics/language-bindings.rst \ Documentation/topics/networking-namespaces.rst \ + Documentation/topics/netdev-offloads.rst \ Documentation/topics/openflow.rst \ Documentation/topics/ovs-extensions.rst \ Documentation/topics/ovsdb-relay.rst \ diff --git a/Documentation/topics/index.rst b/Documentation/topics/index.rst index 90d4c66e6..55aab1c96 100644 --- a/Documentation/topics/index.rst +++ b/Documentation/topics/index.rst @@ -44,6 +44,7 @@ OVS openflow bonding networking-namespaces + netdev-offloads ovsdb-relay ovsdb-replication dpdk/index diff --git a/Documentation/topics/netdev-offloads.rst b/Documentation/topics/netdev-offloads.rst new file mode 100644 index 000000000..eb02981b7 --- /dev/null +++ b/Documentation/topics/netdev-offloads.rst @@ -0,0 +1,95 @@ +.. + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + + Convention for heading levels in Open vSwitch documentation: + + ======= Heading 0 (reserved for the title in a document) + ------- Heading 1 + ~~~~~~~ Heading 2 + +++++++ Heading 3 + ''''''' Heading 4 + + Avoid deeper levels because they do not render well. + +============ +NIC Offloads +============ + +This document explains the internals of Open vSwitch support for NIC offloads. + +Design +------ + +The Open vSwitch should strive to forward packets as they arrive regardless +if the checksum is correct, for example. However, it cannot fix existing +problems. Therefore, when the packet has the checksum verified or the packet +is known to be good, the checksum calculation can be offloaded to the NIC, +otherwise updates can be made as long as the previous situation doesn't +change. For example, a packet with has corrupted IP checksum can be +accepted, and a flow rule can change the IP destination address to +another address. In that case, OVS needs to partially recompute the checksum +instead of offloading or calculate all of it again which would fix the +existing issue. + +The netdev can set flags indicating if the checksum is good or bad. +The checksum is considered unverified if no flag is set. + +When a packet ingress the data path with good checksum, OVS should +enable checksum offload by default. This allows the data path to +postpone checksum updates until the packet egress the data path. + +When a packet egress the data path, the packet flags and the egress +port flags are verified to make sure all required NIC offload +features to send out the packet are available. If not, the data +path will fall back to equivalent software implementation. + + +Netdev +------ + +When the netdev initiates, it should set the flags to tell the data path +which offload features are supported. For example, if the driver supports +IP checksum offloading, then netdev->ol_flags should set the flag +NETDEV_OFFLOAD_TX_IPV4_CSUM. + + +Rules +----- +1) OVS should strive to forward all packets regardless of checksum. + +2) OVS must not correct a bad packet/checksum. + +3) Packet with flag DP_PACKET_OL_RX_IP_CSUM_GOOD means that the + IP checksum is present in the packet and it is good. + +4) Packet with flag DP_PACKET_OL_RX_IP_CSUM_BAD means that the + IP checksum is present in the packet and it is BAD. Extra care + should be taken to not fix the packet during data path processing. + +5) The ingress packet parser can only set DP_PACKET_OL_TX_IP_CSUM + if the packet has DP_PACKET_OL_RX_IP_CSUM_GOOD to not violate + rule #2. + +6) Packet with flag DP_PACKET_OL_TX_IPV4 is a IPv4 packet. + +7) Packet with flag DP_PACKET_OL_TX_IPV6 is a IPv6 packet. + +8) Packet with flag DP_PACKET_OL_TX_IP_CSUM tells the data path + to skip updating the IP checksum if the packet is modified. The + IP checksum will be calculated by the egress port if that + supports IP checksum offload, otherwise the IP checksum will + be done in software before handing over the packet to the port. + +9) When there are modifications to the packet that requires checksum + update, the data path needs to remove DP_PACKET_OL_RX_IP_CSUM_GOOD + flag, otherwise the checksum is assumed to be good in the packet.