diff mbox series

[ovs-dev,PATCHv4] ofproto-dpif: Expose datapath capability to ovsdb.

Message ID 1572994666-26627-1-git-send-email-u9012063@gmail.com
State Accepted
Commit 27501802d09f782b8133031c1eae3394ae5ce147
Headers show
Series [ovs-dev,PATCHv4] ofproto-dpif: Expose datapath capability to ovsdb. | expand

Commit Message

William Tu Nov. 5, 2019, 10:57 p.m. UTC
The patch adds support for fetching the datapath's capabilities
from the result of 'check_support()', and write the supported capability
to a new database column, called 'capabilities' under Datapath table.

To see how it works, run:
 # ovs-vsctl -- add-br br0 -- set Bridge br0 datapath_type=netdev
 # ovs-vsctl -- --id=@m create Datapath datapath_version=0 \
     'ct_zones={}' 'capabilities={}' \
     -- set Open_vSwitch . datapaths:"netdev"=@m

 # ovs-vsctl list-dp-cap netdev
 ufid=true sample_nesting=true clone=true tnl_push_pop=true \
 ct_orig_tuple=true ct_eventmask=true ct_state=true \
 ct_clear=true max_vlan_headers=1 recirc=true ct_label=true \
 max_hash_alg=1 ct_state_nat=true ct_timeout=true \
 ct_mark=true ct_orig_tuple6=true check_pkt_len=true \
 masked_set_action=true max_mpls_depth=3 trunc=true ct_zone=true

Signed-off-by: William Tu <u9012063@gmail.com>
Tested-by: Greg Rose <gvrose8192@gmail.com>
Reviewed-by: Greg Rose <gvrose8192@gmail.com>
---
v4:
    rebase to master
v3:
    fix 32-bit build, reported by Greg
    travis: https://travis-ci.org/williamtu/ovs-travis/builds/599276267
v2:
	rebase to master
---
 ofproto/ofproto-dpif.c     | 51 +++++++++++++++++++++++++++++++
 ofproto/ofproto-provider.h |  2 ++
 ofproto/ofproto.c          | 12 ++++++++
 ofproto/ofproto.h          |  2 ++
 tests/ovs-vsctl.at         | 10 ++++++
 utilities/ovs-vsctl.8.in   |  6 ++++
 utilities/ovs-vsctl.c      | 28 +++++++++++++++++
 vswitchd/bridge.c          | 21 +++++++++++++
 vswitchd/vswitch.ovsschema |  5 ++-
 vswitchd/vswitch.xml       | 76 ++++++++++++++++++++++++++++++++++++++++++++++
 10 files changed, 212 insertions(+), 1 deletion(-)

Comments

Ben Pfaff Nov. 6, 2019, 12:49 a.m. UTC | #1
On Tue, Nov 05, 2019 at 02:57:46PM -0800, William Tu wrote:
> +    /* ODP_SUPPORT_FIELDS */
> +    str_value = xasprintf("%"PRIuSIZE, odp.max_vlan_headers);
> +    smap_add(cap, "max_vlan_headers", str_value);
> +    free(str_value);

I think that you can shorten the above to:
    smap_add_format(cap, "max_vlan_headers", "%"PRIuSIZE, odp.max_vlan_headers);
and similarly for other cases.

I think that we can improve the documentation.  I'm working on a
suggestion for that, please give me a while to write it up.
William Tu Nov. 6, 2019, 8:35 p.m. UTC | #2
On Tue, Nov 05, 2019 at 04:49:10PM -0800, Ben Pfaff wrote:
> On Tue, Nov 05, 2019 at 02:57:46PM -0800, William Tu wrote:
> > +    /* ODP_SUPPORT_FIELDS */
> > +    str_value = xasprintf("%"PRIuSIZE, odp.max_vlan_headers);
> > +    smap_add(cap, "max_vlan_headers", str_value);
> > +    free(str_value);
> 
> I think that you can shorten the above to:
>     smap_add_format(cap, "max_vlan_headers", "%"PRIuSIZE, odp.max_vlan_headers);
> and similarly for other cases.

Yes, thank you.
> 
> I think that we can improve the documentation.  I'm working on a
> suggestion for that, please give me a while to write it up.

Sure, I will wait.

--William
Ben Pfaff Nov. 20, 2019, 7:25 p.m. UTC | #3
On Wed, Nov 06, 2019 at 12:35:49PM -0800, William Tu wrote:
> On Tue, Nov 05, 2019 at 04:49:10PM -0800, Ben Pfaff wrote:
> > On Tue, Nov 05, 2019 at 02:57:46PM -0800, William Tu wrote:
> > > +    /* ODP_SUPPORT_FIELDS */
> > > +    str_value = xasprintf("%"PRIuSIZE, odp.max_vlan_headers);
> > > +    smap_add(cap, "max_vlan_headers", str_value);
> > > +    free(str_value);
> > 
> > I think that you can shorten the above to:
> >     smap_add_format(cap, "max_vlan_headers", "%"PRIuSIZE, odp.max_vlan_headers);
> > and similarly for other cases.
> 
> Yes, thank you.
> > 
> > I think that we can improve the documentation.  I'm working on a
> > suggestion for that, please give me a while to write it up.
> 
> Sure, I will wait.

Here's my suggestion, as an incremental:

-8<--------------------------cut here-------------------------->8--

diff --git a/lib/meta-flow.xml b/lib/meta-flow.xml
index d8763eb0b718..90b405c73750 100644
--- a/lib/meta-flow.xml
+++ b/lib/meta-flow.xml
@@ -2488,7 +2488,7 @@ actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
 
   <group title="Connection Tracking">
     <p>
-      Open vSwitch 2.5 and later support ``connection tracking,'' which allows
+      Open vSwitch supports ``connection tracking,'' which allows
       bidirectional streams of packets to be statefully grouped into
       connections.  Open vSwitch connection tracking, for example, identifies
       the patterns of TCP packets that indicates a successfully initiated
@@ -2524,7 +2524,14 @@ actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
     </p>
 
     <p>
-      Connection tracking is an Open vSwitch extension to OpenFlow.
+      Connection tracking is an Open vSwitch extension to OpenFlow.  Open
+      vSwitch 2.5 added the initial support for connection tracking.
+      Subsequent versions of Open vSwitch added many refinements and extensions
+      to the initial support.  Many of these capabilities depend on the Open
+      vSwitch datapath rather than simply the userspace version.  The
+      <code>capabilities</code> column in the <code>Datapath</code> table (see
+      <code>ovs-vswitchd.conf.db</code>(5)) reports the detailed capabilities
+      of a particular Open vSwitch datapath.
     </p>
 
     <field id="MFF_CT_STATE" title="Connection Tracking State">
@@ -2713,7 +2720,8 @@ actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
     </p>
 
     <p>
-      The following fields are populated by the ct action, and require a
+      The following fields are populated by the <code>ct</code>
+      action, and require a
       match to a valid connection tracking state as a prerequisite, in
       addition to the IP or IPv6 ethertype match.  Examples of valid
       connection tracking state matches include <code>ct_state=+new</code>,
diff --git a/manpages.mk b/manpages.mk
index 281ebd8fe946..09b0f9eccf4e 100644
--- a/manpages.mk
+++ b/manpages.mk
@@ -1,23 +1,5 @@
 # Generated automatically -- do not modify!    -*- buffer-read-only: t -*-
 
-ovn/utilities/ovn-sbctl.8: \
-	ovn/utilities/ovn-sbctl.8.in \
-	lib/common.man \
-	lib/db-ctl-base.man \
-	lib/ovs.tmac \
-	lib/ssl-bootstrap.man \
-	lib/ssl.man \
-	lib/table.man \
-	lib/vlog.man
-ovn/utilities/ovn-sbctl.8.in:
-lib/common.man:
-lib/db-ctl-base.man:
-lib/ovs.tmac:
-lib/ssl-bootstrap.man:
-lib/ssl.man:
-lib/table.man:
-lib/vlog.man:
-
 ovsdb/ovsdb-client.1: \
 	ovsdb/ovsdb-client.1.in \
 	lib/common-syn.man \
@@ -116,6 +98,12 @@ lib/vlog-syn.man:
 lib/vlog.man:
 ovsdb/ovsdb-schemas.man:
 
+utilities/bugtool/ovs-bugtool.8: \
+	utilities/bugtool/ovs-bugtool.8.in \
+	lib/ovs.tmac
+utilities/bugtool/ovs-bugtool.8.in:
+lib/ovs.tmac:
+
 utilities/ovs-appctl.8: \
 	utilities/ovs-appctl.8.in \
 	lib/common.man \
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 8d86a5c5016b..7998bd091810 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -5662,81 +5662,234 @@ ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
       connection tracking-related OpenFlow matches and actions).
     </column>
 
-    <column name="capabilities" key="max_vlan_headers"
-            type='{"type": "integer", "minInteger": 0}'>
-      Maximum number of 802.1q VLAN headers to serialize in a mask.
-    </column>
-    <column name="capabilities" key="max_mpls_depth"
-            type='{"type": "integer", "minInteger": 0}'>
-      Maximum number of MPLS label stack entries to serialise in a mask.
-    </column>
-    <column name="capabilities" key="recirc" type='{"type": "boolean"}'>
-      If this is true, then recirculation fields will always be serialised.
-    </column>
-    <column name="capabilities" key="ct_state" type='{"type": "boolean"}'>
-      If true, datapath supports OVS_KEY_ATTR_CT_STATE.
-    </column>
-    <column name="capabilities" key="ct_zone" type='{"type": "boolean"}'>
-      If true, datapath supports OVS_KEY_ATTR_CT_ZONE.
-    </column>
-    <column name="capabilities" key="ct_mark" type='{"type": "boolean"}'>
-      If true, datapath supports OVS_KEY_ATTR_CT_MARK.
-    </column>
-    <column name="capabilities" key="ct_label" type='{"type": "boolean"}'>
-      If true, datapath supports OVS_KEY_ATTR_CT_LABEL.
-    </column>
-    <column name="capabilities" key="ct_state_nat" type='{"type": "boolean"}'>
-      If true, it means that the datapath supports the NAT bits in
-      'ct_state'.  The above 'ct_state' member must be true for this
-      to make sense.
-    </column>
-    <column name="capabilities" key="ct_orig_tuple"
-            type='{"type": "boolean"}'>
-      Conntrack original direction tuple matching * supported.
-    </column>
-    <column name="capabilities" key="ct_orig_tuple6"
-            type='{"type": "boolean"}'>
-      Conntrack original direction tuple6 matching * supported.
-    </column>
-    <column name="capabilities" key="masked_set_action"
-            type='{"type": "boolean"}'>
-      True if the datapath supports masked data in
-      OVS_ACTION_ATTR_SET actions.
-    </column>
-    <column name="capabilities" key="tnl_push_pop" type='{"type": "boolean"}'>
-      True if the datapath supports tnl_push and pop actions.
-    </column>
-    <column name="capabilities" key="ufid" type='{"type": "boolean"}'>
-      True if the datapath supports OVS_FLOW_ATTR_UFID.
-    </column>
-    <column name="capabilities" key="trunc" type='{"type": "boolean"}'>
-      True if the datapath supports OVS_ACTION_ATTR_TRUNC action.
-    </column>
-    <column name="capabilities" key="clone" type='{"type": "boolean"}'>
-      True if the datapath supports OVS_ACTION_ATTR_CLONE action.
-    </column>
-    <column name="capabilities" key="sample_nesting"
-            type='{"type": "integer", "minInteger": 0}'>
-      Maximum level of nesting allowed by OVS_ACTION_ATTR_SAMPLE action.
-    </column>
-    <column name="capabilities" key="ct_eventmask" type='{"type": "boolean"}'>
-      OVS_CT_ATTR_EVENTMASK supported by OVS_ACTION_ATTR_CT action.
-    </column>
-    <column name="capabilities" key="ct_clear" type='{"type": "boolean"}'>
-      True if the datapath supports OVS_ACTION_ATTR_CT_CLEAR action.
-    </column>
-    <column name="capabilities" key="max_hash_alg"
-            type='{"type": "integer", "minInteger": 0}'>
-      Highest supported dp_hash algorithm.
-    </column>
-    <column name="capabilities" key="check_pkt_len"
-            type='{"type": "boolean"}'>
-      True if the datapath supports OVS_ACTION_ATTR_CHECK_PKT_LEN.
-    </column>
-    <column name="capabilities" key="ct_timeout" type='{"type": "boolean"}'>
-      True if the datapath supports OVS_CT_ATTR_TIMEOUT in
-      OVS_ACTION_ATTR_CTaction.
-    </column>
+    <group title="Capabilities">
+      <p>
+        The <ref column="capabilities"/> column reports a datapath's
+        features.  For the <code>netdev</code> datapath, the
+        capabilities are fixed for a given version of Open vSwitch
+        because this datapath is built into the
+        <code>ovs-vswitchd</code> binary.  The Linux kernel and
+        Windows and other datapaths, which are external to OVS
+        userspace, can vary in version and capabilities independently
+        from <code>ovs-vswitchd</code>.
+      </p>
+
+      <p>
+	Some of these features indicate whether higher-level Open vSwitch
+	features are available.  For example, OpenFlow features for
+	connection-tracking are available only when <ref column="capabilities"
+	key="ct_state"/> is <code>true</code>.  A controller that wishes to
+	determine whether a feature is supported could, therefore, consult the
+	relevant capabilities in this table.  However, as a general rule, it is
+	better for a controller to try to use the higher-level feature and use
+	the result as an indication of support, since the low-level
+	capabilities are more likely to shift over time than the high-level
+	features that rely on them.
+      </p>
+
+      <column name="capabilities" key="max_vlan_headers"
+              type='{"type": "integer", "minInteger": 0}'>
+        Number of 802.1q VLAN headers supported by the datapath, as probed by
+        the <code>ovs-vswitchd</code> slow path.  If the datapath supports more
+        VLAN headers than the slow path, this reports the slow path's limit.
+        The value of <ref column="other-config" key="vlan-limit"/> in the <ref
+        table="Open_vSwitch"/> table does not influence the number reported
+        here.
+      </column>
+      <column name="capabilities" key="recirc" type='{"type": "boolean"}'>
+        If this is true, then the datapath supports recirculation,
+        specifically OVS_KEY_ATTR_RECIRC_ID.  Recirculation enables
+        higher performance for MPLS and active-active load balancing
+        bonding modes.
+      </column>
+      <group title="Connection-Tracking Capabilities">
+        <p>
+          These capabilities are granular because Open vSwitch and its
+          datapaths added support for connection tracking over several
+          releases, with features added individually over that time.
+        </p>
+
+        <column name="capabilities" key="ct_state" type='{"type": "boolean"}'>
+          <p>
+            If true, datapath supports OVS_KEY_ATTR_CT_STATE, which indicates
+            support for the bits in the OpenFlow <code>ct_state</code> field
+            (see <code>ovs-fields</code>(7)) other than <code>snat</code> and
+            <code>dnat</code>, which have a separate capability.
+          </p>
+
+          <p>
+            If this is false, the datapath does not support connection-tracking
+            at all and the remaining connection-tracking capabilities should
+            all be false.  In this case, Open vSwitch will reject flows that
+            match on the <code>ct_state</code> field or use the <code>ct</code>
+            action.
+          </p>
+        </column>
+        <column name="capabilities" key="ct_state_nat" type='{"type": "boolean"}'>
+	  <p>
+            If true, it means that the datapath supports the <code>snat</code>
+            and <code>dnat</code> flags in the OpenFlow <code>ct_state</code>
+            field.  The <code>ct_state</code> capability must be true for this to
+            make sense.
+	  </p>
+
+          <p>
+            If false, Open vSwitch will reject flows that match on the
+            <code>snat</code> or <code>dnat</code> bits in
+            <code>ct_state</code> or use <code>nat</code> in the
+            <code>ct</code> action.
+          </p>
+        </column>
+        <column name="capabilities" key="ct_zone" type='{"type": "boolean"}'>
+          If true, datapath supports OVS_KEY_ATTR_CT_ZONE.  If false, Open
+          vSwitch rejects flows that match on the <code>ct_zone</code> field or
+          that specify a nonzero zone or a zone field on the <code>ct</code>
+          action.
+        </column>
+        <column name="capabilities" key="ct_mark" type='{"type": "boolean"}'>
+          If true, datapath supports OVS_KEY_ATTR_CT_MARK.  If false, Open
+          vSwitch rejects flows that match on the <code>ct_mark</code> field or
+          that set <code>ct_mark</code> in the <code>ct</code> action.
+        </column>
+        <column name="capabilities" key="ct_label" type='{"type": "boolean"}'>
+          If true, datapath supports OVS_KEY_ATTR_CT_LABEL.  If false, Open
+          vSwitch rejects flows that match on the <code>ct_label</code> field or
+          that set <code>ct_label</code> in the <code>ct</code> action.
+        </column>
+        <column name="capabilities" key="ct_orig_tuple"
+                type='{"type": "boolean"}'>
+          <p>
+            If true, the datapath supports matching the 5-tuple from the
+            connection's original direction for IPv4 traffic.  If false, Open
+            vSwitch rejects flows that match on <code>ct_nw_src</code> or
+            <code>ct_nw_dst</code>, that use the <code>ct</code> feature of the
+            <code>resubmit</code> action, or the <code>force</code> keyword in
+            the <code>ct</code> action.  (The latter isn't tied to connection
+            tracking support of original tuples in any technical way.  They are
+            conflated because all current datapaths implemented the two
+            features at the same time.)
+          </p>
+
+          <p>
+            If this and <ref column="capabilities" key="ct_orig_tuple6"/> are
+            both false, Open vSwitch rejects flows that match on
+            <code>ct_nw_proto</code>, <code>ct_tp_src</code>, or
+            <code>ct_tp_dst</code>.
+          </p>
+        </column>
+        <column name="capabilities" key="ct_orig_tuple6"
+                type='{"type": "boolean"}'>
+          If true, the datapath supports matching the 5-tuple from the
+          connection's original direction for IPv6 traffic.  If false, Open
+          vSwitch rejects flows that match on <code>ct_ipv6_src</code> or
+          <code>ct_ipv6_dst</code>.
+        </column>
+      </group>
+      <column name="capabilities" key="masked_set_action"
+              type='{"type": "boolean"}'>
+        True if the datapath supports masked data in OVS_ACTION_ATTR_SET
+        actions.  Masked data can improve performance by allowing megaflows to
+        match on fewer fields.
+      </column>
+      <column name="capabilities" key="tnl_push_pop" type='{"type": "boolean"}'>
+        True if the datapath supports tnl_push and pop actions.  This is a
+        prerequisite for a datapath to support native tunneling.
+      </column>
+      <column name="capabilities" key="ufid" type='{"type": "boolean"}'>
+        True if the datapath supports OVS_FLOW_ATTR_UFID.  UFID support
+        improves revalidation performance by transferring less data between
+        the slow path and the datapath.
+      </column>
+      <column name="capabilities" key="trunc" type='{"type": "boolean"}'>
+        True if the datapath supports OVS_ACTION_ATTR_TRUNC action.  If false,
+        the <code>output</code> action with packet truncation requires every
+        packet to be sent to the Open vSwitch slow path, which is likely to
+        make it too slow for mirroring traffic in bulk.
+      </column>
+      <group title="Clone Actions">
+        <p>
+          When Open vSwitch translates actions from OpenFlow into the datapath
+          representation, some of the datapath actions may modify the packet or
+          have other side effects that later datapath actions can't undo.  The
+          OpenFlow <code>ct</code>, <code>meter</code>, <code>output</code>
+          with truncation, <code>encap</code>, <code>decap</code>, and
+          <code>dec_nsh_ttl</code> actions fall into this category.  Often,
+          this is not a problem because nothing later on needs the original
+          packet.
+        </p>
+
+        <p>
+          Such actions can, however, occur in circumstances where the
+          translation does require the original packet.  For example, an
+          OpenFlow <code>output</code> action might direct a packet to a patch
+          port, which might in turn lead to a <code>ct</code> action that NATs
+          the packet (which cannot be undone), and then afterward when control
+          flow pops back across the patch port some other action might need to
+          act on the original packet.
+        </p>
+
+        <p>
+          Open vSwitch has two different ways to implement this ``save and
+          restore'' via datapath actions.  These capabilities indicate which
+          one Open vSwitch will choose.  When neither is available, Open
+          vSwitch simply fails in situations that require this feature.
+        </p>
+
+        <column name="capabilities" key="clone" type='{"type": "boolean"}'>
+          <p>
+            True if the datapath supports OVS_ACTION_ATTR_CLONE action.  This
+            is the preferred option for saving and restoring packets, since it
+            is intended for the purpose, but old datapaths do not support it.
+            Open vSwitch will use it whenever it is available.
+          </p>
+
+          <p>
+            (The OpenFlow <code>clone</code> action does not always yield a
+            OVS_ACTION_ATTR_CLONE action.  It only does so when the datapath
+            supports it and the <code>clone</code> brackets actions that
+            otherwise cannot be undone.)
+          </p>
+        </column>
+        <column name="capabilities" key="sample_nesting"
+                type='{"type": "integer", "minInteger": 0}'>
+          Maximum level of nesting allowed by OVS_ACTION_ATTR_SAMPLE action.
+          Open vSwitch misuses this action for saving and restoring packets
+          when the datapath supports more than 3 levels of nesting and
+          OVS_ACTION_ATTR_CLONE is not available.
+        </column>
+      </group>
+      <column name="capabilities" key="ct_eventmask" type='{"type": "boolean"}'>
+        True if the datapath's OVS_ACTION_ATTR_CT action implements the
+        OVS_CT_ATTR_EVENTMASK attribute.  When this is true, Open vSwitch uses
+        the event mask feature to limit the kinds of events reported to
+        conntrack update listeners.  When Open vSwitch doesn't limit the event
+        mask, listeners receive reports of numerous usually unimportant events,
+        such as TCP state machine changes, which can waste CPU time.
+      </column>
+      <column name="capabilities" key="ct_clear" type='{"type": "boolean"}'>
+        True if the datapath supports OVS_ACTION_ATTR_CT_CLEAR action.
+        If false, the OpenFlow <code>ct_clear</code> action has no effect
+        on the datapath.
+      </column>
+      <column name="capabilities" key="max_hash_alg"
+              type='{"type": "integer", "minInteger": 0}'>
+        Highest supported dp_hash algorithm.  This allows Open vSwitch to avoid
+        requesting a packet hash that the datapath does not support.
+      </column>
+      <column name="capabilities" key="check_pkt_len"
+              type='{"type": "boolean"}'>
+        True if the datapath supports OVS_ACTION_ATTR_CHECK_PKT_LEN.  If false,
+        Open vSwitch implements the <code>check_pkt_larger</code> action by
+        sending every packet through the Open vSwitch slow path, which is
+        likely to make it too slow for handling traffic in bulk.
+      </column>
+      <column name="capabilities" key="ct_timeout" type='{"type": "boolean"}'>
+        True if the datapath supports OVS_CT_ATTR_TIMEOUT in the
+        OVS_ACTION_ATTR_CT action.  If false, Open vswitch cannot implement
+        timeout policies based on connection tracking zones, as configured
+        through the <code>CT_Timeout_Policy</code> table.
+      </column>
+    </group>
 
     <group title="Common Columns">
       The overall purpose of these columns is described under <code>Common
William Tu Nov. 21, 2019, 1:20 a.m. UTC | #4
On Wed, Nov 20, 2019 at 11:25:12AM -0800, Ben Pfaff wrote:
> On Wed, Nov 06, 2019 at 12:35:49PM -0800, William Tu wrote:
> > On Tue, Nov 05, 2019 at 04:49:10PM -0800, Ben Pfaff wrote:
> > > On Tue, Nov 05, 2019 at 02:57:46PM -0800, William Tu wrote:
> > > > +    /* ODP_SUPPORT_FIELDS */
> > > > +    str_value = xasprintf("%"PRIuSIZE, odp.max_vlan_headers);
> > > > +    smap_add(cap, "max_vlan_headers", str_value);
> > > > +    free(str_value);
> > > 
> > > I think that you can shorten the above to:
> > >     smap_add_format(cap, "max_vlan_headers", "%"PRIuSIZE, odp.max_vlan_headers);
> > > and similarly for other cases.
> > 
> > Yes, thank you.
> > > 
> > > I think that we can improve the documentation.  I'm working on a
> > > suggestion for that, please give me a while to write it up.
> > 
> > Sure, I will wait.
> 
> Here's my suggestion, as an incremental:

Thanks a lot for adding much better doc.
I will merge it into next version.
One comment inline below.

> 
> -8<--------------------------cut here-------------------------->8--
> 
> diff --git a/lib/meta-flow.xml b/lib/meta-flow.xml
> index d8763eb0b718..90b405c73750 100644
> --- a/lib/meta-flow.xml
> +++ b/lib/meta-flow.xml
> @@ -2488,7 +2488,7 @@ actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
>  
>    <group title="Connection Tracking">
>      <p>
> -      Open vSwitch 2.5 and later support ``connection tracking,'' which allows
> +      Open vSwitch supports ``connection tracking,'' which allows
>        bidirectional streams of packets to be statefully grouped into
>        connections.  Open vSwitch connection tracking, for example, identifies
>        the patterns of TCP packets that indicates a successfully initiated
> @@ -2524,7 +2524,14 @@ actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
>      </p>
>  
>      <p>
> -      Connection tracking is an Open vSwitch extension to OpenFlow.
> +      Connection tracking is an Open vSwitch extension to OpenFlow.  Open
> +      vSwitch 2.5 added the initial support for connection tracking.
> +      Subsequent versions of Open vSwitch added many refinements and extensions
> +      to the initial support.  Many of these capabilities depend on the Open
> +      vSwitch datapath rather than simply the userspace version.  The
> +      <code>capabilities</code> column in the <code>Datapath</code> table (see
> +      <code>ovs-vswitchd.conf.db</code>(5)) reports the detailed capabilities
> +      of a particular Open vSwitch datapath.
>      </p>
>  
>      <field id="MFF_CT_STATE" title="Connection Tracking State">
> @@ -2713,7 +2720,8 @@ actions=clone(load:0->NXM_OF_IN_PORT[],output:123)
>      </p>
>  
>      <p>
> -      The following fields are populated by the ct action, and require a
> +      The following fields are populated by the <code>ct</code>
> +      action, and require a
>        match to a valid connection tracking state as a prerequisite, in
>        addition to the IP or IPv6 ethertype match.  Examples of valid
>        connection tracking state matches include <code>ct_state=+new</code>,
> diff --git a/manpages.mk b/manpages.mk
> index 281ebd8fe946..09b0f9eccf4e 100644
> --- a/manpages.mk
> +++ b/manpages.mk
> @@ -1,23 +1,5 @@
>  # Generated automatically -- do not modify!    -*- buffer-read-only: t -*-
>  
> -ovn/utilities/ovn-sbctl.8: \
> -	ovn/utilities/ovn-sbctl.8.in \
> -	lib/common.man \
> -	lib/db-ctl-base.man \
> -	lib/ovs.tmac \
> -	lib/ssl-bootstrap.man \
> -	lib/ssl.man \
> -	lib/table.man \
> -	lib/vlog.man
> -ovn/utilities/ovn-sbctl.8.in:
> -lib/common.man:
> -lib/db-ctl-base.man:
> -lib/ovs.tmac:
> -lib/ssl-bootstrap.man:
> -lib/ssl.man:
> -lib/table.man:
> -lib/vlog.man:
> -
>  ovsdb/ovsdb-client.1: \
>  	ovsdb/ovsdb-client.1.in \
>  	lib/common-syn.man \
> @@ -116,6 +98,12 @@ lib/vlog-syn.man:
>  lib/vlog.man:
>  ovsdb/ovsdb-schemas.man:
>  
> +utilities/bugtool/ovs-bugtool.8: \
> +	utilities/bugtool/ovs-bugtool.8.in \
> +	lib/ovs.tmac
> +utilities/bugtool/ovs-bugtool.8.in:
> +lib/ovs.tmac:
> +
>  utilities/ovs-appctl.8: \
>  	utilities/ovs-appctl.8.in \
>  	lib/common.man \
I think the above is not related. I will remove the manpages.mk diff.
Rest below looks good to me.
--William
Ben Pfaff Nov. 21, 2019, 2 a.m. UTC | #5
On Wed, Nov 20, 2019 at 05:20:35PM -0800, William Tu wrote:
> I think the above is not related. I will remove the manpages.mk diff.
> Rest below looks good to me.

Yes, thanks!
William Tu Nov. 21, 2019, 5:18 p.m. UTC | #6
On Wed, Nov 20, 2019 at 06:00:44PM -0800, Ben Pfaff wrote:
> On Wed, Nov 20, 2019 at 05:20:35PM -0800, William Tu wrote:
> > I think the above is not related. I will remove the manpages.mk diff.
> > Rest below looks good to me.
> 
> Yes, thanks!

I fixed a couple minor issues (tab and lines > 79 char)
and applied to master.
Thank you!
William Tu Nov. 21, 2019, 5:38 p.m. UTC | #7
On Wed, Nov 20, 2019 at 05:20:35PM -0800, William Tu wrote:
> On Wed, Nov 20, 2019 at 11:25:12AM -0800, Ben Pfaff wrote:
> > On Wed, Nov 06, 2019 at 12:35:49PM -0800, William Tu wrote:
> > > On Tue, Nov 05, 2019 at 04:49:10PM -0800, Ben Pfaff wrote:
> > > > On Tue, Nov 05, 2019 at 02:57:46PM -0800, William Tu wrote:
> > > > > +    /* ODP_SUPPORT_FIELDS */
> > > > > +    str_value = xasprintf("%"PRIuSIZE, odp.max_vlan_headers);
> > > > > +    smap_add(cap, "max_vlan_headers", str_value);
> > > > > +    free(str_value);
> > > > 
> > > > I think that you can shorten the above to:
> > > >     smap_add_format(cap, "max_vlan_headers", "%"PRIuSIZE, odp.max_vlan_headers);
> > > > and similarly for other cases.

Hi Ben,

I forgot to add above fixes.
I will send out another patch, sorry about that.

William
diff mbox series

Patch

diff --git a/ofproto/ofproto-dpif.c b/ofproto/ofproto-dpif.c
index c35ec3e61592..fa73d06a82f4 100644
--- a/ofproto/ofproto-dpif.c
+++ b/ofproto/ofproto-dpif.c
@@ -5444,6 +5444,56 @@  ct_del_zone_timeout_policy(const char *datapath_type, uint16_t zone_id)
     }
 }
 
+static void
+get_datapath_cap(const char *datapath_type, struct smap *cap)
+{
+    char *str_value;
+    struct odp_support odp;
+    struct dpif_backer_support s;
+    struct dpif_backer *backer = shash_find_data(&all_dpif_backers,
+                                                 datapath_type);
+    if (!backer) {
+        return;
+    }
+    s = backer->rt_support;
+    odp = s.odp;
+
+    /* ODP_SUPPORT_FIELDS */
+    str_value = xasprintf("%"PRIuSIZE, odp.max_vlan_headers);
+    smap_add(cap, "max_vlan_headers", str_value);
+    free(str_value);
+
+    str_value = xasprintf("%"PRIuSIZE, odp.max_mpls_depth);
+    smap_add(cap, "max_mpls_depth", str_value);
+    free(str_value);
+
+    smap_add(cap, "recirc", odp.recirc ? "true" : "false");
+    smap_add(cap, "ct_state", odp.ct_state ? "true" : "false");
+    smap_add(cap, "ct_zone", odp.ct_zone ? "true" : "false");
+    smap_add(cap, "ct_mark", odp.ct_mark ? "true" : "false");
+    smap_add(cap, "ct_label", odp.ct_label ? "true" : "false");
+    smap_add(cap, "ct_state_nat", odp.ct_state_nat ? "true" : "false");
+    smap_add(cap, "ct_orig_tuple", odp.ct_orig_tuple ? "true" : "false");
+    smap_add(cap, "ct_orig_tuple6", odp.ct_orig_tuple6 ? "true" : "false");
+
+    /* DPIF_SUPPORT_FIELDS */
+    smap_add(cap, "masked_set_action", s.masked_set_action ? "true" : "false");
+    smap_add(cap, "tnl_push_pop", s.tnl_push_pop ? "true" : "false");
+    smap_add(cap, "ufid", s.ufid ? "true" : "false");
+    smap_add(cap, "trunc", s.trunc ? "true" : "false");
+    smap_add(cap, "clone", s.clone ? "true" : "false");
+    smap_add(cap, "sample_nesting", s.sample_nesting ? "true" : "false");
+    smap_add(cap, "ct_eventmask", s.ct_eventmask ? "true" : "false");
+    smap_add(cap, "ct_clear", s.ct_clear ? "true" : "false");
+
+    str_value = xasprintf("%"PRIuSIZE, s.max_hash_alg);
+    smap_add(cap, "max_hash_alg", str_value);
+    free(str_value);
+
+    smap_add(cap, "check_pkt_len", s.check_pkt_len ? "true" : "false");
+    smap_add(cap, "ct_timeout", s.ct_timeout ? "true" : "false");
+}
+
 /* Gets timeout policy name in 'backer' based on 'zone', 'dl_type' and
  * 'nw_proto'.  Returns true if the zone-based timeout policy is configured.
  * On success, stores the timeout policy name in 'tp_name', and sets
@@ -6585,4 +6635,5 @@  const struct ofproto_class ofproto_dpif_class = {
     ct_flush,                   /* ct_flush */
     ct_set_zone_timeout_policy,
     ct_del_zone_timeout_policy,
+    get_datapath_cap,
 };
diff --git a/ofproto/ofproto-provider.h b/ofproto/ofproto-provider.h
index c980e6bffff5..80c4fee06d01 100644
--- a/ofproto/ofproto-provider.h
+++ b/ofproto/ofproto-provider.h
@@ -1889,6 +1889,8 @@  struct ofproto_class {
     /* Deletes the timeout policy associated with 'zone' in datapath type
      * 'dp_type'. */
     void (*ct_del_zone_timeout_policy)(const char *dp_type, uint16_t zone);
+    /* Get the datapath's capabilities. */
+    void (*get_datapath_cap)(const char *dp_type, struct smap *caps);
 };
 
 extern const struct ofproto_class ofproto_dpif_class;
diff --git a/ofproto/ofproto.c b/ofproto/ofproto.c
index 3aaa45a9b3af..7535ba176b74 100644
--- a/ofproto/ofproto.c
+++ b/ofproto/ofproto.c
@@ -954,6 +954,18 @@  ofproto_get_flow_restore_wait(void)
     return flow_restore_wait;
 }
 
+/* Retrieve datapath capabilities. */
+void
+ofproto_get_datapath_cap(const char *datapath_type, struct smap *dp_cap)
+{
+    datapath_type = ofproto_normalize_type(datapath_type);
+    const struct ofproto_class *class = ofproto_class_find__(datapath_type);
+
+    if (class->get_datapath_cap) {
+        class->get_datapath_cap(datapath_type, dp_cap);
+    }
+}
+
 /* Connection tracking configuration. */
 void
 ofproto_ct_set_zone_timeout_policy(const char *datapath_type, uint16_t zone_id,
diff --git a/ofproto/ofproto.h b/ofproto/ofproto.h
index 033c4cf93e94..48a9d602c214 100644
--- a/ofproto/ofproto.h
+++ b/ofproto/ofproto.h
@@ -372,6 +372,8 @@  void ofproto_ct_set_zone_timeout_policy(const char *datapath_type,
                                         struct simap *timeout_policy);
 void ofproto_ct_del_zone_timeout_policy(const char *datapath_type,
                                         uint16_t zone);
+void ofproto_get_datapath_cap(const char *datapath_type,
+                              struct smap *dp_cap);
 
 /* Configuration of ports. */
 void ofproto_port_unregister(struct ofproto *, ofp_port_t ofp_port);
diff --git a/tests/ovs-vsctl.at b/tests/ovs-vsctl.at
index 4d55f89e4ae3..18b9cd7c89a5 100644
--- a/tests/ovs-vsctl.at
+++ b/tests/ovs-vsctl.at
@@ -819,6 +819,10 @@  AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=1])])
 AT_CHECK([RUN_OVS_VSCTL([--if-exists del-zone-tp netdev zone=1])])
 AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout Policies: icmp_first=2 icmp_reply=3
 ])
+
+AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 'capabilities={recirc=true}' -- set Open_vSwitch . datapaths:"system"=@m])], [0], [stdout])
+AT_CHECK([RUN_OVS_VSCTL([list-dp-cap system])], [0], [recirc=true
+])
 OVS_VSCTL_CLEANUP
 AT_CLEANUP
 
@@ -962,6 +966,12 @@  AT_CHECK([RUN_OVS_VSCTL([del-zone-tp netdev zone=11])],
 ])
 AT_CHECK([RUN_OVS_VSCTL([list-zone-tp netdev])], [0], [Zone:2, Timeout Policies: icmp_first=2 icmp_reply=3
 ])
+
+AT_CHECK([RUN_OVS_VSCTL([-- --id=@m create Datapath datapath_version=0 'capabilities={recirc=true}' -- set Open_vSwitch . datapaths:"system"=@m])], [0], [stdout])
+AT_CHECK([RUN_OVS_VSCTL([list-dp-cap nosystem])],
+  [1], [], [ovs-vsctl: datapath "nosystem" record not found
+])
+
 OVS_VSCTL_CLEANUP
 AT_CLEANUP
 
diff --git a/utilities/ovs-vsctl.8.in b/utilities/ovs-vsctl.8.in
index 14a8aa4a48ac..ff97922dd0c6 100644
--- a/utilities/ovs-vsctl.8.in
+++ b/utilities/ovs-vsctl.8.in
@@ -379,6 +379,12 @@  delete a zone that does not exist has no effect.
 .IP "\fBlist\-zone\-tp \fIdatapath\fR"
 Prints the timeout policies of all zones in \fIdatapath\fR.
 .
+.SS "Datapath Capabilities Command"
+The command query datapath capabilities.
+.
+.IP "\fBlist\-dp\-cap \fIdatapath\fR"
+Prints the datapath's capabilities.
+.
 .SS "OpenFlow Controller Connectivity"
 .
 \fBovs\-vswitchd\fR can perform all configured bridging and switching
diff --git a/utilities/ovs-vsctl.c b/utilities/ovs-vsctl.c
index 7232471e68b9..bd3972636e66 100644
--- a/utilities/ovs-vsctl.c
+++ b/utilities/ovs-vsctl.c
@@ -1363,6 +1363,31 @@  pre_get_zone(struct ctl_context *ctx)
 }
 
 static void
+pre_get_dp_cap(struct ctl_context *ctx)
+{
+    ovsdb_idl_add_column(ctx->idl, &ovsrec_open_vswitch_col_datapaths);
+    ovsdb_idl_add_column(ctx->idl, &ovsrec_datapath_col_capabilities);
+}
+
+static void
+cmd_list_dp_cap(struct ctl_context *ctx)
+{
+    struct vsctl_context *vsctl_ctx = vsctl_context_cast(ctx);
+    struct smap_node *node;
+
+    struct ovsrec_datapath *dp = find_datapath(vsctl_ctx, ctx->argv[1]);
+    if (!dp) {
+        ctl_fatal("datapath \"%s\" record not found", ctx->argv[1]);
+    }
+
+    SMAP_FOR_EACH (node, &dp->capabilities) {
+        ds_put_format(&ctx->output, "%s=%s ",node->key, node->value);
+    }
+    ds_chomp(&ctx->output, ' ');
+    ds_put_char(&ctx->output, '\n');
+}
+
+static void
 cmd_add_br(struct ctl_context *ctx)
 {
     struct vsctl_context *vsctl_ctx = vsctl_context_cast(ctx);
@@ -3112,6 +3137,9 @@  static const struct ctl_command_syntax vsctl_commands[] = {
      "--if-exists", RW},
     {"list-zone-tp", 1, 1, "", pre_get_zone, cmd_list_zone_tp, NULL, "", RO},
 
+    /* Datapath capabilities. */
+    {"list-dp-cap", 1, 1, "", pre_get_dp_cap, cmd_list_dp_cap, NULL, "", RO},
+
     {NULL, 0, 0, NULL, NULL, NULL, NULL, NULL, RO},
 };
 
diff --git a/vswitchd/bridge.c b/vswitchd/bridge.c
index 9095ebf5d5e9..d402e39bdc68 100644
--- a/vswitchd/bridge.c
+++ b/vswitchd/bridge.c
@@ -171,6 +171,7 @@  struct datapath {
     struct hmap ct_zones;       /* Map of 'struct ct_zone' elements, indexed
                                  * by 'zone'. */
     struct hmap_node node;      /* Node in 'all_datapaths' hmap. */
+    struct smap caps;           /* Capabilities. */
     unsigned int last_used;     /* The last idl_seqno that this 'datapath'
                                  * used in OVSDB. This number is used for
                                  * garbage collection. */
@@ -700,6 +701,7 @@  datapath_create(const char *type)
     dp->type = xstrdup(type);
     hmap_init(&dp->ct_zones);
     hmap_insert(&all_datapaths, &dp->node, hash_string(type, 0));
+    smap_init(&dp->caps);
     return dp;
 }
 
@@ -716,6 +718,7 @@  datapath_destroy(struct datapath *dp)
         hmap_remove(&all_datapaths, &dp->node);
         hmap_destroy(&dp->ct_zones);
         free(dp->type);
+        smap_destroy(&dp->caps);
         free(dp);
     }
 }
@@ -759,6 +762,23 @@  ct_zones_reconfigure(struct datapath *dp, struct ovsrec_datapath *dp_cfg)
 }
 
 static void
+dp_capability_reconfigure(struct datapath *dp,
+                          struct ovsrec_datapath *dp_cfg)
+{
+    struct smap_node *node;
+    struct smap cap;
+
+    smap_init(&cap);
+    ofproto_get_datapath_cap(dp->type, &cap);
+
+    SMAP_FOR_EACH (node, &cap) {
+        ovsrec_datapath_update_capabilities_setkey(dp_cfg, node->key,
+                                                   node->value);
+    }
+    smap_destroy(&cap);
+}
+
+static void
 datapath_reconfigure(const struct ovsrec_open_vswitch *cfg)
 {
     struct datapath *dp, *next;
@@ -771,6 +791,7 @@  datapath_reconfigure(const struct ovsrec_open_vswitch *cfg)
         dp = datapath_lookup(dp_name);
         if (!dp) {
             dp = datapath_create(dp_name);
+            dp_capability_reconfigure(dp, dp_cfg);
         }
         dp->last_used = idl_seqno;
         ct_zones_reconfigure(dp, dp_cfg);
diff --git a/vswitchd/vswitch.ovsschema b/vswitchd/vswitch.ovsschema
index 02be5ddeec97..0666c8c76446 100644
--- a/vswitchd/vswitch.ovsschema
+++ b/vswitchd/vswitch.ovsschema
@@ -1,6 +1,6 @@ 
 {"name": "Open_vSwitch",
  "version": "8.2.0",
- "cksum": "4076590391 26298",
+ "cksum": "1076640191 26427",
  "tables": {
    "Open_vSwitch": {
      "columns": {
@@ -650,6 +650,9 @@ 
                   "value": {"type": "uuid",
                             "refTable": "CT_Zone"},
                   "min": 0, "max": "unlimited"}},
+       "capabilities": {
+         "type": {"key": "string", "value": "string",
+                  "min": 0, "max": "unlimited"}},
        "external_ids": {
          "type": {"key": "string", "value": "string",
                   "min": 0, "max": "unlimited"}}}},
diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml
index 00c6bd2d4ac2..8d86a5c5016b 100644
--- a/vswitchd/vswitch.xml
+++ b/vswitchd/vswitch.xml
@@ -5662,6 +5662,82 @@  ovs-vsctl add-port br0 p0 -- set Interface p0 type=patch options:peer=p1 \
       connection tracking-related OpenFlow matches and actions).
     </column>
 
+    <column name="capabilities" key="max_vlan_headers"
+            type='{"type": "integer", "minInteger": 0}'>
+      Maximum number of 802.1q VLAN headers to serialize in a mask.
+    </column>
+    <column name="capabilities" key="max_mpls_depth"
+            type='{"type": "integer", "minInteger": 0}'>
+      Maximum number of MPLS label stack entries to serialise in a mask.
+    </column>
+    <column name="capabilities" key="recirc" type='{"type": "boolean"}'>
+      If this is true, then recirculation fields will always be serialised.
+    </column>
+    <column name="capabilities" key="ct_state" type='{"type": "boolean"}'>
+      If true, datapath supports OVS_KEY_ATTR_CT_STATE.
+    </column>
+    <column name="capabilities" key="ct_zone" type='{"type": "boolean"}'>
+      If true, datapath supports OVS_KEY_ATTR_CT_ZONE.
+    </column>
+    <column name="capabilities" key="ct_mark" type='{"type": "boolean"}'>
+      If true, datapath supports OVS_KEY_ATTR_CT_MARK.
+    </column>
+    <column name="capabilities" key="ct_label" type='{"type": "boolean"}'>
+      If true, datapath supports OVS_KEY_ATTR_CT_LABEL.
+    </column>
+    <column name="capabilities" key="ct_state_nat" type='{"type": "boolean"}'>
+      If true, it means that the datapath supports the NAT bits in
+      'ct_state'.  The above 'ct_state' member must be true for this
+      to make sense.
+    </column>
+    <column name="capabilities" key="ct_orig_tuple"
+            type='{"type": "boolean"}'>
+      Conntrack original direction tuple matching * supported.
+    </column>
+    <column name="capabilities" key="ct_orig_tuple6"
+            type='{"type": "boolean"}'>
+      Conntrack original direction tuple6 matching * supported.
+    </column>
+    <column name="capabilities" key="masked_set_action"
+            type='{"type": "boolean"}'>
+      True if the datapath supports masked data in
+      OVS_ACTION_ATTR_SET actions.
+    </column>
+    <column name="capabilities" key="tnl_push_pop" type='{"type": "boolean"}'>
+      True if the datapath supports tnl_push and pop actions.
+    </column>
+    <column name="capabilities" key="ufid" type='{"type": "boolean"}'>
+      True if the datapath supports OVS_FLOW_ATTR_UFID.
+    </column>
+    <column name="capabilities" key="trunc" type='{"type": "boolean"}'>
+      True if the datapath supports OVS_ACTION_ATTR_TRUNC action.
+    </column>
+    <column name="capabilities" key="clone" type='{"type": "boolean"}'>
+      True if the datapath supports OVS_ACTION_ATTR_CLONE action.
+    </column>
+    <column name="capabilities" key="sample_nesting"
+            type='{"type": "integer", "minInteger": 0}'>
+      Maximum level of nesting allowed by OVS_ACTION_ATTR_SAMPLE action.
+    </column>
+    <column name="capabilities" key="ct_eventmask" type='{"type": "boolean"}'>
+      OVS_CT_ATTR_EVENTMASK supported by OVS_ACTION_ATTR_CT action.
+    </column>
+    <column name="capabilities" key="ct_clear" type='{"type": "boolean"}'>
+      True if the datapath supports OVS_ACTION_ATTR_CT_CLEAR action.
+    </column>
+    <column name="capabilities" key="max_hash_alg"
+            type='{"type": "integer", "minInteger": 0}'>
+      Highest supported dp_hash algorithm.
+    </column>
+    <column name="capabilities" key="check_pkt_len"
+            type='{"type": "boolean"}'>
+      True if the datapath supports OVS_ACTION_ATTR_CHECK_PKT_LEN.
+    </column>
+    <column name="capabilities" key="ct_timeout" type='{"type": "boolean"}'>
+      True if the datapath supports OVS_CT_ATTR_TIMEOUT in
+      OVS_ACTION_ATTR_CTaction.
+    </column>
+
     <group title="Common Columns">
       The overall purpose of these columns is described under <code>Common
       Columns</code> at the beginning of this document.