mbox series

[ovs-dev,v4,0/9] Support zone-based conntrack timeout policy

Message ID 1565897480-120133-1-git-send-email-yihung.wei@gmail.com
Headers show
Series Support zone-based conntrack timeout policy | expand

Message

Yi-Hung Wei Aug. 15, 2019, 7:31 p.m. UTC
This patch series enables zone-based conntrack timeout policy support in OVS.
Timeout policy is a set of timeout attributes that can be associated with a
connection when it is committed.  Then, the connection tracking system will
expire a connection based on its connection state.  For example, one use
case would be to extend the timeout of TCP connection in the established
state to avoid re-connect overhead. Other use case may be to shorten the
connection timeout so that the system can reclaim resources faster.
The idea of zone-based conntrack timeout policy is to group connections
with similar characteristics in a conntrack zone, and assign timeout policy
to the conntrack zone.  In this way, all the connections in that zone will share
the same timeout policy.

For zone-based timeout policy configuration, the association of conntrack
zone and conntrack timeout policy is defined per datapath in vswitchd ovsdb
schema.  User can program the database through ovs-vsctl or using ovsdb
protocol directly.  Once the zone-based timeout policy configuration is
in the database, vswitchd will read those configuration and organize it
in internal datapath structure, and push the timeout policy into datapath.
Currently, only the kernel datapath supports customized timeout policy.

When a packet is committed to connection tracking system, during flow
translation in ofproto-dpif-xlate, vswitchd will lookup the internal
data structure to figure out which timeout policy to associate with
the connection.  If timeout policy is not specified to the committed
zone, it defaults to the timeout policy in the default zone (zone 0).
If the timeout policy is not specified in the default zone, it defaults
to the system default timeouts.

Here are some more details about each patch
* p01: Introduce ovsdb schema for ct timeout policy.
* p02: ovs-vsctl commands to configure zone-based ct timeout policy.
* p03: Expose a utility functions.
* p04: dpif interface along with dpif-netlink implementation to support
       ct timeout policy.
* p05: Consume ct timeout policy configuration from ovsdb server,
       keep it in internal data structure, and push configuration to
       datapath.
* p06: Add utility function to help compare two simaps.
* p07-08: Kernel datapath support for the new ct action attribute.
* p09: Translate timeout policy in ofproto-dpif-xlate and system traffic test.

v3->v4:
* ofproto-dpif
    - Probe datapath for timeout policy support.
    - With the probing information only translate timeout policy when
      the datapath is supported.
    - Resolve the old kernel compatibility issue reported by Darrell.
* system-traffic
    - Simplify the testing script (diff from Darrell).
* Address various code changes as in the mailing list discussion.

v2->v3
* ovsdb schema
    - Fold in changes from Justin.
    - Make ct timeout policy key to be in a pre-defined set.
* ovs-vsctl
    - Bug fixes.
* ct-dpif
    - Fold in diff suggestion from Justin.
* bridge, ofproto-dpif
    - Restruct the ofproto and dpif layer support for zone based timeout
      policy.
* system traffic test
    - Fix bug reported by Darrell.
* Address review comments from Justin and Darrell.

v1->v2

* ovs-vsctl
    - Remove add-dp,del-dp,list-dp ovs-vsctl commands.
    - Add --may-exist and --if-exists to ovs-vsctl add-zone-tp command.
    - Improve ovs-vsctl test.
* ct-dpif, dpif-netlink
    - Remove support to change default timeout policy in the datapath.
    - Squash ct-dpif and dpif-netlink layer implementation altogether.
    - Address review comments from William.
* ofproto-dpif
    - Remove changes from datapath-config module to ofproto-dpif layer.
    - Maintain zone-based timeout policy in dpif-backer since this is
      per datapath type configuration. This will not break the OVS
      hierarchy as Ilya suggested.
    - Allocate timeout policy id using id_pool instead of idl_seqno
      as Darrell suggested.
    - Add a timeout policy sweep function that clean up unnecessary
      timeout policy regularly in the datapath.
* ofproto-dpif-xlate
    - Only translate ct timeout policy if it is a ct commit action
      in kernel datapath.
* system-traffic test
    - Update system traffic test with low level ovs-vsctl command.
    - Make system traffic test to be datapath type agnostic.
    - Improve system traffic test as Darrell suggested.
* Rebase to master

Ben Pfaff (1):
  simap: Add utility function to help compare two simaps.

Justin Pettit (1):
  ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.

William Tu (1):
  ovs-vsctl: Add conntrack zone commands.

Yi-Hung Wei (6):
  ct-dpif: Export ct_dpif_format_ipproto()
  ct-dpif, dpif-netlink: Add conntrack timeout policy support
  ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables
  datapath: compat: Backport nf_conntrack_timeout support
  datapath: Add support for conntrack timeout policy
  ofproto-dpif-xlate: Translate timeout policy in ct action

 Documentation/faq/releases.rst                     |   3 +-
 NEWS                                               |   1 +
 acinclude.m4                                       |   7 +
 datapath-windows/include/OvsDpInterfaceCtExt.h     | 114 +++++
 datapath-windows/ovsext/Netlink/NetlinkProto.h     |   8 +-
 datapath/conntrack.c                               |  30 +-
 datapath/linux/Modules.mk                          |   2 +
 datapath/linux/compat/include/linux/openvswitch.h  |   4 +
 .../include/net/netfilter/nf_conntrack_timeout.h   |  34 ++
 datapath/linux/compat/nf_conntrack_timeout.c       | 102 +++++
 include/windows/automake.mk                        |   1 +
 .../windows/linux/netfilter/nfnetlink_cttimeout.h  |   0
 lib/ct-dpif.c                                      | 116 ++++-
 lib/ct-dpif.h                                      |  60 +++
 lib/dpif-netdev.c                                  |  11 +
 lib/dpif-netlink.c                                 | 490 +++++++++++++++++++++
 lib/dpif-netlink.h                                 |   1 -
 lib/dpif-provider.h                                |  54 +++
 lib/netlink-conntrack.c                            | 301 +++++++++++++
 lib/netlink-conntrack.h                            |  27 +-
 lib/netlink-protocol.h                             |   8 +-
 lib/odp-util.c                                     |  29 +-
 lib/simap.c                                        |  15 +-
 lib/simap.h                                        |   1 +
 ofproto/ofproto-dpif-xlate.c                       |  23 +
 ofproto/ofproto-dpif.c                             | 383 ++++++++++++++++
 ofproto/ofproto-dpif.h                             |  22 +-
 ofproto/ofproto-provider.h                         |  10 +
 ofproto/ofproto.c                                  |  26 ++
 ofproto/ofproto.h                                  |   5 +
 tests/odp.at                                       |   1 +
 tests/ovs-vsctl.at                                 |  34 +-
 tests/system-kmod-macros.at                        |  20 +
 tests/system-traffic.at                            |  66 +++
 tests/system-userspace-macros.at                   |  19 +
 utilities/ovs-vsctl.8.in                           |  26 ++
 utilities/ovs-vsctl.c                              | 202 ++++++++-
 vswitchd/bridge.c                                  | 198 +++++++++
 vswitchd/vswitch.ovsschema                         |  51 ++-
 vswitchd/vswitch.xml                               | 275 ++++++++++--
 40 files changed, 2708 insertions(+), 72 deletions(-)
 create mode 100644 datapath/linux/compat/include/net/netfilter/nf_conntrack_timeout.h
 create mode 100644 datapath/linux/compat/nf_conntrack_timeout.c
 create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h

Comments

Darrell Ball Aug. 16, 2019, 1:17 a.m. UTC | #1
Thanks for the patches

I did some quick adhoc testing with 4.4.0-119 and 5.0.0-23.

Some results worth noting:

1/ Failing to delete timeout policies with non-zero refcounts results in
WARN logging

> 2019-08-16T00:21:32.240Z|00153|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.240Z|00154|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
> 2019-08-16T00:21:32.241Z|00155|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.241Z|00156|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
> 2019-08-16T00:21:32.243Z|00157|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.243Z|00158|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
> 2019-08-16T00:21:32.243Z|00159|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.243Z|00160|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy

Lets change to INFO level since this can legitimately and expectantly occur

Also eliminate one of the redundant logs - dpif-netlink or ofproto_dpif
> 2019-08-16T00:21:32.243Z|00159|dpif_netlink|WARN|failed to delete timeout
policy ovs_tp_0_udp4 (Device or resource busy)
> 2019-08-16T00:21:32.243Z|00160|ofproto_dpif|INFO|failed to delete timeout
policy id = 0 Device or resource busy
while still maintaining the full profile name (e,g, ovs_tp_0_udp4) in the
kernel.

2/ Also, the cleanup attempts are too aggressive
bridge_run() ... ->....->....type_run()->ct_zone_timeout_policy_sweep()
IIRC, we had a debounce in previous version(s); maybe I missed it in my
quick check of v4; I'll check more later.
Anyways, we need some debouncing.
Also the vlog rate limiting may need adjustment to throttle more.

Thanks Darrell

On Thu, Aug 15, 2019 at 12:31 PM Yi-Hung Wei <yihung.wei@gmail.com> wrote:

> This patch series enables zone-based conntrack timeout policy support in
> OVS.
> Timeout policy is a set of timeout attributes that can be associated with a
> connection when it is committed.  Then, the connection tracking system will
> expire a connection based on its connection state.  For example, one use
> case would be to extend the timeout of TCP connection in the established
> state to avoid re-connect overhead. Other use case may be to shorten the
> connection timeout so that the system can reclaim resources faster.
> The idea of zone-based conntrack timeout policy is to group connections
> with similar characteristics in a conntrack zone, and assign timeout policy
> to the conntrack zone.  In this way, all the connections in that zone will
> share
> the same timeout policy.
>
> For zone-based timeout policy configuration, the association of conntrack
> zone and conntrack timeout policy is defined per datapath in vswitchd ovsdb
> schema.  User can program the database through ovs-vsctl or using ovsdb
> protocol directly.  Once the zone-based timeout policy configuration is
> in the database, vswitchd will read those configuration and organize it
> in internal datapath structure, and push the timeout policy into datapath.
> Currently, only the kernel datapath supports customized timeout policy.
>
> When a packet is committed to connection tracking system, during flow
> translation in ofproto-dpif-xlate, vswitchd will lookup the internal
> data structure to figure out which timeout policy to associate with
> the connection.  If timeout policy is not specified to the committed
> zone, it defaults to the timeout policy in the default zone (zone 0).
> If the timeout policy is not specified in the default zone, it defaults
> to the system default timeouts.
>
> Here are some more details about each patch
> * p01: Introduce ovsdb schema for ct timeout policy.
> * p02: ovs-vsctl commands to configure zone-based ct timeout policy.
> * p03: Expose a utility functions.
> * p04: dpif interface along with dpif-netlink implementation to support
>        ct timeout policy.
> * p05: Consume ct timeout policy configuration from ovsdb server,
>        keep it in internal data structure, and push configuration to
>        datapath.
> * p06: Add utility function to help compare two simaps.
> * p07-08: Kernel datapath support for the new ct action attribute.
> * p09: Translate timeout policy in ofproto-dpif-xlate and system traffic
> test.
>
> v3->v4:
> * ofproto-dpif
>     - Probe datapath for timeout policy support.
>     - With the probing information only translate timeout policy when
>       the datapath is supported.
>     - Resolve the old kernel compatibility issue reported by Darrell.
> * system-traffic
>     - Simplify the testing script (diff from Darrell).
> * Address various code changes as in the mailing list discussion.
>
> v2->v3
> * ovsdb schema
>     - Fold in changes from Justin.
>     - Make ct timeout policy key to be in a pre-defined set.
> * ovs-vsctl
>     - Bug fixes.
> * ct-dpif
>     - Fold in diff suggestion from Justin.
> * bridge, ofproto-dpif
>     - Restruct the ofproto and dpif layer support for zone based timeout
>       policy.
> * system traffic test
>     - Fix bug reported by Darrell.
> * Address review comments from Justin and Darrell.
>
> v1->v2
>
> * ovs-vsctl
>     - Remove add-dp,del-dp,list-dp ovs-vsctl commands.
>     - Add --may-exist and --if-exists to ovs-vsctl add-zone-tp command.
>     - Improve ovs-vsctl test.
> * ct-dpif, dpif-netlink
>     - Remove support to change default timeout policy in the datapath.
>     - Squash ct-dpif and dpif-netlink layer implementation altogether.
>     - Address review comments from William.
> * ofproto-dpif
>     - Remove changes from datapath-config module to ofproto-dpif layer.
>     - Maintain zone-based timeout policy in dpif-backer since this is
>       per datapath type configuration. This will not break the OVS
>       hierarchy as Ilya suggested.
>     - Allocate timeout policy id using id_pool instead of idl_seqno
>       as Darrell suggested.
>     - Add a timeout policy sweep function that clean up unnecessary
>       timeout policy regularly in the datapath.
> * ofproto-dpif-xlate
>     - Only translate ct timeout policy if it is a ct commit action
>       in kernel datapath.
> * system-traffic test
>     - Update system traffic test with low level ovs-vsctl command.
>     - Make system traffic test to be datapath type agnostic.
>     - Improve system traffic test as Darrell suggested.
> * Rebase to master
>
> Ben Pfaff (1):
>   simap: Add utility function to help compare two simaps.
>
> Justin Pettit (1):
>   ovs-vswitchd: Add Datapath, CT_Zone, and CT_Zone_Policy tables.
>
> William Tu (1):
>   ovs-vsctl: Add conntrack zone commands.
>
> Yi-Hung Wei (6):
>   ct-dpif: Export ct_dpif_format_ipproto()
>   ct-dpif, dpif-netlink: Add conntrack timeout policy support
>   ofproto-dpif: Consume CT_Zone, and CT_Timeout_Policy tables
>   datapath: compat: Backport nf_conntrack_timeout support
>   datapath: Add support for conntrack timeout policy
>   ofproto-dpif-xlate: Translate timeout policy in ct action
>
>  Documentation/faq/releases.rst                     |   3 +-
>  NEWS                                               |   1 +
>  acinclude.m4                                       |   7 +
>  datapath-windows/include/OvsDpInterfaceCtExt.h     | 114 +++++
>  datapath-windows/ovsext/Netlink/NetlinkProto.h     |   8 +-
>  datapath/conntrack.c                               |  30 +-
>  datapath/linux/Modules.mk                          |   2 +
>  datapath/linux/compat/include/linux/openvswitch.h  |   4 +
>  .../include/net/netfilter/nf_conntrack_timeout.h   |  34 ++
>  datapath/linux/compat/nf_conntrack_timeout.c       | 102 +++++
>  include/windows/automake.mk                        |   1 +
>  .../windows/linux/netfilter/nfnetlink_cttimeout.h  |   0
>  lib/ct-dpif.c                                      | 116 ++++-
>  lib/ct-dpif.h                                      |  60 +++
>  lib/dpif-netdev.c                                  |  11 +
>  lib/dpif-netlink.c                                 | 490
> +++++++++++++++++++++
>  lib/dpif-netlink.h                                 |   1 -
>  lib/dpif-provider.h                                |  54 +++
>  lib/netlink-conntrack.c                            | 301 +++++++++++++
>  lib/netlink-conntrack.h                            |  27 +-
>  lib/netlink-protocol.h                             |   8 +-
>  lib/odp-util.c                                     |  29 +-
>  lib/simap.c                                        |  15 +-
>  lib/simap.h                                        |   1 +
>  ofproto/ofproto-dpif-xlate.c                       |  23 +
>  ofproto/ofproto-dpif.c                             | 383 ++++++++++++++++
>  ofproto/ofproto-dpif.h                             |  22 +-
>  ofproto/ofproto-provider.h                         |  10 +
>  ofproto/ofproto.c                                  |  26 ++
>  ofproto/ofproto.h                                  |   5 +
>  tests/odp.at                                       |   1 +
>  tests/ovs-vsctl.at                                 |  34 +-
>  tests/system-kmod-macros.at                        |  20 +
>  tests/system-traffic.at                            |  66 +++
>  tests/system-userspace-macros.at                   |  19 +
>  utilities/ovs-vsctl.8.in                           |  26 ++
>  utilities/ovs-vsctl.c                              | 202 ++++++++-
>  vswitchd/bridge.c                                  | 198 +++++++++
>  vswitchd/vswitch.ovsschema                         |  51 ++-
>  vswitchd/vswitch.xml                               | 275 ++++++++++--
>  40 files changed, 2708 insertions(+), 72 deletions(-)
>  create mode 100644
> datapath/linux/compat/include/net/netfilter/nf_conntrack_timeout.h
>  create mode 100644 datapath/linux/compat/nf_conntrack_timeout.c
>  create mode 100644 include/windows/linux/netfilter/nfnetlink_cttimeout.h
>
> --
> 2.7.4
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>