diff mbox series

[ovs-dev,v4,01/17] python: add generic Key-Value parser

Message ID 20220616063247.517147-2-amorenoz@redhat.com
State Changes Requested
Headers show
Series python: add flow parsing library | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Adrian Moreno June 16, 2022, 6:32 a.m. UTC
Most of ofproto and dpif flows are based on key-value pairs. These
key-value pairs can be represented in several ways, eg: key:value,
key=value, key(value).

Add the following classes that allow parsing of key-value strings:
* KeyValue: holds a key-value pair
* KeyMetadata: holds some metadata associated with a KeyValue such as
  the original key and value strings and their position in the global
  string
* KVParser: is able to parse a string and extract it's key-value pairs
  as KeyValue instances. Before creating the KeyValue instance it tries
  to decode the value via the KVDecoders
* KVDecoders holds a number of decoders that KVParser can use to decode
  key-value pairs. It accepts a dictionary of keys and callables to
  allow users to specify what decoder (i.e: callable) to use for each
  key

Also, flake8 seems to be incorrectly reporting an error (E203) in:
"slice[index + offset : index + offset]" which is PEP8 compliant. So,
ignore this error.

Acked-by: Terry Wilson <twilson@redhat.com>
Acked-by: Eelco Chaudron <echaudro@redhat.com>
Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 Makefile.am                 |   3 +-
 python/automake.mk          |   6 +-
 python/ovs/flow/__init__.py |   0
 python/ovs/flow/decoders.py |  18 +++
 python/ovs/flow/kv.py       | 314 ++++++++++++++++++++++++++++++++++++
 python/setup.py             |   2 +-
 6 files changed, 340 insertions(+), 3 deletions(-)
 create mode 100644 python/ovs/flow/__init__.py
 create mode 100644 python/ovs/flow/decoders.py
 create mode 100644 python/ovs/flow/kv.py
diff mbox series

Patch

diff --git a/Makefile.am b/Makefile.am
index cb8076433..4f51d225e 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -391,6 +391,7 @@  ALL_LOCAL += flake8-check
 #   E128 continuation line under-indented for visual indent
 #   E129 visually indented line with same indent as next logical line
 #   E131 continuation line unaligned for hanging indent
+#   E203 whitespace before ':'
 #   E722 do not use bare except, specify exception instead
 #   W503 line break before binary operator
 #   W504 line break after binary operator
@@ -403,7 +404,7 @@  ALL_LOCAL += flake8-check
 #   H233 Python 3.x incompatible use of print operator
 #   H238 old style class declaration, use new style (inherit from `object`)
 FLAKE8_SELECT = H231,H232,H233,H238
-FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,W503,W504,F811,D,H,I
+FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E203,E722,W503,W504,F811,D,H,I
 flake8-check: $(FLAKE8_PYFILES)
 	$(FLAKE8_WERROR)$(AM_V_GEN) \
 	  src='$^' && \
diff --git a/python/automake.mk b/python/automake.mk
index 767512f17..e3da6bdbe 100644
--- a/python/automake.mk
+++ b/python/automake.mk
@@ -16,7 +16,6 @@  ovs_pyfiles = \
 	python/ovs/compat/sortedcontainers/sorteddict.py \
 	python/ovs/compat/sortedcontainers/sortedset.py \
 	python/ovs/daemon.py \
-	python/ovs/fcntl_win.py \
 	python/ovs/db/__init__.py \
 	python/ovs/db/custom_index.py \
 	python/ovs/db/data.py \
@@ -26,6 +25,10 @@  ovs_pyfiles = \
 	python/ovs/db/schema.py \
 	python/ovs/db/types.py \
 	python/ovs/fatal_signal.py \
+	python/ovs/fcntl_win.py \
+	python/ovs/flow/__init__.py \
+	python/ovs/flow/decoders.py \
+	python/ovs/flow/kv.py \
 	python/ovs/json.py \
 	python/ovs/jsonrpc.py \
 	python/ovs/ovsuuid.py \
@@ -42,6 +45,7 @@  ovs_pyfiles = \
 	python/ovs/version.py \
 	python/ovs/vlog.py \
 	python/ovs/winutils.py
+
 # These python files are used at build time but not runtime,
 # so they are not installed.
 EXTRA_DIST += \
diff --git a/python/ovs/flow/__init__.py b/python/ovs/flow/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/python/ovs/flow/decoders.py b/python/ovs/flow/decoders.py
new file mode 100644
index 000000000..0c2259c76
--- /dev/null
+++ b/python/ovs/flow/decoders.py
@@ -0,0 +1,18 @@ 
+"""Defines helpful decoders that can be used to decode information from the
+flows.
+
+A decoder is generally a callable that accepts a string and returns the value
+object.
+"""
+
+
+def decode_default(value):
+    """Default decoder.
+
+    It tries to convert into an integer value and, if it fails, just
+    returns the string.
+    """
+    try:
+        return int(value, 0)
+    except ValueError:
+        return value
diff --git a/python/ovs/flow/kv.py b/python/ovs/flow/kv.py
new file mode 100644
index 000000000..cceb95e43
--- /dev/null
+++ b/python/ovs/flow/kv.py
@@ -0,0 +1,314 @@ 
+"""Common helper classes for flow Key-Value parsing."""
+
+import functools
+import re
+
+from ovs.flow.decoders import decode_default
+
+
+class ParseError(RuntimeError):
+    """Exception raised when an error occurs during parsing."""
+
+    pass
+
+
+class KeyMetadata(object):
+    """Class for keeping key metadata.
+
+    Attributes:
+        kpos (int): The position of the keyword in the parent string.
+        vpos (int): The position of the value in the parent string.
+        kstring (string): The keyword string as found in the flow string.
+        vstring (string): The value as found in the flow string.
+        delim (string): Optional, the string used as delimiter between the key
+            and the value.
+        end_delim (string): Optional, the string used as end delimiter
+    """
+
+    def __init__(self, kpos, vpos, kstring, vstring, delim="", end_delim=""):
+        """Constructor."""
+        self.kpos = kpos
+        self.vpos = vpos
+        self.kstring = kstring
+        self.vstring = vstring
+        self.delim = delim
+        self.end_delim = end_delim
+
+    def __str__(self):
+        return "key: [{},{}), val:[{}, {})".format(
+            self.kpos,
+            self.kpos + len(self.kstring),
+            self.vpos,
+            self.vpos + len(self.vstring),
+        )
+
+    def __repr__(self):
+        return "{}('{}')".format(self.__class__.__name__, self)
+
+
+class KeyValue(object):
+    """Class for keeping key-value data.
+
+    Attributes:
+        key (str): The key string.
+        value (any): The value data.
+        meta (KeyMetadata): The key metadata.
+    """
+
+    def __init__(self, key, value, meta=None):
+        """Constructor."""
+        self.key = key
+        self.value = value
+        self.meta = meta
+
+    def __str__(self):
+        return "{}: {} ({})".format(self.key, str(self.value), str(self.meta))
+
+    def __repr__(self):
+        return "{}('{}')".format(self.__class__.__name__, self)
+
+
+class KVDecoders(object):
+    """KVDecoders class is used by KVParser to select how to decode the value
+    of a specific keyword.
+
+    A decoder is simply a function that accepts a value string and returns
+    the value objects to be stored.
+    The returned value may be of any type.
+
+    Decoders may return a KeyValue instance to indicate that the keyword should
+    also be modified to match the one provided in the returned KeyValue.
+
+    The decoder to be used will be selected using the key as an index. If not
+    found, the default decoder will be used. If free keys are found (i.e:
+    keys without a value), the default_free decoder will be used. For that
+    reason, the default_free decoder, must return both the key and value to be
+    stored.
+
+    Args:
+        decoders (dict): Optional; A dictionary of decoders indexed by keyword.
+        default (callable): Optional; A decoder used if a match is not found in
+            configured decoders. If not provided, the default behavior is to
+            try to decode the value into an integer and, if that fails,
+            just return the string as-is.
+        default_free (callable): Optional; The decoder used if a match is not
+            found in configured decoders and it's a free value (e.g:
+            a value without a key) Defaults to returning the free value as
+            keyword and "True" as value.
+            The callable must accept a string and return a key-value pair.
+    """
+
+    def __init__(self, decoders=None, default=None, default_free=None):
+        self._decoders = decoders or dict()
+        self._default = default or decode_default
+        self._default_free = default_free or self._default_free_decoder
+
+    def decode(self, keyword, value_str):
+        """Decode a keyword and value.
+
+        Args:
+            keyword (str): The keyword whose value is to be decoded.
+            value_str (str): The value string.
+
+        Returns:
+            The key (str) and value(any) to be stored.
+        """
+
+        decoder = self._decoders.get(keyword)
+        if decoder:
+            result = decoder(value_str)
+            if isinstance(result, KeyValue):
+                keyword = result.key
+                value = result.value
+            else:
+                value = result
+
+            return keyword, value
+        else:
+            if value_str:
+                return keyword, self._default(value_str)
+            else:
+                return self._default_free(keyword)
+
+    @staticmethod
+    def _default_free_decoder(key):
+        """Default decoder for free keywords."""
+        return key, True
+
+
+delim_pattern = re.compile(r"(\(|=|:|,|\n|\r|\t)")
+parenthesis = re.compile(r"(\(|\))")
+end_pattern = re.compile(r"( |,|\n|\r|\t)")
+separators = (" ", ",")
+end_of_string = (",", "\n", "\t", "\r", "")
+
+
+class KVParser(object):
+    """KVParser parses a string looking for key-value pairs.
+
+    Args:
+        string (str): The string to parse.
+        decoders (KVDecoders): Optional; the KVDecoders instance to use.
+    """
+
+    def __init__(self, string, decoders=None):
+        """Constructor."""
+        self._decoders = decoders or KVDecoders()
+        self._keyval = list()
+        self._string = string
+
+    def keys(self):
+        return list(kv.key for kv in self._keyval)
+
+    def kv(self):
+        return self._keyval
+
+    def __iter__(self):
+        return iter(self._keyval)
+
+    def parse(self):
+        """Parse the key-value pairs in string.
+
+        The input string is assumed to contain a list of comma (or space)
+        separated key-value pairs.
+
+        Key-values pairs can have multiple different delimiters, eg:
+            "key1:value1,key2=value2,key3(value3)".
+
+        Also, we can stumble upon a "free" keywords, e.g:
+            "key1=value1,key2=value2,free_keyword".
+        We consider this as keys without a value.
+
+        So, to parse the string we do the following until the end of the
+        string is found:
+
+            1 - Skip any leading comma's or spaces.
+            2 - Find the next delimiter (or end_of_string character).
+            3 - Depending on the delimiter, obtain the key and the value.
+                For instance, if the delimiter is "(", find the next matching
+                ")".
+            4 - Use the KVDecoders to decode the key-value.
+            5 - Store the KeyValue object with the corresponding metadata.
+
+        Raises:
+            ParseError if any parsing error occurs.
+        """
+        kpos = 0
+        while kpos < len(self._string) and self._string[kpos] != "\n":
+            keyword = ""
+            delimiter = ""
+            rest = ""
+
+            # 1. Skip separator characters.
+            if self._string[kpos] in separators:
+                kpos += 1
+                continue
+
+            # 2. Find the next delimiter or end of string character.
+            try:
+                keyword, delimiter, rest = delim_pattern.split(
+                    self._string[kpos:], 1
+                )
+            except ValueError:
+                keyword = self._string[kpos:]  # Free keyword
+
+            # 3. Extract the value from the rest of the string.
+            value_str = ""
+            vpos = kpos + len(keyword) + 1
+            end_delimiter = ""
+
+            if delimiter in ("=", ":"):
+                # If the delimiter is ':' or '=', the end of the value is the
+                # end of the string or a ', '.
+                value_parts = end_pattern.split(rest, 1)
+                value_str = value_parts[0]
+                next_kpos = vpos + len(value_str)
+
+            elif delimiter == "(":
+                # Find matching ")".
+                level = 1
+                index = 0
+                value_parts = parenthesis.split(rest)
+                for val in value_parts:
+                    if val == "(":
+                        level += 1
+                    elif val == ")":
+                        level -= 1
+                    index += len(val)
+                    if level == 0:
+                        break
+
+                if level != 0:
+                    raise ParseError(
+                        "Error parsing string {}: "
+                        "Failed to find matching ')' in {}".format(
+                            self._string, rest
+                        )
+                    )
+
+                value_str = rest[: index - 1]
+                next_kpos = vpos + len(value_str) + 1
+                end_delimiter = ")"
+
+                # Exceptionally, if after the () we find -> {}, do not treat
+                # the content of the parenthesis as the value, consider
+                # ({})->{} as the string value.
+                if index < len(rest) - 2 and rest[index : index + 2] == "->":
+                    extra_val = rest[index + 2 :].split(",")[0]
+                    value_str = "({})->{}".format(value_str, extra_val)
+                    # remove the first "(".
+                    vpos -= 1
+                    next_kpos = vpos + len(value_str)
+                    end_delimiter = ""
+
+            elif delimiter in end_of_string:
+                # Key without a value.
+                next_kpos = kpos + len(keyword)
+                vpos = -1
+
+            # 4. Use KVDecoders to decode the key-value.
+            try:
+                key, val = self._decoders.decode(keyword, value_str)
+            except Exception as e:
+                raise ParseError(
+                    "Error parsing key-value ({}, {})".format(
+                        keyword, value_str
+                    )
+                ) from e
+
+            # Store the KeyValue object with the corresponding metadata.
+            meta = KeyMetadata(
+                kpos=kpos,
+                vpos=vpos,
+                kstring=keyword,
+                vstring=value_str,
+                delim=delimiter,
+                end_delim=end_delimiter,
+            )
+
+            self._keyval.append(KeyValue(key, val, meta))
+
+            kpos = next_kpos
+
+
+def decode_nested_kv(decoders, value):
+    """A key-value decoder that extracts nested key-value pairs and returns
+    them in a dictionary.
+
+    Args:
+        decoders (KVDecoders): The KVDecoders to use.
+        value (str): The value string to decode.
+    """
+    if not value:
+        # Mark as flag
+        return True
+
+    parser = KVParser(value, decoders)
+    parser.parse()
+    return {kv.key: kv.value for kv in parser.kv()}
+
+
+def nested_kv_decoder(decoders=None):
+    """Helper function that creates a nested kv decoder with given
+    KVDecoders."""
+    return functools.partial(decode_nested_kv, decoders)
diff --git a/python/setup.py b/python/setup.py
index cfe01763f..7ac3c3662 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -71,7 +71,7 @@  setup_args = dict(
     author='Open vSwitch',
     author_email='dev@openvswitch.org',
     packages=['ovs', 'ovs.compat', 'ovs.compat.sortedcontainers',
-              'ovs.db', 'ovs.unixctl'],
+              'ovs.db', 'ovs.unixctl', 'ovs.flow'],
     keywords=['openvswitch', 'ovs', 'OVSDB'],
     license='Apache 2.0',
     classifiers=[