diff mbox series

[ovs-dev,v1,01/18] python: add generic Key-Value parser

Message ID 20211122112256.2011194-2-amorenoz@redhat.com
State Changes Requested
Headers show
Series python: add flow parsing library | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed

Commit Message

Adrian Moreno Nov. 22, 2021, 11:22 a.m. UTC
Most of ofproto and dpif flows are based on key-value pairs. These
key-value pairs can be represented in several ways, eg: key:value,
key=value, key(value).

Add the following classes that allow parsing of key-value strings:
* KeyValue: holds a key-value pair
* KeyMetadata: holds some metadata associated with a KeyValue such as
  the original key and value strings and their position in the global
  string
* KVParser: is able to parse a string and extract it's key-value pairs
  as KeyValue instances. Before creating the KeyValue instance it tries
  to decode the value via the KVDecoders
* KVDecoders holds a number of decoders that KVParser can use to decode
  key-value pairs. It accepts a dictionary of keys and callables to
  allow users to specify what decoder (i.e: callable) to use for each
  key

Also, flake8 seems to be incorrectly reporting an error (E203) in:
"slice[index + offset : index + offset]" which is PEP8 compliant. So,
ignore this error.

Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
---
 Makefile.am                  |   3 +-
 python/automake.mk           |   6 +-
 python/ovs/flows/__init__.py |   0
 python/ovs/flows/decoders.py |  19 +++
 python/ovs/flows/kv.py       | 282 +++++++++++++++++++++++++++++++++++
 python/setup.py              |   2 +-
 6 files changed, 309 insertions(+), 3 deletions(-)
 create mode 100644 python/ovs/flows/__init__.py
 create mode 100644 python/ovs/flows/decoders.py
 create mode 100644 python/ovs/flows/kv.py

Comments

Eelco Chaudron Dec. 10, 2021, 3:58 p.m. UTC | #1
On 22 Nov 2021, at 12:22, Adrian Moreno wrote:

> Most of ofproto and dpif flows are based on key-value pairs. These
> key-value pairs can be represented in several ways, eg: key:value,
> key=value, key(value).
>
> Add the following classes that allow parsing of key-value strings:
> * KeyValue: holds a key-value pair
> * KeyMetadata: holds some metadata associated with a KeyValue such as
>   the original key and value strings and their position in the global
>   string
> * KVParser: is able to parse a string and extract it's key-value pairs
>   as KeyValue instances. Before creating the KeyValue instance it tries
>   to decode the value via the KVDecoders
> * KVDecoders holds a number of decoders that KVParser can use to decode
>   key-value pairs. It accepts a dictionary of keys and callables to
>   allow users to specify what decoder (i.e: callable) to use for each
>   key
>
> Also, flake8 seems to be incorrectly reporting an error (E203) in:
> "slice[index + offset : index + offset]" which is PEP8 compliant. So,
> ignore this error.

Slowly starting to review this series :)

> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
> ---
>  Makefile.am                  |   3 +-
>  python/automake.mk           |   6 +-
>  python/ovs/flows/__init__.py |   0
>  python/ovs/flows/decoders.py |  19 +++
>  python/ovs/flows/kv.py       | 282 +++++++++++++++++++++++++++++++++++
>  python/setup.py              |   2 +-
>  6 files changed, 309 insertions(+), 3 deletions(-)
>  create mode 100644 python/ovs/flows/__init__.py
>  create mode 100644 python/ovs/flows/decoders.py
>  create mode 100644 python/ovs/flows/kv.py
>
> diff --git a/Makefile.am b/Makefile.am
> index cb8076433..e38dd607c 100644
> --- a/Makefile.am
> +++ b/Makefile.am
> @@ -392,6 +392,7 @@ ALL_LOCAL += flake8-check
>  #   E129 visually indented line with same indent as next logical line
>  #   E131 continuation line unaligned for hanging indent
>  #   E722 do not use bare except, specify exception instead
> +#   E203 whitespace before ':'

Can you put this in numerical order?

>  #   W503 line break before binary operator
>  #   W504 line break after binary operator
>  # F*** -- warnings native to flake8
> @@ -403,7 +404,7 @@ ALL_LOCAL += flake8-check
>  #   H233 Python 3.x incompatible use of print operator
>  #   H238 old style class declaration, use new style (inherit from `object`)
>  FLAKE8_SELECT = H231,H232,H233,H238
> -FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,W503,W504,F811,D,H,I
> +FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,E203,W503,W504,F811,D,H,I

Same here, so E129,E131,E203,E722,W503,...

>  flake8-check: $(FLAKE8_PYFILES)
>  	$(FLAKE8_WERROR)$(AM_V_GEN) \
>  	  src='$^' && \
> diff --git a/python/automake.mk b/python/automake.mk
> index 767512f17..13aa2b4c3 100644
> --- a/python/automake.mk
> +++ b/python/automake.mk
> @@ -41,7 +41,11 @@ ovs_pyfiles = \
>  	python/ovs/util.py \
>  	python/ovs/version.py \
>  	python/ovs/vlog.py \
> -	python/ovs/winutils.py
> +	python/ovs/winutils.py \
> +	python/ovs/flows/__init__.py \
> +	python/ovs/flows/decoders.py \
> +	python/ovs/flows/kv.py

Please also add these in alphabetical order. And maybe also move python/ovs/fcntl_win.py to be in order while you are at it.

> +
>  # These python files are used at build time but not runtime,
>  # so they are not installed.
>  EXTRA_DIST += \
> diff --git a/python/ovs/flows/__init__.py b/python/ovs/flows/__init__.py
> new file mode 100644
> index 000000000..e69de29bb
> diff --git a/python/ovs/flows/decoders.py b/python/ovs/flows/decoders.py
> new file mode 100644
> index 000000000..bfb64e70e
> --- /dev/null
> +++ b/python/ovs/flows/decoders.py
> @@ -0,0 +1,19 @@
> +""" Defines helpful decoders that can be used to decode information from the
> +flows
> +
> +A decoder is generally a callable that accepts a string and returns the value
> +object.
> +"""
> +
> +
> +def decode_default(value):
> +    """Default decoder.
> +
> +    It tries to convert into an integer value and, if it fails, just
> +    returns the string.
> +    """
> +    try:
> +        ival = int(value, 0)
> +        return ival

Would return int(value, 0) work?

> +    except ValueError:
> +        return value
> diff --git a/python/ovs/flows/kv.py b/python/ovs/flows/kv.py
> new file mode 100644
> index 000000000..0093a4e90
> --- /dev/null
> +++ b/python/ovs/flows/kv.py
> @@ -0,0 +1,282 @@
> +""" Common helper classes for flow (ofproto/dpif) parsing
> +"""
> +
> +import re
> +import functools
> +
> +from ovs.flows.decoders import decode_default
> +
> +
> +class ParseError(RuntimeError):
> +    """Exception raised when an error occurs during parsing."""
> +
> +    pass
> +
> +
> +class KeyMetadata:
> +    """Class for keeping key metadata.
> +
> +    Attributes:
> +        kpos (int): The position of the keyword in the parent string.
> +        vpos (int): The position of the value in the parent string.
> +        kstring (string): The keyword string as found in the flow string.
> +        vstring (string): The value as found in the flow string.
> +        end_del (bool): Whether the key has end delimiter.

end_del does not exist, there is end_delim, but below its not a bool but a character?


Is delim also an attribute you could use? If it’s supposed to be private maybe mark it as such (__)?

> +    """
> +
> +    def __init__(self, kpos, vpos, kstring, vstring, delim="", end_delim=""):
> +        """Constructor"""
> +        self.kpos = kpos
> +        self.vpos = vpos
> +        self.kstring = kstring
> +        self.vstring = vstring
> +        self.delim = delim
> +        self.end_delim = end_delim
> +
> +    def __str__(self):
> +        return "key: [{},{}), val:[{}, {})".format(
> +            self.kpos,
> +            self.kpos + len(self.kstring),
> +            self.vpos,
> +            self.vpos + len(self.vstring),
> +        )
> +
> +    def __repr__(self):
> +        return "%s('%s')" % (self.__class__.__name__, self)
> +
> +
> +class KeyValue:
> +    """Class for keeping key-value data
> +
> +    Attributes:
> +        key (str): The key string.
> +        value (any): The value data.
> +        meta (KeyMetadata): The key metadata.
> +    """
> +
> +    def __init__(self, key, value, meta=None):
> +        """Constructor"""
> +        self.key = key
> +        self.value = value
> +        self.meta = meta
> +
> +    def __str__(self):
> +        return "{}: {} ({})".format(self.key, str(self.value), str(self.meta))
> +
> +    def __repr__(self):
> +        return "%s('%s')" % (self.__class__.__name__, self)
> +
> +
> +class KVDecoders:
> +    """KVDecoders class is used by KVParser to select how to decoode the value
> +    of a specific keyword.
> +
> +    A decoder is simply a function that accepts a value string
> +    and returns the value objects to be stored.
> +    The returned value may be of any type.
> +
> +    Decoders may return a KeyValue instance to indicate that the keyword should
> +    also be modified to match the one provided in the returned KeyValue
> +
> +    The free_decoder, however, must return the key and value to be stored
> +
> +    Args:
> +        decoders (dict): Optional; A dictionary of decoders indexed by keyword.
> +        default (callable): Optional; A decoder used if a match is not found in
> +            configured decoders. If not provided, the default behavior is to
> +            try to decode the value into an integer and, if that fails,
> +            just return the string as-is.
> +        default_free (callable): Optional; The decoder used if a match is not
> +            found in configured decoders and it's a free value (e.g:
> +            a value without a key) Defaults to returning the free value as
> +            keyword and "True" as value.
> +            The callable must accept a string and return a key, value pair
> +    """
> +
> +    def __init__(self, decoders=None, default=None, default_free=None):
> +        self._decoders = decoders or dict()
> +        self._default = default or decode_default
> +        self._default_free = default_free or self._default_free_decoder
> +
> +    def decode(self, keyword, value_str):
> +        """Decode a keyword and value.
> +
> +        Args:
> +            keyword (str): The keyword whose value is to be decoded.
> +            value_str (str): The value string.
> +
> +        Returns:
> +            The key (str) and value(any) to be stored.
> +        """
> +
> +        decoder = self._decoders.get(keyword)
> +        if decoder:
> +            result = decoder(value_str)
> +            if isinstance(result, KeyValue):
> +                keyword = result.key
> +                value = result.value
> +            else:
> +                value = result
> +
> +            return keyword, value
> +        else:
> +            if value_str:
> +                return keyword, self._default(value_str)
> +            else:
> +                return self._default_free(keyword)
> +
> +    @staticmethod
> +    def _default_free_decoder(key):
> +        """Default decoder for free kewords."""
> +        return key, True
> +
> +
> +delim_pattern = re.compile(r"(\(|=|:|,|\n|\r|\t|$)")
> +parenthesys_pattern = re.compile(r"(\(|\))")
> +end_pattern = re.compile(r"( |,|\n|\r|\t)")
> +
> +
> +class KVParser:
> +    """KVParser parses a string looking for key-value pairs.

Guess here we could use some explanation on how it parses lines. Or else we have to keep reverse-engineering the code below ;)

> +
> +    Args:
> +        decoders (KVDecoders): Optional; the KVDecoders instance to use.
> +    """
> +
> +    def __init__(self, decoders=None):
> +        """Constructor"""
> +        self._decoders = decoders or KVDecoders()
> +        self._keyval = list()
> +
> +    def keys(self):
> +        return list(kv.key for kv in self._keyval)
> +
> +    def kv(self):
> +        return self._keyval
> +
> +    def __iter__(self):
> +        return iter(self._keyval)
> +
> +    def parse(self, string):
> +        """Parse the key-value pairs in string.
> +
> +        Args:
> +            string (str): the string to parse.

“The string..."

> +
> +        Raises:
> +            ParseError if any parsing error occurs.
> +        """
> +        kpos = 0
> +        while kpos < len(string) and string[kpos] != "\n":
> +            # strip string

“#Stri...”

> +            if string[kpos] == "," or string[kpos] == " ":
> +                kpos += 1
> +                continue
> +
> +            split_parts = delim_pattern.split(string[kpos:], 1)
> +            # the delimiter should be included in the returned list

“# The delimiter…” “list.”

I think the OVS comments always end with a dot.

> +            if len(split_parts) < 3:
> +                break
> +
> +            keyword = split_parts[0]
> +            delimiter = split_parts[1]
> +            rest = split_parts[2]
> +
> +            value_str = ""
> +            vpos = kpos + len(keyword) + 1
> +            end_delimiter = ""
> +
> +            # Figure out the end of the value

“# Figure out the end of the value.”

> +            # If the delimiter is ':' or '=', the end of the value is the end
> +            # of the string or a ', '
> +            if delimiter in ("=", ":"):
> +                value_parts = end_pattern.split(rest, 1)
> +                value_str = value_parts[0] if len(value_parts) == 3 else rest
> +                next_kpos = vpos + len(value_str)
> +
> +            elif delimiter == "(":
> +                # Find the next ')'
> +                level = 1
> +                index = 0
> +                value_parts = parenthesys_pattern.split(rest)
> +                for val in value_parts:
> +                    if val == "(":
> +                        level += 1
> +                    elif val == ")":
> +                        level -= 1
> +                    index += len(val)
> +                    if level == 0:
> +                        break
> +
> +                if level != 0:
> +                    raise ParseError(
> +                        "Error parsing string {}: "
> +                        "Failed to find matching ')' in {}".format(
> +                            string, rest
> +                        )
> +                    )
> +
> +                value_str = rest[: index - 1]
> +                next_kpos = vpos + len(value_str) + 1
> +                end_delimiter = ")"
> +
> +                # Exceptionally, if after the () we find -> {}, do not treat
> +                # the content of the parenthesis as the value, consider
> +                # ({})->{} as the string value
> +                if index < len(rest) - 2 and rest[index : index + 2] == "->":
> +                    extra_val = rest[index + 2 :].split(",")[0]
> +                    value_str = "({})->{}".format(value_str, extra_val)
> +                    # remove the first "("
> +                    vpos -= 1
> +                    next_kpos = vpos + len(value_str)
> +                    end_delimiter = ""
> +
> +            elif delimiter in (",", "\n", "\t", "\r", ""):
> +                # key with no value
> +                next_kpos = kpos + len(keyword)
> +                vpos = -1
> +
> +            try:
> +                key, val = self._decoders.decode(keyword, value_str)
> +            except Exception as e:
> +                raise ParseError(
> +                    "Error parsing key-value ({}, {})".format(
> +                        keyword, value_str
> +                    )
> +                ) from e
> +
> +            meta = KeyMetadata(
> +                kpos=kpos,
> +                vpos=vpos,
> +                kstring=keyword,
> +                vstring=value_str,
> +                delim=delimiter,
> +                end_delim=end_delimiter,
> +            )
> +
> +            self._keyval.append(KeyValue(key, val, meta))
> +
> +            kpos = next_kpos
> +
> +
> +def decode_nested_kv(decoders, value):
> +    """A key-value decoder that extracts nested key-value pairs and returns
> +    them in a dictionary

At . At the end.

> +
> +    Args:
> +        decoders (KVDecoders): the KVDecoders to use.
> +        value (str): the value string to decode.

Capital T for both arguments The

> +    """
> +    if not value:
> +        # Mark as flag
> +        return True
> +
> +    parser = KVParser(decoders)
> +    parser.parse(value)
> +    return {kv.key: kv.value for kv in parser.kv()}
> +
> +
> +def nested_kv_decoder(decoders=None):
> +    """Helper function that creates a nested kv decoder with given
> +    KVDecoders"""

End line with .

> +    return functools.partial(decode_nested_kv, decoders)
> diff --git a/python/setup.py b/python/setup.py
> index cfe01763f..0e6b0ea39 100644
> --- a/python/setup.py
> +++ b/python/setup.py
> @@ -71,7 +71,7 @@ setup_args = dict(
>      author='Open vSwitch',
>      author_email='dev@openvswitch.org',
>      packages=['ovs', 'ovs.compat', 'ovs.compat.sortedcontainers',
> -              'ovs.db', 'ovs.unixctl'],
> +              'ovs.db', 'ovs.unixctl', 'ovs.flows'],
>      keywords=['openvswitch', 'ovs', 'OVSDB'],
>      license='Apache 2.0',
>      classifiers=[
> -- 
> 2.31.1
Adrian Moreno Dec. 20, 2021, 9:21 a.m. UTC | #2
On 12/10/21 16:58, Eelco Chaudron wrote:
> 
> 
> On 22 Nov 2021, at 12:22, Adrian Moreno wrote:
> 
>> Most of ofproto and dpif flows are based on key-value pairs. These
>> key-value pairs can be represented in several ways, eg: key:value,
>> key=value, key(value).
>>
>> Add the following classes that allow parsing of key-value strings:
>> * KeyValue: holds a key-value pair
>> * KeyMetadata: holds some metadata associated with a KeyValue such as
>>    the original key and value strings and their position in the global
>>    string
>> * KVParser: is able to parse a string and extract it's key-value pairs
>>    as KeyValue instances. Before creating the KeyValue instance it tries
>>    to decode the value via the KVDecoders
>> * KVDecoders holds a number of decoders that KVParser can use to decode
>>    key-value pairs. It accepts a dictionary of keys and callables to
>>    allow users to specify what decoder (i.e: callable) to use for each
>>    key
>>
>> Also, flake8 seems to be incorrectly reporting an error (E203) in:
>> "slice[index + offset : index + offset]" which is PEP8 compliant. So,
>> ignore this error.
> 
> Slowly starting to review this series :)

I know it's big. Thanks for the effort.

> 
>> Signed-off-by: Adrian Moreno <amorenoz@redhat.com>
>> ---
>>   Makefile.am                  |   3 +-
>>   python/automake.mk           |   6 +-
>>   python/ovs/flows/__init__.py |   0
>>   python/ovs/flows/decoders.py |  19 +++
>>   python/ovs/flows/kv.py       | 282 +++++++++++++++++++++++++++++++++++
>>   python/setup.py              |   2 +-
>>   6 files changed, 309 insertions(+), 3 deletions(-)
>>   create mode 100644 python/ovs/flows/__init__.py
>>   create mode 100644 python/ovs/flows/decoders.py
>>   create mode 100644 python/ovs/flows/kv.py
>>
>> diff --git a/Makefile.am b/Makefile.am
>> index cb8076433..e38dd607c 100644
>> --- a/Makefile.am
>> +++ b/Makefile.am
>> @@ -392,6 +392,7 @@ ALL_LOCAL += flake8-check
>>   #   E129 visually indented line with same indent as next logical line
>>   #   E131 continuation line unaligned for hanging indent
>>   #   E722 do not use bare except, specify exception instead
>> +#   E203 whitespace before ':'
> 
> Can you put this in numerical order?
> 
>>   #   W503 line break before binary operator
>>   #   W504 line break after binary operator
>>   # F*** -- warnings native to flake8
>> @@ -403,7 +404,7 @@ ALL_LOCAL += flake8-check
>>   #   H233 Python 3.x incompatible use of print operator
>>   #   H238 old style class declaration, use new style (inherit from `object`)
>>   FLAKE8_SELECT = H231,H232,H233,H238
>> -FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,W503,W504,F811,D,H,I
>> +FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,E203,W503,W504,F811,D,H,I
> 
> Same here, so E129,E131,E203,E722,W503,...

Sure, will do.

> 
>>   flake8-check: $(FLAKE8_PYFILES)
>>   	$(FLAKE8_WERROR)$(AM_V_GEN) \
>>   	  src='$^' && \
>> diff --git a/python/automake.mk b/python/automake.mk
>> index 767512f17..13aa2b4c3 100644
>> --- a/python/automake.mk
>> +++ b/python/automake.mk
>> @@ -41,7 +41,11 @@ ovs_pyfiles = \
>>   	python/ovs/util.py \
>>   	python/ovs/version.py \
>>   	python/ovs/vlog.py \
>> -	python/ovs/winutils.py
>> +	python/ovs/winutils.py \
>> +	python/ovs/flows/__init__.py \
>> +	python/ovs/flows/decoders.py \
>> +	python/ovs/flows/kv.py
> 
> Please also add these in alphabetical order. And maybe also move python/ovs/fcntl_win.py to be in order while you are at it.
> 

Ack. Will do.

>> +
>>   # These python files are used at build time but not runtime,
>>   # so they are not installed.
>>   EXTRA_DIST += \
>> diff --git a/python/ovs/flows/__init__.py b/python/ovs/flows/__init__.py
>> new file mode 100644
>> index 000000000..e69de29bb
>> diff --git a/python/ovs/flows/decoders.py b/python/ovs/flows/decoders.py
>> new file mode 100644
>> index 000000000..bfb64e70e
>> --- /dev/null
>> +++ b/python/ovs/flows/decoders.py
>> @@ -0,0 +1,19 @@
>> +""" Defines helpful decoders that can be used to decode information from the
>> +flows
>> +
>> +A decoder is generally a callable that accepts a string and returns the value
>> +object.
>> +"""
>> +
>> +
>> +def decode_default(value):
>> +    """Default decoder.
>> +
>> +    It tries to convert into an integer value and, if it fails, just
>> +    returns the string.
>> +    """
>> +    try:
>> +        ival = int(value, 0)
>> +        return ival
> 
> Would return int(value, 0) work?

Surely. Probably a debug leftover of having added a print in the middle :)

> 
>> +    except ValueError:
>> +        return value
>> diff --git a/python/ovs/flows/kv.py b/python/ovs/flows/kv.py
>> new file mode 100644
>> index 000000000..0093a4e90
>> --- /dev/null
>> +++ b/python/ovs/flows/kv.py
>> @@ -0,0 +1,282 @@
>> +""" Common helper classes for flow (ofproto/dpif) parsing
>> +"""
>> +
>> +import re
>> +import functools
>> +
>> +from ovs.flows.decoders import decode_default
>> +
>> +
>> +class ParseError(RuntimeError):
>> +    """Exception raised when an error occurs during parsing."""
>> +
>> +    pass
>> +
>> +
>> +class KeyMetadata:
>> +    """Class for keeping key metadata.
>> +
>> +    Attributes:
>> +        kpos (int): The position of the keyword in the parent string.
>> +        vpos (int): The position of the value in the parent string.
>> +        kstring (string): The keyword string as found in the flow string.
>> +        vstring (string): The value as found in the flow string.
>> +        end_del (bool): Whether the key has end delimiter.
> 
> end_del does not exist, there is end_delim, but below its not a bool but a character?
> 

Good catch, thanks.

> 
> Is delim also an attribute you could use? If it’s supposed to be private maybe mark it as such (__)?
> 

Yes, they can be used. For instance you could implement a printing function that 
prints the delimiters "(" and ")" with some other colors.

>> +    """
>> +
>> +    def __init__(self, kpos, vpos, kstring, vstring, delim="", end_delim=""):
>> +        """Constructor"""
>> +        self.kpos = kpos
>> +        self.vpos = vpos
>> +        self.kstring = kstring
>> +        self.vstring = vstring
>> +        self.delim = delim
>> +        self.end_delim = end_delim
>> +
>> +    def __str__(self):
>> +        return "key: [{},{}), val:[{}, {})".format(
>> +            self.kpos,
>> +            self.kpos + len(self.kstring),
>> +            self.vpos,
>> +            self.vpos + len(self.vstring),
>> +        )
>> +
>> +    def __repr__(self):
>> +        return "%s('%s')" % (self.__class__.__name__, self)
>> +
>> +
>> +class KeyValue:
>> +    """Class for keeping key-value data
>> +
>> +    Attributes:
>> +        key (str): The key string.
>> +        value (any): The value data.
>> +        meta (KeyMetadata): The key metadata.
>> +    """
>> +
>> +    def __init__(self, key, value, meta=None):
>> +        """Constructor"""
>> +        self.key = key
>> +        self.value = value
>> +        self.meta = meta
>> +
>> +    def __str__(self):
>> +        return "{}: {} ({})".format(self.key, str(self.value), str(self.meta))
>> +
>> +    def __repr__(self):
>> +        return "%s('%s')" % (self.__class__.__name__, self)
>> +
>> +
>> +class KVDecoders:
>> +    """KVDecoders class is used by KVParser to select how to decoode the value
>> +    of a specific keyword.
>> +
>> +    A decoder is simply a function that accepts a value string
>> +    and returns the value objects to be stored.
>> +    The returned value may be of any type.
>> +
>> +    Decoders may return a KeyValue instance to indicate that the keyword should
>> +    also be modified to match the one provided in the returned KeyValue
>> +
>> +    The free_decoder, however, must return the key and value to be stored
>> +
>> +    Args:
>> +        decoders (dict): Optional; A dictionary of decoders indexed by keyword.
>> +        default (callable): Optional; A decoder used if a match is not found in
>> +            configured decoders. If not provided, the default behavior is to
>> +            try to decode the value into an integer and, if that fails,
>> +            just return the string as-is.
>> +        default_free (callable): Optional; The decoder used if a match is not
>> +            found in configured decoders and it's a free value (e.g:
>> +            a value without a key) Defaults to returning the free value as
>> +            keyword and "True" as value.
>> +            The callable must accept a string and return a key, value pair
>> +    """
>> +
>> +    def __init__(self, decoders=None, default=None, default_free=None):
>> +        self._decoders = decoders or dict()
>> +        self._default = default or decode_default
>> +        self._default_free = default_free or self._default_free_decoder
>> +
>> +    def decode(self, keyword, value_str):
>> +        """Decode a keyword and value.
>> +
>> +        Args:
>> +            keyword (str): The keyword whose value is to be decoded.
>> +            value_str (str): The value string.
>> +
>> +        Returns:
>> +            The key (str) and value(any) to be stored.
>> +        """
>> +
>> +        decoder = self._decoders.get(keyword)
>> +        if decoder:
>> +            result = decoder(value_str)
>> +            if isinstance(result, KeyValue):
>> +                keyword = result.key
>> +                value = result.value
>> +            else:
>> +                value = result
>> +
>> +            return keyword, value
>> +        else:
>> +            if value_str:
>> +                return keyword, self._default(value_str)
>> +            else:
>> +                return self._default_free(keyword)
>> +
>> +    @staticmethod
>> +    def _default_free_decoder(key):
>> +        """Default decoder for free kewords."""
>> +        return key, True
>> +
>> +
>> +delim_pattern = re.compile(r"(\(|=|:|,|\n|\r|\t|$)")
>> +parenthesys_pattern = re.compile(r"(\(|\))")
>> +end_pattern = re.compile(r"( |,|\n|\r|\t)")
>> +
>> +
>> +class KVParser:
>> +    """KVParser parses a string looking for key-value pairs.
> 
> Guess here we could use some explanation on how it parses lines. Or else we have to keep reverse-engineering the code below ;)
> 

Agree. Will do.

>> +
>> +    Args:
>> +        decoders (KVDecoders): Optional; the KVDecoders instance to use.
>> +    """
>> +
>> +    def __init__(self, decoders=None):
>> +        """Constructor"""
>> +        self._decoders = decoders or KVDecoders()
>> +        self._keyval = list()
>> +
>> +    def keys(self):
>> +        return list(kv.key for kv in self._keyval)
>> +
>> +    def kv(self):
>> +        return self._keyval
>> +
>> +    def __iter__(self):
>> +        return iter(self._keyval)
>> +
>> +    def parse(self, string):
>> +        """Parse the key-value pairs in string.
>> +
>> +        Args:
>> +            string (str): the string to parse.
> 
> “The string..."
> 
>> +
>> +        Raises:
>> +            ParseError if any parsing error occurs.
>> +        """
>> +        kpos = 0
>> +        while kpos < len(string) and string[kpos] != "\n":
>> +            # strip string
> 
> “#Stri...”
> 
>> +            if string[kpos] == "," or string[kpos] == " ":
>> +                kpos += 1
>> +                continue
>> +
>> +            split_parts = delim_pattern.split(string[kpos:], 1)
>> +            # the delimiter should be included in the returned list
> 
> “# The delimiter…” “list.”
> 
> I think the OVS comments always end with a dot.
> 
>> +            if len(split_parts) < 3:
>> +                break
>> +
>> +            keyword = split_parts[0]
>> +            delimiter = split_parts[1]
>> +            rest = split_parts[2]
>> +
>> +            value_str = ""
>> +            vpos = kpos + len(keyword) + 1
>> +            end_delimiter = ""
>> +
>> +            # Figure out the end of the value
> 
> “# Figure out the end of the value.”
> 
>> +            # If the delimiter is ':' or '=', the end of the value is the end
>> +            # of the string or a ', '
>> +            if delimiter in ("=", ":"):
>> +                value_parts = end_pattern.split(rest, 1)
>> +                value_str = value_parts[0] if len(value_parts) == 3 else rest
>> +                next_kpos = vpos + len(value_str)
>> +
>> +            elif delimiter == "(":
>> +                # Find the next ')'
>> +                level = 1
>> +                index = 0
>> +                value_parts = parenthesys_pattern.split(rest)
>> +                for val in value_parts:
>> +                    if val == "(":
>> +                        level += 1
>> +                    elif val == ")":
>> +                        level -= 1
>> +                    index += len(val)
>> +                    if level == 0:
>> +                        break
>> +
>> +                if level != 0:
>> +                    raise ParseError(
>> +                        "Error parsing string {}: "
>> +                        "Failed to find matching ')' in {}".format(
>> +                            string, rest
>> +                        )
>> +                    )
>> +
>> +                value_str = rest[: index - 1]
>> +                next_kpos = vpos + len(value_str) + 1
>> +                end_delimiter = ")"
>> +
>> +                # Exceptionally, if after the () we find -> {}, do not treat
>> +                # the content of the parenthesis as the value, consider
>> +                # ({})->{} as the string value
>> +                if index < len(rest) - 2 and rest[index : index + 2] == "->":
>> +                    extra_val = rest[index + 2 :].split(",")[0]
>> +                    value_str = "({})->{}".format(value_str, extra_val)
>> +                    # remove the first "("
>> +                    vpos -= 1
>> +                    next_kpos = vpos + len(value_str)
>> +                    end_delimiter = ""
>> +
>> +            elif delimiter in (",", "\n", "\t", "\r", ""):
>> +                # key with no value
>> +                next_kpos = kpos + len(keyword)
>> +                vpos = -1
>> +
>> +            try:
>> +                key, val = self._decoders.decode(keyword, value_str)
>> +            except Exception as e:
>> +                raise ParseError(
>> +                    "Error parsing key-value ({}, {})".format(
>> +                        keyword, value_str
>> +                    )
>> +                ) from e
>> +
>> +            meta = KeyMetadata(
>> +                kpos=kpos,
>> +                vpos=vpos,
>> +                kstring=keyword,
>> +                vstring=value_str,
>> +                delim=delimiter,
>> +                end_delim=end_delimiter,
>> +            )
>> +
>> +            self._keyval.append(KeyValue(key, val, meta))
>> +
>> +            kpos = next_kpos
>> +
>> +
>> +def decode_nested_kv(decoders, value):
>> +    """A key-value decoder that extracts nested key-value pairs and returns
>> +    them in a dictionary
> 
> At . At the end.
> 
>> +
>> +    Args:
>> +        decoders (KVDecoders): the KVDecoders to use.
>> +        value (str): the value string to decode.
> 
> Capital T for both arguments The
> 
>> +    """
>> +    if not value:
>> +        # Mark as flag
>> +        return True
>> +
>> +    parser = KVParser(decoders)
>> +    parser.parse(value)
>> +    return {kv.key: kv.value for kv in parser.kv()}
>> +
>> +
>> +def nested_kv_decoder(decoders=None):
>> +    """Helper function that creates a nested kv decoder with given
>> +    KVDecoders"""
> 
> End line with .
> 
>> +    return functools.partial(decode_nested_kv, decoders)
>> diff --git a/python/setup.py b/python/setup.py
>> index cfe01763f..0e6b0ea39 100644
>> --- a/python/setup.py
>> +++ b/python/setup.py
>> @@ -71,7 +71,7 @@ setup_args = dict(
>>       author='Open vSwitch',
>>       author_email='dev@openvswitch.org',
>>       packages=['ovs', 'ovs.compat', 'ovs.compat.sortedcontainers',
>> -              'ovs.db', 'ovs.unixctl'],
>> +              'ovs.db', 'ovs.unixctl', 'ovs.flows'],
>>       keywords=['openvswitch', 'ovs', 'OVSDB'],
>>       license='Apache 2.0',
>>       classifiers=[
>> -- 
>> 2.31.1
> 

Ack all styling comments. Thanks Eelco.
diff mbox series

Patch

diff --git a/Makefile.am b/Makefile.am
index cb8076433..e38dd607c 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -392,6 +392,7 @@  ALL_LOCAL += flake8-check
 #   E129 visually indented line with same indent as next logical line
 #   E131 continuation line unaligned for hanging indent
 #   E722 do not use bare except, specify exception instead
+#   E203 whitespace before ':'
 #   W503 line break before binary operator
 #   W504 line break after binary operator
 # F*** -- warnings native to flake8
@@ -403,7 +404,7 @@  ALL_LOCAL += flake8-check
 #   H233 Python 3.x incompatible use of print operator
 #   H238 old style class declaration, use new style (inherit from `object`)
 FLAKE8_SELECT = H231,H232,H233,H238
-FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,W503,W504,F811,D,H,I
+FLAKE8_IGNORE = E121,E123,E125,E126,E127,E128,E129,E131,E722,E203,W503,W504,F811,D,H,I
 flake8-check: $(FLAKE8_PYFILES)
 	$(FLAKE8_WERROR)$(AM_V_GEN) \
 	  src='$^' && \
diff --git a/python/automake.mk b/python/automake.mk
index 767512f17..13aa2b4c3 100644
--- a/python/automake.mk
+++ b/python/automake.mk
@@ -41,7 +41,11 @@  ovs_pyfiles = \
 	python/ovs/util.py \
 	python/ovs/version.py \
 	python/ovs/vlog.py \
-	python/ovs/winutils.py
+	python/ovs/winutils.py \
+	python/ovs/flows/__init__.py \
+	python/ovs/flows/decoders.py \
+	python/ovs/flows/kv.py
+
 # These python files are used at build time but not runtime,
 # so they are not installed.
 EXTRA_DIST += \
diff --git a/python/ovs/flows/__init__.py b/python/ovs/flows/__init__.py
new file mode 100644
index 000000000..e69de29bb
diff --git a/python/ovs/flows/decoders.py b/python/ovs/flows/decoders.py
new file mode 100644
index 000000000..bfb64e70e
--- /dev/null
+++ b/python/ovs/flows/decoders.py
@@ -0,0 +1,19 @@ 
+""" Defines helpful decoders that can be used to decode information from the
+flows
+
+A decoder is generally a callable that accepts a string and returns the value
+object.
+"""
+
+
+def decode_default(value):
+    """Default decoder.
+
+    It tries to convert into an integer value and, if it fails, just
+    returns the string.
+    """
+    try:
+        ival = int(value, 0)
+        return ival
+    except ValueError:
+        return value
diff --git a/python/ovs/flows/kv.py b/python/ovs/flows/kv.py
new file mode 100644
index 000000000..0093a4e90
--- /dev/null
+++ b/python/ovs/flows/kv.py
@@ -0,0 +1,282 @@ 
+""" Common helper classes for flow (ofproto/dpif) parsing
+"""
+
+import re
+import functools
+
+from ovs.flows.decoders import decode_default
+
+
+class ParseError(RuntimeError):
+    """Exception raised when an error occurs during parsing."""
+
+    pass
+
+
+class KeyMetadata:
+    """Class for keeping key metadata.
+
+    Attributes:
+        kpos (int): The position of the keyword in the parent string.
+        vpos (int): The position of the value in the parent string.
+        kstring (string): The keyword string as found in the flow string.
+        vstring (string): The value as found in the flow string.
+        end_del (bool): Whether the key has end delimiter.
+    """
+
+    def __init__(self, kpos, vpos, kstring, vstring, delim="", end_delim=""):
+        """Constructor"""
+        self.kpos = kpos
+        self.vpos = vpos
+        self.kstring = kstring
+        self.vstring = vstring
+        self.delim = delim
+        self.end_delim = end_delim
+
+    def __str__(self):
+        return "key: [{},{}), val:[{}, {})".format(
+            self.kpos,
+            self.kpos + len(self.kstring),
+            self.vpos,
+            self.vpos + len(self.vstring),
+        )
+
+    def __repr__(self):
+        return "%s('%s')" % (self.__class__.__name__, self)
+
+
+class KeyValue:
+    """Class for keeping key-value data
+
+    Attributes:
+        key (str): The key string.
+        value (any): The value data.
+        meta (KeyMetadata): The key metadata.
+    """
+
+    def __init__(self, key, value, meta=None):
+        """Constructor"""
+        self.key = key
+        self.value = value
+        self.meta = meta
+
+    def __str__(self):
+        return "{}: {} ({})".format(self.key, str(self.value), str(self.meta))
+
+    def __repr__(self):
+        return "%s('%s')" % (self.__class__.__name__, self)
+
+
+class KVDecoders:
+    """KVDecoders class is used by KVParser to select how to decoode the value
+    of a specific keyword.
+
+    A decoder is simply a function that accepts a value string
+    and returns the value objects to be stored.
+    The returned value may be of any type.
+
+    Decoders may return a KeyValue instance to indicate that the keyword should
+    also be modified to match the one provided in the returned KeyValue
+
+    The free_decoder, however, must return the key and value to be stored
+
+    Args:
+        decoders (dict): Optional; A dictionary of decoders indexed by keyword.
+        default (callable): Optional; A decoder used if a match is not found in
+            configured decoders. If not provided, the default behavior is to
+            try to decode the value into an integer and, if that fails,
+            just return the string as-is.
+        default_free (callable): Optional; The decoder used if a match is not
+            found in configured decoders and it's a free value (e.g:
+            a value without a key) Defaults to returning the free value as
+            keyword and "True" as value.
+            The callable must accept a string and return a key, value pair
+    """
+
+    def __init__(self, decoders=None, default=None, default_free=None):
+        self._decoders = decoders or dict()
+        self._default = default or decode_default
+        self._default_free = default_free or self._default_free_decoder
+
+    def decode(self, keyword, value_str):
+        """Decode a keyword and value.
+
+        Args:
+            keyword (str): The keyword whose value is to be decoded.
+            value_str (str): The value string.
+
+        Returns:
+            The key (str) and value(any) to be stored.
+        """
+
+        decoder = self._decoders.get(keyword)
+        if decoder:
+            result = decoder(value_str)
+            if isinstance(result, KeyValue):
+                keyword = result.key
+                value = result.value
+            else:
+                value = result
+
+            return keyword, value
+        else:
+            if value_str:
+                return keyword, self._default(value_str)
+            else:
+                return self._default_free(keyword)
+
+    @staticmethod
+    def _default_free_decoder(key):
+        """Default decoder for free kewords."""
+        return key, True
+
+
+delim_pattern = re.compile(r"(\(|=|:|,|\n|\r|\t|$)")
+parenthesys_pattern = re.compile(r"(\(|\))")
+end_pattern = re.compile(r"( |,|\n|\r|\t)")
+
+
+class KVParser:
+    """KVParser parses a string looking for key-value pairs.
+
+    Args:
+        decoders (KVDecoders): Optional; the KVDecoders instance to use.
+    """
+
+    def __init__(self, decoders=None):
+        """Constructor"""
+        self._decoders = decoders or KVDecoders()
+        self._keyval = list()
+
+    def keys(self):
+        return list(kv.key for kv in self._keyval)
+
+    def kv(self):
+        return self._keyval
+
+    def __iter__(self):
+        return iter(self._keyval)
+
+    def parse(self, string):
+        """Parse the key-value pairs in string.
+
+        Args:
+            string (str): the string to parse.
+
+        Raises:
+            ParseError if any parsing error occurs.
+        """
+        kpos = 0
+        while kpos < len(string) and string[kpos] != "\n":
+            # strip string
+            if string[kpos] == "," or string[kpos] == " ":
+                kpos += 1
+                continue
+
+            split_parts = delim_pattern.split(string[kpos:], 1)
+            # the delimiter should be included in the returned list
+            if len(split_parts) < 3:
+                break
+
+            keyword = split_parts[0]
+            delimiter = split_parts[1]
+            rest = split_parts[2]
+
+            value_str = ""
+            vpos = kpos + len(keyword) + 1
+            end_delimiter = ""
+
+            # Figure out the end of the value
+            # If the delimiter is ':' or '=', the end of the value is the end
+            # of the string or a ', '
+            if delimiter in ("=", ":"):
+                value_parts = end_pattern.split(rest, 1)
+                value_str = value_parts[0] if len(value_parts) == 3 else rest
+                next_kpos = vpos + len(value_str)
+
+            elif delimiter == "(":
+                # Find the next ')'
+                level = 1
+                index = 0
+                value_parts = parenthesys_pattern.split(rest)
+                for val in value_parts:
+                    if val == "(":
+                        level += 1
+                    elif val == ")":
+                        level -= 1
+                    index += len(val)
+                    if level == 0:
+                        break
+
+                if level != 0:
+                    raise ParseError(
+                        "Error parsing string {}: "
+                        "Failed to find matching ')' in {}".format(
+                            string, rest
+                        )
+                    )
+
+                value_str = rest[: index - 1]
+                next_kpos = vpos + len(value_str) + 1
+                end_delimiter = ")"
+
+                # Exceptionally, if after the () we find -> {}, do not treat
+                # the content of the parenthesis as the value, consider
+                # ({})->{} as the string value
+                if index < len(rest) - 2 and rest[index : index + 2] == "->":
+                    extra_val = rest[index + 2 :].split(",")[0]
+                    value_str = "({})->{}".format(value_str, extra_val)
+                    # remove the first "("
+                    vpos -= 1
+                    next_kpos = vpos + len(value_str)
+                    end_delimiter = ""
+
+            elif delimiter in (",", "\n", "\t", "\r", ""):
+                # key with no value
+                next_kpos = kpos + len(keyword)
+                vpos = -1
+
+            try:
+                key, val = self._decoders.decode(keyword, value_str)
+            except Exception as e:
+                raise ParseError(
+                    "Error parsing key-value ({}, {})".format(
+                        keyword, value_str
+                    )
+                ) from e
+
+            meta = KeyMetadata(
+                kpos=kpos,
+                vpos=vpos,
+                kstring=keyword,
+                vstring=value_str,
+                delim=delimiter,
+                end_delim=end_delimiter,
+            )
+
+            self._keyval.append(KeyValue(key, val, meta))
+
+            kpos = next_kpos
+
+
+def decode_nested_kv(decoders, value):
+    """A key-value decoder that extracts nested key-value pairs and returns
+    them in a dictionary
+
+    Args:
+        decoders (KVDecoders): the KVDecoders to use.
+        value (str): the value string to decode.
+    """
+    if not value:
+        # Mark as flag
+        return True
+
+    parser = KVParser(decoders)
+    parser.parse(value)
+    return {kv.key: kv.value for kv in parser.kv()}
+
+
+def nested_kv_decoder(decoders=None):
+    """Helper function that creates a nested kv decoder with given
+    KVDecoders"""
+    return functools.partial(decode_nested_kv, decoders)
diff --git a/python/setup.py b/python/setup.py
index cfe01763f..0e6b0ea39 100644
--- a/python/setup.py
+++ b/python/setup.py
@@ -71,7 +71,7 @@  setup_args = dict(
     author='Open vSwitch',
     author_email='dev@openvswitch.org',
     packages=['ovs', 'ovs.compat', 'ovs.compat.sortedcontainers',
-              'ovs.db', 'ovs.unixctl'],
+              'ovs.db', 'ovs.unixctl', 'ovs.flows'],
     keywords=['openvswitch', 'ovs', 'OVSDB'],
     license='Apache 2.0',
     classifiers=[