[ovs-dev,RFC] 1/1 Add multi-column index support for the Python IDL
diff mbox series

Message ID 1523304183-19472-2-git-send-email-twilson@redhat.com
State RFC
Headers show
Series
  • [ovs-dev,RFC] 1/1 Add multi-column index support for the Python IDL
Related show

Commit Message

Terry Wilson April 9, 2018, 8:03 p.m. UTC
From: Terry Wilson <twilson@redhat.com>

This adds multi-column index support for the Python IDL that is
similar to the feature in the C IDL.

Signed-off-by: Terry Wilson <twilson@redhat.com>
---
 python/automake.mk            |   1 +
 python/ovs/db/custom_index.py | 151 ++++++++++++++++++++++++++++++++++++++++++
 python/ovs/db/idl.py          |  55 ++++++++++++---
 tests/test-ovsdb.py           |   7 +-
 4 files changed, 199 insertions(+), 15 deletions(-)
 create mode 100644 python/ovs/db/custom_index.py

Comments

Ben Pfaff April 10, 2018, 10:20 p.m. UTC | #1
On Mon, Apr 09, 2018 at 03:03:03PM -0500, twilson@redhat.com wrote:
> From: Terry Wilson <twilson@redhat.com>
> 
> This adds multi-column index support for the Python IDL that is
> similar to the feature in the C IDL.
> 
> Signed-off-by: Terry Wilson <twilson@redhat.com>

Thanks for working on this.

I think that this adds a new Python module dependency on
"sortedcontainers".  I didn't have it installed for python2 or
python3--at least on Debian, it is not installed by default--so many
tests failed.  I guess that we should at least document that, and
probably it means that there should be new dependencies.

(When I installed the Python module, everything was fine.)

Thanks,

Ben.
Timothy Redaelli April 11, 2018, 5:23 p.m. UTC | #2
On Tue, 10 Apr 2018 15:20:54 -0700
Ben Pfaff <blp@ovn.org> wrote:

> On Mon, Apr 09, 2018 at 03:03:03PM -0500, twilson@redhat.com wrote:
> > From: Terry Wilson <twilson@redhat.com>
> > 
> > This adds multi-column index support for the Python IDL that is
> > similar to the feature in the C IDL.
> > 
> > Signed-off-by: Terry Wilson <twilson@redhat.com>  
> 
> Thanks for working on this.
> 
> I think that this adds a new Python module dependency on
> "sortedcontainers".  I didn't have it installed for python2 or
> python3--at least on Debian, it is not installed by default--so many
> tests failed.  I guess that we should at least document that, and
> probably it means that there should be new dependencies.
> 
> (When I installed the Python module, everything was fine.)

Hi,
unfortunately "python-sortedcontainers" packages are not available on
RHEL repositories.

This means, if this commit is approved, it will not be possible to build
OVS on RHEL anymore and, if we skip the tests, it will not be possible
to use the python OVS extensions.

so I suggest, if possible, to find and alternative to
"sortedcontainers".

Thank you

> Thanks,
> 
> Ben.
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Flavio Leitner April 11, 2018, 5:52 p.m. UTC | #3
On Wed, Apr 11, 2018 at 07:23:07PM +0200, Timothy Redaelli wrote:
> On Tue, 10 Apr 2018 15:20:54 -0700
> Ben Pfaff <blp@ovn.org> wrote:
> 
> > On Mon, Apr 09, 2018 at 03:03:03PM -0500, twilson@redhat.com wrote:
> > > From: Terry Wilson <twilson@redhat.com>
> > > 
> > > This adds multi-column index support for the Python IDL that is
> > > similar to the feature in the C IDL.
> > > 
> > > Signed-off-by: Terry Wilson <twilson@redhat.com>  
> > 
> > Thanks for working on this.
> > 
> > I think that this adds a new Python module dependency on
> > "sortedcontainers".  I didn't have it installed for python2 or
> > python3--at least on Debian, it is not installed by default--so many
> > tests failed.  I guess that we should at least document that, and
> > probably it means that there should be new dependencies.
> > 
> > (When I installed the Python module, everything was fine.)
> 
> Hi,
> unfortunately "python-sortedcontainers" packages are not available on
> RHEL repositories.
> 
> This means, if this commit is approved, it will not be possible to build
> OVS on RHEL anymore and, if we skip the tests, it will not be possible
> to use the python OVS extensions.
> 
> so I suggest, if possible, to find and alternative to
> "sortedcontainers".

Hi Terry,

Not trying to be the obstructionist here, but as you pointed out in
the cover letter and Tim highlighted here, that dependency isn't
packaged in neither RHEL (7 or 8) or EPEL, which means it breaks OVS
backwards compatibility. I suspect others not-so-recent distros might
have the same problem.  For instance, only F27 includes that IIRC.

Do you have a plan to address that issue?
Ben Pfaff April 11, 2018, 5:58 p.m. UTC | #4
On Wed, Apr 11, 2018 at 02:52:51PM -0300, Flavio Leitner wrote:
> On Wed, Apr 11, 2018 at 07:23:07PM +0200, Timothy Redaelli wrote:
> > On Tue, 10 Apr 2018 15:20:54 -0700
> > Ben Pfaff <blp@ovn.org> wrote:
> > 
> > > On Mon, Apr 09, 2018 at 03:03:03PM -0500, twilson@redhat.com wrote:
> > > > From: Terry Wilson <twilson@redhat.com>
> > > > 
> > > > This adds multi-column index support for the Python IDL that is
> > > > similar to the feature in the C IDL.
> > > > 
> > > > Signed-off-by: Terry Wilson <twilson@redhat.com>  
> > > 
> > > Thanks for working on this.
> > > 
> > > I think that this adds a new Python module dependency on
> > > "sortedcontainers".  I didn't have it installed for python2 or
> > > python3--at least on Debian, it is not installed by default--so many
> > > tests failed.  I guess that we should at least document that, and
> > > probably it means that there should be new dependencies.
> > > 
> > > (When I installed the Python module, everything was fine.)
> > 
> > Hi,
> > unfortunately "python-sortedcontainers" packages are not available on
> > RHEL repositories.
> > 
> > This means, if this commit is approved, it will not be possible to build
> > OVS on RHEL anymore and, if we skip the tests, it will not be possible
> > to use the python OVS extensions.
> > 
> > so I suggest, if possible, to find and alternative to
> > "sortedcontainers".
> 
> Hi Terry,
> 
> Not trying to be the obstructionist here, but as you pointed out in
> the cover letter and Tim highlighted here, that dependency isn't
> packaged in neither RHEL (7 or 8) or EPEL, which means it breaks OVS
> backwards compatibility. I suspect others not-so-recent distros might
> have the same problem.  For instance, only F27 includes that IIRC.
> 
> Do you have a plan to address that issue?

We used to install some Python compatibility modules on some OSes.  See
commit e23812fc60af (Increase prerequisite from Python 2.4 to Python
2.7.) to see how we did it before (which was pretty simple).  The same
thing might work here.
Terry Wilson April 11, 2018, 8:54 p.m. UTC | #5
On Wed, Apr 11, 2018 at 12:52 PM, Flavio Leitner <fbl@sysclose.org> wrote:
> On Wed, Apr 11, 2018 at 07:23:07PM +0200, Timothy Redaelli wrote:
>> On Tue, 10 Apr 2018 15:20:54 -0700
>> Ben Pfaff <blp@ovn.org> wrote:
>>
>> > On Mon, Apr 09, 2018 at 03:03:03PM -0500, twilson@redhat.com wrote:
>> > > From: Terry Wilson <twilson@redhat.com>
>> > >
>> > > This adds multi-column index support for the Python IDL that is
>> > > similar to the feature in the C IDL.
>> > >
>> > > Signed-off-by: Terry Wilson <twilson@redhat.com>
>> >
>> > Thanks for working on this.
>> >
>> > I think that this adds a new Python module dependency on
>> > "sortedcontainers".  I didn't have it installed for python2 or
>> > python3--at least on Debian, it is not installed by default--so many
>> > tests failed.  I guess that we should at least document that, and
>> > probably it means that there should be new dependencies.
>> >
>> > (When I installed the Python module, everything was fine.)
>>
>> Hi,
>> unfortunately "python-sortedcontainers" packages are not available on
>> RHEL repositories.
>>
>> This means, if this commit is approved, it will not be possible to build
>> OVS on RHEL anymore and, if we skip the tests, it will not be possible
>> to use the python OVS extensions.
>>
>> so I suggest, if possible, to find and alternative to
>> "sortedcontainers".
>
> Hi Terry,
>
> Not trying to be the obstructionist here, but as you pointed out in
> the cover letter and Tim highlighted here, that dependency isn't
> packaged in neither RHEL (7 or 8) or EPEL, which means it breaks OVS
> backwards compatibility. I suspect others not-so-recent distros might
> have the same problem.  For instance, only F27 includes that IIRC.
>
> Do you have a plan to address that issue?

My plan, if people are in general OK with adding the dependency at all
(and the architecture of the RFC patch), is 1) Take the Fedora
packages and try to get them in RHEL and 2) make only the feature
dependent on sortedcontainers with some simple try/except magic.
Another option would be to just take the part of sortedcontainers
(which is Apache 2.0) that we need, drop it in the tree, and have the
exception on import actually import from the local copy. In general I
don't like copying projects into other projects, though.
sortedcontainers is a very mature, fast, pure-python library. It would
be handy to be able to use it even in other parts of the library. For
instance, we call sorted() a lot in data.py for doing things like
to_json(), to_python(), to_string(), etc. Sorting on each call instead
of storing in a sorted object in the first place. And those methods
are called *a lot*.

Terry
Flavio Leitner April 12, 2018, 12:55 p.m. UTC | #6
On Wed, Apr 11, 2018 at 03:54:24PM -0500, Terry Wilson wrote:
> On Wed, Apr 11, 2018 at 12:52 PM, Flavio Leitner <fbl@sysclose.org> wrote:
> > On Wed, Apr 11, 2018 at 07:23:07PM +0200, Timothy Redaelli wrote:
> >> On Tue, 10 Apr 2018 15:20:54 -0700
> >> Ben Pfaff <blp@ovn.org> wrote:
> >>
> >> > On Mon, Apr 09, 2018 at 03:03:03PM -0500, twilson@redhat.com wrote:
> >> > > From: Terry Wilson <twilson@redhat.com>
> >> > >
> >> > > This adds multi-column index support for the Python IDL that is
> >> > > similar to the feature in the C IDL.
> >> > >
> >> > > Signed-off-by: Terry Wilson <twilson@redhat.com>
> >> >
> >> > Thanks for working on this.
> >> >
> >> > I think that this adds a new Python module dependency on
> >> > "sortedcontainers".  I didn't have it installed for python2 or
> >> > python3--at least on Debian, it is not installed by default--so many
> >> > tests failed.  I guess that we should at least document that, and
> >> > probably it means that there should be new dependencies.
> >> >
> >> > (When I installed the Python module, everything was fine.)
> >>
> >> Hi,
> >> unfortunately "python-sortedcontainers" packages are not available on
> >> RHEL repositories.
> >>
> >> This means, if this commit is approved, it will not be possible to build
> >> OVS on RHEL anymore and, if we skip the tests, it will not be possible
> >> to use the python OVS extensions.
> >>
> >> so I suggest, if possible, to find and alternative to
> >> "sortedcontainers".
> >
> > Hi Terry,
> >
> > Not trying to be the obstructionist here, but as you pointed out in
> > the cover letter and Tim highlighted here, that dependency isn't
> > packaged in neither RHEL (7 or 8) or EPEL, which means it breaks OVS
> > backwards compatibility. I suspect others not-so-recent distros might
> > have the same problem.  For instance, only F27 includes that IIRC.
> >
> > Do you have a plan to address that issue?
> 
> My plan, if people are in general OK with adding the dependency at all
> (and the architecture of the RFC patch), is 1) Take the Fedora
> packages and try to get them in RHEL and 2) make only the feature
> dependent on sortedcontainers with some simple try/except magic.
> Another option would be to just take the part of sortedcontainers
> (which is Apache 2.0) that we need, drop it in the tree, and have the
> exception on import actually import from the local copy. In general I
> don't like copying projects into other projects, though.
> sortedcontainers is a very mature, fast, pure-python library. It would
> be handy to be able to use it even in other parts of the library. For
> instance, we call sorted() a lot in data.py for doing things like
> to_json(), to_python(), to_string(), etc. Sorting on each call instead
> of storing in a sorted object in the first place. And those methods
> are called *a lot*.

The problem on getting the it packaged on those distributions is that
they won't update older versions.  It is today in F27, but it will be
very hard to make it available on F26, for instance.

I am not a big fan of copying either, but it will depend on our options.
If you use "try/except", for example, then you don't get the improvements
until that package is available. Does that work for you?

Patch
diff mbox series

diff --git a/python/automake.mk b/python/automake.mk
index 9b5d3d8..583a4e9 100644
--- a/python/automake.mk
+++ b/python/automake.mk
@@ -13,6 +13,7 @@  ovs_pyfiles = \
 	python/ovs/daemon.py \
 	python/ovs/fcntl_win.py \
 	python/ovs/db/__init__.py \
+	python/ovs/db/custom_index.py \
 	python/ovs/db/data.py \
 	python/ovs/db/error.py \
 	python/ovs/db/idl.py \
diff --git a/python/ovs/db/custom_index.py b/python/ovs/db/custom_index.py
new file mode 100644
index 0000000..878cf37
--- /dev/null
+++ b/python/ovs/db/custom_index.py
@@ -0,0 +1,151 @@ 
+import collections
+import functools
+import operator
+try:
+    from UserDict import IterableUserDict as DictBase
+except ImportError:
+    from collections import UserDict as DictBase
+
+import sortedcontainers
+
+from ovs.db import data
+
+OVSDB_INDEX_ASC = "ASC"
+OVSDB_INDEX_DESC = "DESC"
+ColumnIndex = collections.namedtuple('ColumnIndex',
+                                     ['column', 'direction', 'key'])
+
+
+class MultiColumnIndex(object):
+    def __init__(self, name):
+        self.name = name
+        self.columns = []
+        self.clear()
+
+    def __repr__(self):
+        return "{}(name={})".format(self.__class__.__name__, self.name)
+
+    def __str__(self):
+        return repr(self) + " columns={} values={}".format(
+            self.columns, [str(v) for v in self.values])
+
+    def add_column(self, column, direction=OVSDB_INDEX_ASC, key=None):
+        self.columns.append(ColumnIndex(column, direction,
+                             key or operator.attrgetter(column)))
+
+    def add_columns(self, *columns):
+        self.columns.extend(ColumnIndex(col, OVSDB_INDEX_ASC,
+                                        operator.attrgetter(col))
+                            for col in columns)
+
+    def _cmp(self, a, b):
+        for col, direction, key in self.columns:
+            aval, bval = key(a), key(b)
+            if aval == bval:
+                continue
+            result = (aval > bval) - (aval < bval)
+            return result if direction == OVSDB_INDEX_ASC else -result
+        return 0
+
+    def index_entry_from_row(self, row):
+        return row._table.rows.IndexEntry(
+            uuid=row.uuid,
+            **{c.column: getattr(row, c.column) for c in self.columns})
+
+    def add(self, row):
+        if not all(hasattr(row, col.column) for col in self.columns):
+            # This is a new row, but it hasn't had the necessary columns set
+            # We'll add it later
+            return
+        self.values.add(self.index_entry_from_row(row))
+
+    def remove(self, row):
+        self.values.remove(self.index_entry_from_row(row))
+
+    def clear(self):
+        self.values = sortedcontainers.SortedListWithKey(
+            key=functools.cmp_to_key(self._cmp))
+
+    def irange(self, start, end):
+        return iter(r._table.rows[r.uuid]
+                    for r in self.values.irange(start, end))
+
+    def __iter__(self):
+        return iter(r._table.rows[r.uuid] for r in self.values)
+
+
+class IndexedRows(DictBase, object):
+    def __init__(self, table, *args, **kwargs):
+        super(IndexedRows, self).__init__(*args, **kwargs)
+        self.table = table
+        self.indexes = {}
+        self.IndexEntry = IndexEntryClass(table)
+
+    def index_create(self, name):
+        if name in self.indexes:
+            raise ValueError("An index named {} already exists".format(name))
+        index = self.indexes[name] = MultiColumnIndex(name)
+        return index
+
+    def __setitem__(self, key, item):
+        self.data[key] = item
+        for index in self.indexes.values():
+            index.add(item)
+
+    def __delitem__(self, key):
+        val = self.data[key]
+        del self.data[key]
+        for index in self.indexes.values():
+            index.remove(val)
+
+    def clear(self):
+        self.data.clear()
+        for index in self.indexes.values():
+            index.clear()
+
+    # Nothing uses the methods below, though they'd be easy to implement
+    def update(self, dict=None, **kwargs):
+        raise NotImplementedError()
+
+    def setdefault(self, key, failobj=None):
+        raise NotImplementedError()
+
+    def pop(self, key, *args):
+        raise NotImplementedError()
+
+    def popitem(self):
+        raise NotImplementedError()
+
+    @classmethod
+    def fromkeys(cls, iterable, value=None):
+        raise NotImplementedError()
+
+
+def IndexEntryClass(table):
+    """Create a class used represent Rows in indexes
+
+    ovs.db.idl.Row, being inherently tied to transaction processing and being
+    initialized with dicts of Datums, is not really useable as an object to
+    pass to and store in indexes. This method will create a class named after
+    the table's name that is initialized with that Table Row's default values.
+    For example:
+
+    Port = IndexEntryClass(idl.tables['Port'])
+
+    will create a Port class. This class can then be used to search custom
+    indexes. For example:
+
+    for port in idx.iranage(Port(name="test1"), Port(name="test9")):
+       ...
+    """
+
+    def defaults_uuid_to_row(atom, base):
+        return atom.value
+
+    columns = ['uuid'] + list(table.columns.keys())
+    cls = collections.namedtuple(table.name, columns)
+    cls._table = table
+    cls.__new__.__defaults__ = (None,) + tuple(
+        data.Datum.default(c.type).to_python(defaults_uuid_to_row)
+        for c in table.columns.values())
+    return cls
diff --git a/python/ovs/db/idl.py b/python/ovs/db/idl.py
index 5a4d129..564977c 100644
--- a/python/ovs/db/idl.py
+++ b/python/ovs/db/idl.py
@@ -22,6 +22,7 @@  import ovs.jsonrpc
 import ovs.ovsuuid
 import ovs.poller
 import ovs.vlog
+from ovs.db import custom_index
 from ovs.db import error
 
 import six
@@ -148,11 +149,23 @@  class Idl(object):
                 if not hasattr(column, 'alert'):
                     column.alert = True
             table.need_table = False
-            table.rows = {}
+            table.rows = custom_index.IndexedRows(table)
             table.idl = self
             table.condition = [True]
             table.cond_changed = False
 
+    def index_create(self, table, name):
+        """Create a named multi-column index on a table"""
+        return self.tables[table].rows.index_create(name)
+
+    def index_irange(self, table, name, start, end):
+        """Return items in a named index between start/end inclusive"""
+        return self.tables[table].rows.indexes[name].irange(start, end)
+
+    def index_equal(self, table, name, value):
+        """Return items in a named index matching a value"""
+        return self.tables[table].rows.indexes[name].irange(value, value)
+
     def close(self):
         """Closes the connection to the database.  The IDL will no longer
         update."""
@@ -359,7 +372,7 @@  class Idl(object):
         for table in six.itervalues(self.tables):
             if table.rows:
                 changed = True
-                table.rows = {}
+                table.rows = custom_index.IndexedRows(table)
 
         if changed:
             self.change_seqno += 1
@@ -511,8 +524,9 @@  class Idl(object):
             else:
                 row_update = row_update['initial']
             self.__add_default(table, row_update)
-            if self.__row_update(table, row, row_update):
-                changed = True
+            changed = self.__row_update(table, row, row_update)
+            table.rows[uuid] = row
+            if changed:
                 self.notify(ROW_CREATE, row)
         elif "modify" in row_update:
             if not row:
@@ -542,15 +556,19 @@  class Idl(object):
                           % (uuid, table.name))
         elif not old:
             # Insert row.
+            op = ROW_CREATE
             if not row:
                 row = self.__create_row(table, uuid)
                 changed = True
             else:
                 # XXX rate-limit
+                op = ROW_UPDATE
                 vlog.warn("cannot add existing row %s to table %s"
                           % (uuid, table.name))
-            if self.__row_update(table, row, new):
-                changed = True
+            changed |= self.__row_update(table, row, new)
+            if op == ROW_CREATE:
+                table.rows[uuid] = row
+            if changed:
                 self.notify(ROW_CREATE, row)
         else:
             op = ROW_UPDATE
@@ -561,8 +579,10 @@  class Idl(object):
                 # XXX rate-limit
                 vlog.warn("cannot modify missing row %s in table %s"
                           % (uuid, table.name))
-            if self.__row_update(table, row, new):
-                changed = True
+            changed |= self.__row_update(table, row, new)
+            if op == ROW_CREATE:
+                table.rows[uuid] = row
+            if changed:
                 self.notify(op, row, Row.from_json(self, table, uuid, old))
         return changed
 
@@ -638,8 +658,7 @@  class Idl(object):
         data = {}
         for column in six.itervalues(table.columns):
             data[column.name] = ovs.db.data.Datum.default(column.type)
-        row = table.rows[uuid] = Row(self, table, uuid, data)
-        return row
+        return Row(self, table, uuid, data)
 
     def __error(self):
         self._session.force_reconnect()
@@ -844,7 +863,17 @@  class Row(object):
             vlog.err("attempting to write bad value to column %s (%s)"
                      % (column_name, e))
             return
+        # Remove prior version of the Row from the index if it has the indexed
+        # column set, and the column changing is an indexed column
+        if hasattr(self, column_name):
+            for idx in self._table.rows.indexes.values():
+                if column_name in (c.column for c in idx.columns):
+                    idx.remove(self)
         self._idl.txn._write(self, column, datum)
+        for idx in self._table.rows.indexes.values():
+            # Only update the index if indexed columns change
+            if column_name in (c.column for c in idx.columns):
+                idx.add(self)
 
     def addvalue(self, column_name, key):
         self._idl.txn._txn_rows[self.uuid] = self
@@ -972,8 +1001,8 @@  class Row(object):
             del self._idl.txn._txn_rows[self.uuid]
         else:
             self._idl.txn._txn_rows[self.uuid] = self
-        self.__dict__["_changes"] = None
         del self._table.rows[self.uuid]
+        self.__dict__["_changes"] = None
 
     def fetch(self, column_name):
         self._idl.txn._fetch(self, column_name)
@@ -1145,6 +1174,10 @@  class Transaction(object):
 
         for row in six.itervalues(self._txn_rows):
             if row._changes is None:
+                # If we add the deleted row back to rows with _changes == None
+                # then __getattr__ will not work for the indexes
+                row.__dict__["_changes"] = {}
+                row.__dict__["_mutations"] = {}
                 row._table.rows[row.uuid] = row
             elif row._data is None:
                 del row._table.rows[row.uuid]
diff --git a/tests/test-ovsdb.py b/tests/test-ovsdb.py
index fc42a2d..8aca35b 100644
--- a/tests/test-ovsdb.py
+++ b/tests/test-ovsdb.py
@@ -289,10 +289,7 @@  def idltest_find_simple2(idl, i):
 
 
 def idltest_find_simple3(idl, i):
-    for row in six.itervalues(idl.tables["simple3"].rows):
-        if row.name == i:
-            return row
-    return None
+    return next(idl.index_equal("simple3", "simple3_by_name", i), None)
 
 
 def idl_set(idl, commands, step):
@@ -579,6 +576,8 @@  def do_idl(schema_file, remote, *commands):
     else:
         schema_helper.register_all()
     idl = ovs.db.idl.Idl(remote, schema_helper)
+    if "simple3" in idl.tables:
+        idl.index_create("simple3", "simple3_by_name")
 
     if commands:
         error, stream = ovs.stream.Stream.open_block(