[ovs-dev,v4] ovn-sb.ovsschema: Avoid duplicated IPs in Encap table.

Message ID 1542331307-50850-1-git-send-email-hzhou8@ebay.com
State Changes Requested

Commit Message

Han Zhou Nov. 16, 2018, 1:21 a.m. UTC
From: Han Zhou <hzhou8@ebay.com>

Today, a new chassis can be added even when the Encap table already
contains an old chassis with the same IP. Allowing this causes
problems:

1. The new chassis cannot work, because none of the other chassis
   can create a tunnel to it: its IP conflicts with the existing
   tunnel to the old chassis.

2. All the other chassis continuously retry creating the tunnel
   and complain about the error.

So, instead of hiding the problem, it is better to expose it when
the second chassis with the duplicated IP is added. This patch
enforces that through the ovsdb schema.

Signed-off-by: Han Zhou <hzhou8@ebay.com>
---
 Documentation/intro/install/ovn-upgrades.rst | 28 ++++++++++++++++++++++++++++
 NEWS                                         |  5 +++++
 ovn/ovn-sb.ovsschema                         |  7 ++++---
 3 files changed, 37 insertions(+), 3 deletions(-)
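
As a rough illustration of what the new [type, ip] index enforces, the sketch below groups Encap rows by (type, ip) and reports any pair used by more than one chassis. The rows are hypothetical sample data and `find_duplicate_encaps` is an illustrative helper, not part of OVN; in practice the rows would have to be read from the southbound DB.

```python
from collections import defaultdict


def find_duplicate_encaps(rows):
    """Return {(type, ip): [chassis_name, ...]} for pairs used more than once."""
    by_key = defaultdict(list)
    for row in rows:
        by_key[(row["type"], row["ip"])].append(row["chassis_name"])
    # Keep only the (type, ip) pairs that would violate the index.
    return {key: names for key, names in by_key.items() if len(names) > 1}


# Hypothetical Encap rows; column names follow the schema in this patch.
rows = [
    {"type": "geneve", "ip": "10.0.0.1", "chassis_name": "hv1"},
    {"type": "geneve", "ip": "10.0.0.1", "chassis_name": "hv2"},
    {"type": "vxlan", "ip": "10.0.0.1", "chassis_name": "hv1"},
    {"type": "geneve", "ip": "10.0.0.2", "chassis_name": "hv3"},
]

duplicates = find_duplicate_encaps(rows)
for (encap_type, ip), names in duplicates.items():
    print(f"{encap_type} {ip}: {', '.join(names)}")
```

Note that the same IP with a different tunnel type (the vxlan row here) is not reported, since the index covers the pair rather than the IP alone.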

Comments

Ben Pfaff Dec. 12, 2018, 8:05 p.m. UTC | #1
On Thu, Nov 15, 2018 at 05:21:47PM -0800, Han Zhou wrote:
> From: Han Zhou <hzhou8@ebay.com>
> 
> When adding a new chassis, if there is an old chassis with same IP
> existed in Encap table, it is allowed to be added today. However,
> allowing it to be added results in problems:
> 
> 1. The new chassis cannot work because none of the other chassises
>    are able to create tunnel to it, because of the IP confliction
>    with already existed tunnel to the old chassis.
> 
> 2. All the other chassises will continuously retry creating the tunnel
>    and complaining about the error.
> 
> So, instead of hiding the problem, it is better to expose it while
> trying to add the second chassis with duplicated IP. This patch
> ensures it from the ovsdb schema.
> 
> Signed-off-by: Han Zhou <hzhou8@ebay.com>

This seems reasonable, but should we change the schema version number
from 1.17.0 to 2.0.0 because of the incompatibility?

Thanks,

Ben.
Han Zhou Dec. 13, 2018, 6:22 a.m. UTC | #2
On Wed, Dec 12, 2018 at 12:05 PM Ben Pfaff <blp@ovn.org> wrote:
>
> On Thu, Nov 15, 2018 at 05:21:47PM -0800, Han Zhou wrote:
> > From: Han Zhou <hzhou8@ebay.com>
> >
> > When adding a new chassis, if there is an old chassis with same IP
> > existed in Encap table, it is allowed to be added today. However,
> > allowing it to be added results in problems:
> >
> > 1. The new chassis cannot work because none of the other chassises
> >    are able to create tunnel to it, because of the IP confliction
> >    with already existed tunnel to the old chassis.
> >
> > 2. All the other chassises will continuously retry creating the tunnel
> >    and complaining about the error.
> >
> > So, instead of hiding the problem, it is better to expose it while
> > trying to add the second chassis with duplicated IP. This patch
> > ensures it from the ovsdb schema.
> >
> > Signed-off-by: Han Zhou <hzhou8@ebay.com>
>
> This seems reasonable, but should we change the schema version number
> from 1.17.0 to 2.0.0 because of the incompatibility?
>
Hmm, it is potentially incompatible if there is *dirty* data already in
the old system. However, it is not like changes that are obviously
incompatible, such as deleting a column.
In this case I am not sure whether the major version should be increased.
Is there a guideline for this?

Thanks,
Han
Han Zhou Dec. 18, 2018, 10:24 p.m. UTC | #3
On Wed, Dec 12, 2018 at 10:22 PM Han Zhou <zhouhan@gmail.com> wrote:
>
>
>
> On Wed, Dec 12, 2018 at 12:05 PM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Thu, Nov 15, 2018 at 05:21:47PM -0800, Han Zhou wrote:
> > > From: Han Zhou <hzhou8@ebay.com>
> > >
> > > When adding a new chassis, if there is an old chassis with same IP
> > > existed in Encap table, it is allowed to be added today. However,
> > > allowing it to be added results in problems:
> > >
> > > 1. The new chassis cannot work because none of the other chassises
> > >    are able to create tunnel to it, because of the IP confliction
> > >    with already existed tunnel to the old chassis.
> > >
> > > 2. All the other chassises will continuously retry creating the tunnel
> > >    and complaining about the error.
> > >
> > > So, instead of hiding the problem, it is better to expose it while
> > > trying to add the second chassis with duplicated IP. This patch
> > > ensures it from the ovsdb schema.
> > >
> > > Signed-off-by: Han Zhou <hzhou8@ebay.com>
> >
> > This seems reasonable, but should we change the schema version number
> > from 1.17.0 to 2.0.0 because of the incompatibility?
> >
> Hmm, it is potentially incompatible, if there is *dirty* data already in
> the old system. However it is not like changes that is obviously
> incompatible such as deleting a column.
> In this case I am not sure if the major version should be increased or
> not. Is there a guideline for this?
>
> Thanks,
> Han
>

It doesn't hurt to update the number, so I sent v5:
https://mail.openvswitch.org/pipermail/ovs-dev/2018-December/354633.html
Ben Pfaff Dec. 27, 2018, 7:44 p.m. UTC | #4
On Wed, Dec 12, 2018 at 10:22:53PM -0800, Han Zhou wrote:
> On Wed, Dec 12, 2018 at 12:05 PM Ben Pfaff <blp@ovn.org> wrote:
> >
> > On Thu, Nov 15, 2018 at 05:21:47PM -0800, Han Zhou wrote:
> > > From: Han Zhou <hzhou8@ebay.com>
> > >
> > > When adding a new chassis, if there is an old chassis with same IP
> > > existed in Encap table, it is allowed to be added today. However,
> > > allowing it to be added results in problems:
> > >
> > > 1. The new chassis cannot work because none of the other chassises
> > >    are able to create tunnel to it, because of the IP confliction
> > >    with already existed tunnel to the old chassis.
> > >
> > > 2. All the other chassises will continuously retry creating the tunnel
> > >    and complaining about the error.
> > >
> > > So, instead of hiding the problem, it is better to expose it while
> > > trying to add the second chassis with duplicated IP. This patch
> > > ensures it from the ovsdb schema.
> > >
> > > Signed-off-by: Han Zhou <hzhou8@ebay.com>
> >
> > This seems reasonable, but should we change the schema version number
> > from 1.17.0 to 2.0.0 because of the incompatibility?
> >
> Hmm, it is potentially incompatible, if there is *dirty* data already in
> the old system. However it is not like changes that is obviously
> incompatible such as deleting a column.
> In this case I am not sure if the major version should be increased or not.
> Is there a guideline for this?

I don't know of a reason not to increment the major version.  It is a
useful way to alert people to a potential incompatibility, and
there is no obvious downside.  I'd prefer to increment it.
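
The versioning question discussed above can be sketched as a simple check, assuming the x.y.z convention in which a change in the first component signals a potential incompatibility. The helper names are hypothetical and only illustrate the comparison.

```python
def parse_version(version):
    """Split an "x.y.z" schema version string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def signals_incompatibility(old, new):
    """True when the major component increased between the two versions."""
    return parse_version(new)[0] > parse_version(old)[0]


# v4 of this patch bumps only the minor number; the review suggests the major.
print(signals_incompatibility("1.17.0", "1.18.0"))
print(signals_incompatibility("1.17.0", "2.0.0"))
```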

Patch

diff --git a/Documentation/intro/install/ovn-upgrades.rst b/Documentation/intro/install/ovn-upgrades.rst
index 0b76c96..3e6cd98 100644
--- a/Documentation/intro/install/ovn-upgrades.rst
+++ b/Documentation/intro/install/ovn-upgrades.rst
@@ -75,6 +75,34 @@  or if you're using a Linux distribution with systemd::
 
     $ sudo systemctl restart ovn-northd
 
+Schema Change
+^^^^^^^^^^^^^
+
+During a database upgrade, if the schema has changed, the DB file is converted
+to the new schema automatically, provided that the change is backward
+compatible.  OVN tries its best to keep the DB schemas backward compatible.
+
+However, there can be situations where an incompatible change is reasonable.
+An example is adding constraints to a table to ensure correctness.  If data
+that violates the new constraints has already been added somehow, the DB
+upgrade will fail.  In this case, the user should manually correct the data
+using ovn-nbctl (for the northbound DB) or ovn-sbctl (for the southbound DB),
+and then upgrade again following the previous steps.  Below is a list of known
+incompatible schema changes and how to fix the errors they cause.
+
+#. Release 2.11: an index [type, ip] was added to the Encap table of the
+   southbound DB to prevent duplicated IPs from being used for the same tunnel
+   type.  If duplicated data has already been added (e.g. due to improper
+   chassis management), a convenient fix is to find the chassis that is using
+   the IP with the command::
+
+    $ ovn-sbctl show
+
+   Then delete the chassis with the command::
+
+    $ ovn-sbctl chassis-del <chassis>
+
+
 Upgrading OVN Integration
 -------------------------
 
diff --git a/NEWS b/NEWS
index 02402d1..1519b4d 100644
--- a/NEWS
+++ b/NEWS
@@ -7,6 +7,11 @@  Post-v2.10.0
    - The environment variable OVS_CTL_TIMEOUT, if set, is now used
      as the default timeout for control utilities.
    - ovn:
+     * OVN-SB schema changed: duplicated IP with same Encapsulation type
+       is not allowed any more.  Please refer to
+       Documentation/intro/install/ovn-upgrades.rst for the instructions
+       in case there are problems encountered when upgrading from an earlier
+       version.
      * New support for IPSEC encrypted tunnels between hypervisors.
      * ovn-ctl: allow passing user:group ids to the OVN daemons.
    - DPDK:
diff --git a/ovn/ovn-sb.ovsschema b/ovn/ovn-sb.ovsschema
index 5b9537f..afa9859 100644
--- a/ovn/ovn-sb.ovsschema
+++ b/ovn/ovn-sb.ovsschema
@@ -1,7 +1,7 @@ 
 {
     "name": "OVN_Southbound",
-    "version": "1.17.0",
-    "cksum": "3217981733 15045",
+    "version": "1.18.0",
+    "cksum": "910560265 15086",
     "tables": {
         "SB_Global": {
             "columns": {
@@ -50,7 +50,8 @@ 
                                      "min": 0,
                                      "max": "unlimited"}},
                 "ip": {"type": "string"},
-                "chassis_name": {"type": "string"}}},
+                "chassis_name": {"type": "string"}},
+            "indexes": [["type", "ip"]]},
         "Address_Set": {
             "columns": {
                 "name": {"type": "string"},