From patchwork Tue Sep 1 23:56:21 2015
From: Ben Pfaff <blp@nicira.com>
To: Pravin Shelar, Daniele di Proietto
Cc: dev@openvswitch.org
Date: Tue, 1 Sep 2015 16:56:21 -0700
Message-ID: <20150901235621.GA29879@nicira.com>
Subject: [ovs-dev] native tunneling bug?
I think I've come across a bug in OVS native tunneling, or at any rate an
important difference between Linux kernel and OVS native tunneling.

In Linux kernel tunneling, a tunnel packet received by the kernel first
passes through the kernel IP stack.  Among other things, the IP stack drops
packets that are not destined to the current host.  It appears to me that
the native tunneling code doesn't have any similar check, because I'm
seeing it accept packets flooded by the upstream switch that are not
destined to an IP address of the host.  This means, in effect, that a user
of native tunneling must set "options:local_ip", whereas a user of Linux
kernel tunneling doesn't need to (and probably shouldn't).  I suspect that
this behavior is unintentional; it isn't mentioned in
README-native-tunneling.md or (as far as I can tell) anywhere else.

I noticed this while testing OVN.  If you configure a few hypervisors and
send packets from only one of them, then the switch that connects them will
flood all the packets to all of the rest (since it hasn't yet learned where
they are).  The result is that, for N hypervisors, remote VIFs get N-1
copies of each packet instead of just one.

I'm appending a patch that works around the problem, though I'd prefer to
fix the tunneling code rather than apply this patch.

--8<--------------------------cut here-------------------------->8--

From: Ben Pfaff
Date: Tue, 1 Sep 2015 16:52:29 -0700
Subject: [PATCH] ovn-controller: Attach local_ip to each tunnel.

This avoids packet duplication when native tunneling is used.
---
 ovn/controller/encaps.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/ovn/controller/encaps.c b/ovn/controller/encaps.c
index 070b741..74b0e87 100644
--- a/ovn/controller/encaps.c
+++ b/ovn/controller/encaps.c
@@ -113,7 +113,7 @@ tunnel_create_name(struct tunnel_ctx *tc, const char *chassis_id)
 
 static void
 tunnel_add(struct tunnel_ctx *tc, const char *new_chassis_id,
-           const struct sbrec_encap *encap)
+           const struct sbrec_encap *encap, const char *encap_ip)
 {
     struct port_hash_node *hash_node;
 
@@ -167,6 +167,7 @@ tunnel_add(struct tunnel_ctx *tc, const char *new_chassis_id,
     ovsrec_interface_set_name(iface, port_name);
     ovsrec_interface_set_type(iface, encap->type);
     smap_add(&options, "remote_ip", encap->ip);
+    smap_add(&options, "local_ip", encap_ip);
     smap_add(&options, "key", "flow");
     ovsrec_interface_set_options(iface, &options);
     smap_destroy(&options);
@@ -235,6 +236,18 @@ encaps_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
         return;
     }
 
+    const struct ovsrec_open_vswitch *cfg = ovsrec_open_vswitch_first(ctx->ovs_idl);
+    if (!cfg) {
+        VLOG_INFO("No Open_vSwitch row defined.");
+        return;
+    }
+
+    const char *encap_ip = smap_get(&cfg->external_ids, "ovn-encap-ip");
+    if (!encap_ip) {
+        VLOG_INFO("Need to specify an encap ip");
+        return;
+    }
+
     const struct sbrec_chassis *chassis_rec;
     const struct ovsrec_bridge *br;
 
@@ -278,7 +291,7 @@ encaps_run(struct controller_ctx *ctx, const struct ovsrec_bridge *br_int,
             VLOG_INFO("No supported encaps for '%s'", chassis_rec->name);
             continue;
         }
-        tunnel_add(&tc, chassis_rec->name, encap);
+        tunnel_add(&tc, chassis_rec->name, encap, encap_ip);
     }
 }
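For reference, the workaround from the command line might look like the
following sketch.  The interface name "vxlan0" and the address 192.0.2.10
are hypothetical placeholders; "external_ids:ovn-encap-ip" is the key the
patch above reads, and "options:local_ip" is the setting described in the
message body.

```shell
# Give ovn-controller the local tunnel endpoint IP; the patched
# encaps_run() reads external_ids:ovn-encap-ip and passes it to
# tunnel_add() so every tunnel gets options:local_ip set.
ovs-vsctl set Open_vSwitch . external_ids:ovn-encap-ip=192.0.2.10

# Equivalent manual workaround for a hand-configured tunnel port:
# pinning local_ip makes native tunneling ignore flooded tunnel
# packets addressed to other hosts.
ovs-vsctl set Interface vxlan0 options:local_ip=192.0.2.10
```

Without local_ip, the native tunneling code accepts any tunnel packet it
sees, which is how the N-1 duplicate copies arise.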