diff mbox series

[ovs-dev,v3] ofproto-collectors: Windows - Trigger an ARP request before sending IPFIX packet

Message ID 20180518221643.70304-1-vsairam@vmware.com
State Deferred
Headers show
Series [ovs-dev,v3] ofproto-collectors: Windows - Trigger an ARP request before sending IPFIX packet | expand

Commit Message

Sairam Venugopal May 18, 2018, 10:16 p.m. UTC
IPFIX templates and flow packets are silently dropped when a corresponding
ARP entry is missing for the Collector. The fix is to explicitly trigger an
ARP request before sending UDP packets to the collector.

Making changes in Windows to maintain the destination IP address as part
of the Collector Structure. Since the SendARP is a blocking call, we do
not necessarily need to check for a response before sending the IPFIX
packets. Keeping the fix specific to Windows since this behavior cannot be
reproduced in Linux.

Reported-at: https://github.com/openvswitch/ovs-issues/issues/147
Signed-off-by: Sairam Venugopal <vsairam@vmware.com>
---
 ofproto/collectors.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Comments

Ben Pfaff May 23, 2018, 8:19 p.m. UTC | #1
On Fri, May 18, 2018 at 03:16:43PM -0700, Sairam Venugopal wrote:
> IPFIX templates and flow packets are silently dropped when a corresponding
> ARP entry is missing for the Collector. The fix is to explicitly trigger an
> ARP request before sending UDP packets to the collector.
> 
> Making changes in Windows to maintain the destination IP address as part
> of the Collector Structure. Since the SendARP is a blocking call, we do
> not necessarily need to check for a response before sending the IPFIX
> packets. Keeping the fix specific to Windows since this behavior cannot be
> reproduced in Linux.

When you say that SendARP is a blocking call, do you mean that it sends
a packet and then waits for a response?  That will delay everything else
happening in the OVS main thread for an arbitrary amount of time.
Usually this is not acceptable.
Sairam Venugopal May 24, 2018, 3:13 p.m. UTC | #2
This is the snippet from MSDN:  https://msdn.microsoft.com/en-us/library/windows/desktop/aa366358(v=vs.85).aspx

The SendARP function is used to request the physical hardware address (sometimes referred to as the MAC address) that corresponds to a specified destination IPv4 address. If the information requested is not in the ARP table on the local computer, then the SendARP function will cause an ARP request to be sent to obtain the physical address. If the function is successful, the physical address that corresponds to the specified destination IPv4 address is returned in the array pointed to by the pMacAddr parameter.

I don't think this would end up blocking every other thread in OVS. I might've relayed this information incorrectly. There isn't an argument to specify timeout for the SendArp function.

Thanks,
Sairam

On 5/23/18, 1:19 PM, "Ben Pfaff" <blp@ovn.org> wrote:

    On Fri, May 18, 2018 at 03:16:43PM -0700, Sairam Venugopal wrote:
    > IPFIX templates and flow packets are silently dropped when a corresponding
    > ARP entry is missing for the Collector. The fix is to explicitly trigger an
    > ARP request before sending UDP packets to the collector.
    > 
    > Making changes in Windows to maintain the destination IP address as part
    > of the Collector Structure. Since the SendARP is a blocking call, we do
    > not necessarily need to check for a response before sending the IPFIX
    > packets. Keeping the fix specific to Windows since this behavior cannot be
    > reproduced in Linux.
    
    When you say that SendARP is a blocking call, do you mean that it sends
    a packet and then waits for a response?  That will delay everything else
    happening in the OVS main thread for an arbitrary amount of time.
    Usually this is not acceptable.
Ben Pfaff May 24, 2018, 5:09 p.m. UTC | #3
The MSDN entry is really unclear.  It says that it sends an ARP request,
if necessary.  I cannot find anything in the entry that says whether, if
it sends an ARP request, it waits for a reply.

However, this stackoverflow entry implies that it does wait for a reply:
https://stackoverflow.com/questions/43626184/any-way-to-change-the-behavior-of-synchronous-windows-api-sendarp

I think that's probably unacceptable because blocking the OVS main
thread will break lots of stuff.  If it's necessary to block for ARP
resolution, then I think it will have to happen in a thread dedicated to
that purpose.

On Thu, May 24, 2018 at 03:13:58PM +0000, Sairam Venugopal wrote:
> This is the snippet from MSDN:  https://msdn.microsoft.com/en-us/library/windows/desktop/aa366358(v=vs.85).aspx
> 
> The SendARP function is used to request the physical hardware address (sometimes referred to as the MAC address) that corresponds to a specified destination IPv4 address. If the information requested is not in the ARP table on the local computer, then the SendARP function will cause an ARP request to be sent to obtain the physical address. If the function is successful, the physical address that corresponds to the specified destination IPv4 address is returned in the array pointed to by the pMacAddr parameter.
> 
> I don't think this would end up blocking every other thread in OVS. I might've relayed this information incorrectly. There isn't an argument to specify timeout for the SendArp function.
> 
> Thanks,
> Sairam
> 
> On 5/23/18, 1:19 PM, "Ben Pfaff" <blp@ovn.org> wrote:
> 
>     On Fri, May 18, 2018 at 03:16:43PM -0700, Sairam Venugopal wrote:
>     > IPFIX templates and flow packets are silently dropped when a corresponding
>     > ARP entry is missing for the Collector. The fix is to explicitly trigger an
>     > ARP request before sending UDP packets to the collector.
>     > 
>     > Making changes in Windows to maintain the destination IP address as part
>     > of the Collector Structure. Since the SendARP is a blocking call, we do
>     > not necessarily need to check for a response before sending the IPFIX
>     > packets. Keeping the fix specific to Windows since this behavior cannot be
>     > reproduced in Linux.
>     
>     When you say that SendARP is a blocking call, do you mean that it sends
>     a packet and then waits for a response?  That will delay everything else
>     happening in the OVS main thread for an arbitrary amount of time.
>     Usually this is not acceptable.
>     
>
Sairam Venugopal May 24, 2018, 7:38 p.m. UTC | #4
Thanks for pointing this out. It might be better to simply add a static ARP entry when adding an IPFIX collector. This way we do not trigger this for every packet. 

We are currently documenting this as a workaround for using IPFIX collectors on the same subnet as the Hyper-V host. This patch  was an RFC to see if we could improve this mechanism.

Calling SendArp in a separate thread means that we will somehow need to block the IPFIX packet send until ARP resolves. Otherwise, Windows might just continue to drop UDP packets. We can also check if there is a route to the Collector from the Host and drop the IPFIX packets, but this should've somehow been handled better by the Windows Sockets routine.

Thanks,
Sairam

On 5/24/18, 10:09 AM, "Ben Pfaff" <blp@ovn.org> wrote:

    The MSDN entry is really unclear.  It says that it sends an ARP request,
    if necessary.  I cannot find anything in the entry that says whether, if
    it sends an ARP request, it waits for a reply.
    
    However, this stackoverflow entry implies that it does wait for a reply:
    https://urldefense.proofpoint.com/v2/url?u=https-3A__stackoverflow.com_questions_43626184_any-2Dway-2Dto-2Dchange-2Dthe-2Dbehavior-2Dof-2Dsynchronous-2Dwindows-2Dapi-2Dsendarp&d=DwIDaQ&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=8GsX0EVrqINs9sASEjOQ9sRGoswFi0dSVQQ7_SEBrfU&s=_TGWYeiurUno6rosPRGPjkeSNycWHjHfqs3e_sWVjL4&e=
    
    I think that's probably unacceptable because blocking the OVS main
    thread will break lots of stuff.  If it's necessary to block for ARP
    resolution, then I think it will have to happen in a thread dedicated to
    that purpose.
    
    On Thu, May 24, 2018 at 03:13:58PM +0000, Sairam Venugopal wrote:
    > This is the snippet from MSDN:  https://urldefense.proofpoint.com/v2/url?u=https-3A__msdn.microsoft.com_en-2Dus_library_windows_desktop_aa366358-28v-3Dvs.85-29.aspx&d=DwIDaQ&c=uilaK90D4TOVoH58JNXRgQ&r=Z6vowHUOjP5ysP_g372c49Nqc1vEKqHKNBkR5Q5Z7uo&m=8GsX0EVrqINs9sASEjOQ9sRGoswFi0dSVQQ7_SEBrfU&s=XtI9PPWyM9PyrQpzlOEv9m1FFEKysRnXiuYlqImusn8&e=
    > 
    > The SendARP function is used to request the physical hardware address (sometimes referred to as the MAC address) that corresponds to a specified destination IPv4 address. If the information requested is not in the ARP table on the local computer, then the SendARP function will cause an ARP request to be sent to obtain the physical address. If the function is successful, the physical address that corresponds to the specified destination IPv4 address is returned in the array pointed to by the pMacAddr parameter.
    > 
    > I don't think this would end up blocking every other thread in OVS. I might've relayed this information incorrectly. There isn't an argument to specify timeout for the SendArp function.
    > 
    > Thanks,
    > Sairam
    > 
    > On 5/23/18, 1:19 PM, "Ben Pfaff" <blp@ovn.org> wrote:
    > 
    >     On Fri, May 18, 2018 at 03:16:43PM -0700, Sairam Venugopal wrote:
    >     > IPFIX templates and flow packets are silently dropped when a corresponding
    >     > ARP entry is missing for the Collector. The fix is to explicitly trigger an
    >     > ARP request before sending UDP packets to the collector.
    >     > 
    >     > Making changes in Windows to maintain the destination IP address as part
    >     > of the Collector Structure. Since the SendARP is a blocking call, we do
    >     > not necessarily need to check for a response before sending the IPFIX
    >     > packets. Keeping the fix specific to Windows since this behavior cannot be
    >     > reproduced in Linux.
    >     
    >     When you say that SendARP is a blocking call, do you mean that it sends
    >     a packet and then waits for a response?  That will delay everything else
    >     happening in the OVS main thread for an arbitrary amount of time.
    >     Usually this is not acceptable.
    >     
    >
diff mbox series

Patch

diff --git a/ofproto/collectors.c b/ofproto/collectors.c
index bc92332..4ce290e 100644
--- a/ofproto/collectors.c
+++ b/ofproto/collectors.c
@@ -27,11 +27,17 @@ 
 #include "sset.h"
 #include "util.h"
 #include "openvswitch/vlog.h"
+#ifdef WIN32
+#include <iphlpapi.h>
+#endif WIN32
 
 VLOG_DEFINE_THIS_MODULE(collectors);
 
 struct collectors {
     int *fds;                     /* Sockets. */
+#ifdef WIN32
+    IPAddr *dest_ips;             /* Collector's IP address */
+#endif WIN32
     size_t n_fds;                 /* Number of sockets. */
 };
 
@@ -58,6 +64,9 @@  collectors_create(const struct sset *targets, uint16_t default_port,
 
     c = xmalloc(sizeof *c);
     c->fds = xmalloc(sizeof *c->fds * sset_count(targets));
+#ifdef WIN32
+    c->dest_ips = xmalloc(sizeof *c->dest_ips * sset_count(targets));
+#endif
     c->n_fds = 0;
     SSET_FOR_EACH (name, targets) {
         int error;
@@ -65,6 +74,13 @@  collectors_create(const struct sset *targets, uint16_t default_port,
 
         error = inet_open_active(SOCK_DGRAM, name, default_port, NULL, &fd, 0);
         if (fd >= 0) {
+#ifdef WIN32
+            char *target = xstrdup(name);
+            char *p;
+            p = target;
+            c->dest_ips[c->n_fds] = htonl(inet_addr(inet_parse_token(&p)));
+            free(target);
+#endif WIN32
             c->fds[c->n_fds++] = fd;
         } else {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
@@ -97,6 +113,9 @@  collectors_destroy(struct collectors *c)
         for (i = 0; i < c->n_fds; i++) {
             closesocket(c->fds[i]);
         }
+#ifdef WIN32
+        free(c->dest_ips);
+#endif
         free(c->fds);
         free(c);
     }
@@ -114,6 +133,17 @@  collectors_send(const struct collectors *c, const void *payload, size_t n)
 
         for (i = 0; i < c->n_fds; i++) {
             static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 5);
+#ifdef WIN32
+            /* Trigger an ARP request before sending an IPFIX packet */
+            ULONG mac[2];
+            ULONG phys_addr_len = 6;
+            DWORD ret_val;
+            ret_val = SendARP(c->dest_ips[i], 0, &mac, &phys_addr_len);
+            if (ret_val != NO_ERROR) {
+                VLOG_ERR_RL(&rl, "Sending ARP failed:%lu for %lu",
+                            ret_val, c->dest_ips[i]);
+            }
+#endif
             if (send(c->fds[i], payload, n, 0) == -1) {
                 char *s = describe_fd(c->fds[i]);
                 VLOG_WARN_RL(&rl, "%s: sending to collector failed (%s)",