IPv4 IPv6 parallel dns lookup in combination with nfqueue is problematic

Hi Everyone,

Problem:
I have a simple daemon listening for packets coming from nfqueue. When
a client issues parallel DNS requests for IPv4 (A) and IPv6 (AAAA)
addresses (the default behaviour since glibc 2.9), the AAAA request is
dropped on its way through the gateway. After a 5-second timeout, the
client resends the requests sequentially, and in that case there is no
problem.
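
For anyone who wants to trigger this without going through glibc, here
is a rough sketch that sends an A and an AAAA query back to back from a
single UDP socket, which is what the parallel lookup does on the wire.
The function names (build_dns_query, send_parallel_queries) and the
query IDs are made up for illustration; they are not part of my daemon.
For "httpbin.org" the built query is 29 bytes, matching the "(29)" in
the tcpdump output further down.

```c
#include <stdint.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

/* Build a minimal DNS query (one question, class IN) for `name`.
 * Returns the packet length, or -1 on bad input / too-small buffer.
 * Hypothetical helper, for illustration only. */
int build_dns_query(unsigned char *buf, size_t buflen,
                    uint16_t id, const char *name, uint16_t qtype)
{
    size_t namelen = strlen(name);
    /* header (12) + encoded name (at most namelen + 2) + QTYPE + QCLASS */
    if (namelen == 0 || namelen > 253 || buflen < 12 + namelen + 2 + 4)
        return -1;

    memset(buf, 0, 12);
    buf[0] = (unsigned char)(id >> 8);
    buf[1] = (unsigned char)(id & 0xff);
    buf[2] = 0x01;                      /* flags: recursion desired */
    buf[5] = 1;                         /* QDCOUNT = 1 */

    size_t pos = 12;
    const char *label = name;
    while (*label) {
        const char *dot = strchr(label, '.');
        size_t len = dot ? (size_t)(dot - label) : strlen(label);
        if (len == 0 || len > 63)
            return -1;
        buf[pos++] = (unsigned char)len;    /* length-prefixed label */
        memcpy(buf + pos, label, len);
        pos += len;
        label += len;
        if (*label == '.')
            label++;
    }
    buf[pos++] = 0;                         /* root label */
    buf[pos++] = (unsigned char)(qtype >> 8);
    buf[pos++] = (unsigned char)(qtype & 0xff);
    buf[pos++] = 0;
    buf[pos++] = 1;                         /* QCLASS = IN */
    return (int)pos;
}

/* Send an A and an AAAA query back to back from ONE socket, so both
 * datagrams carry the same source port. Returns 0 on success. */
int send_parallel_queries(const char *server_ip, const char *name)
{
    unsigned char a[512], aaaa[512];
    int alen = build_dns_query(a, sizeof(a), 0x1234, name, 1);           /* A */
    int aaaalen = build_dns_query(aaaa, sizeof(aaaa), 0x5678, name, 28); /* AAAA */
    if (alen < 0 || aaaalen < 0)
        return -1;

    struct sockaddr_in dst;
    memset(&dst, 0, sizeof(dst));
    dst.sin_family = AF_INET;
    dst.sin_port = htons(53);
    if (inet_pton(AF_INET, server_ip, &dst.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;
    int ok = sendto(fd, a, (size_t)alen, 0,
                    (struct sockaddr *)&dst, sizeof(dst)) == alen
          && sendto(fd, aaaa, (size_t)aaaalen, 0,
                    (struct sockaddr *)&dst, sizeof(dst)) == aaaalen;
    close(fd);
    return ok ? 0 : -1;
}
```

Calling send_parallel_queries("8.8.4.4", "httpbin.org") from the client
while running tcpdump on the gateway should show whether both queries
get forwarded. The sketch does not read the replies.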

Workaround:
I applied a kernel patch from an earlier mail (
http://www.spinics.net/lists/netfilter-devel/msg15860.html ) to kernel
version 3.16. The patch solves the problem, but I don't know its
performance and security implications. I hope to find a better
solution that doesn't require patching the kernel.

Regards,
Tarik.

Related links to the problem:
https://bbs.archlinux.org/viewtopic.php?id=75770
https://www.astaro.org/gateway-products/management-networking-logging-reporting/51569-slow-dns-queries-parallel-requests-ipv6.html
http://www.spinics.net/lists/netfilter-devel/msg15860.html

Extra info:
I queue packets to nfqueue in the mangle table (rather than raw) because
the daemon will need to process connection marks in the future.
Currently, it reads packets from the queue, marks them and lets them
pass (NF_ACCEPT).

Network topology:
In my topology, a client (10.21.0.100) sends DNS requests to 8.8.4.4
via the gateway (10.21.0.1). The gateway performs SNAT (to 10.100.0.21)
and forwards the packets. The daemon runs on the gateway.

10.21.0.100 (client) ----> 10.21.0.1 (gw internal interface) ---->
(SNAT) 10.100.0.21 (gw external interface) ----> 8.8.4.4

Iptables rule:
iptables -t mangle -A FORWARD -m mark --mark 0x0/0x3000000 -j NFQUEUE --queue-num 10 --queue-bypass
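
To spell out the mark match in the rule above: as I understand it,
"--mark value/mask" matches when (packet mark AND mask) equals value.
With 0x0/0x3000000, only packets with neither of the two mask bits set
are queued, so a packet that already carries the daemon's mark
(CUSTOM_MARK, 0x2000000) is not queued again. A small sketch of that
test follows; mark_matches is a name made up for this illustration,
not a kernel or iptables function.

```c
#include <stdint.h>

/* Mirrors the semantics of `-m mark --mark value/mask`:
 * the packet matches when its mark, restricted to the mask bits,
 * equals the given value. Hypothetical helper for illustration. */
int mark_matches(uint32_t pkt_mark, uint32_t value, uint32_t mask)
{
    return (pkt_mark & mask) == value;
}
```

For example, mark_matches(0x0, 0x0, 0x3000000) is true (the packet is
queued), while mark_matches(0x2000000, 0x0, 0x3000000) is false (the
packet has already been marked by the daemon and skips the queue).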

--------

tcpdump output (unpatched kernel):
11:08:13.580903 IP 10.21.0.100.40004 > 8.8.4.4.53:  34824+ A? httpbin.org. (29)
11:08:13.580958 IP 10.21.0.100.40004 > 8.8.4.4.53:  17101+ AAAA? httpbin.org. (29)
11:08:13.581084 IP 10.100.0.21.40004 > 8.8.4.4.53:  34824+ A? httpbin.org. (29)
11:08:13.604559 IP 8.8.4.4.53 > 10.100.0.21.40004:  34824 1/0/0 A 54.175.222.246 (45)
11:08:13.604607 IP 8.8.4.4.53 > 10.21.0.100.40004:  34824 1/0/0 A 54.175.222.246 (45)
11:08:18.585022 IP 10.21.0.100.40004 > 8.8.4.4.53:  34824+ A? httpbin.org. (29)
11:08:18.585097 IP 10.100.0.21.40004 > 8.8.4.4.53:  34824+ A? httpbin.org. (29)
11:08:18.606474 IP 8.8.4.4.53 > 10.100.0.21.40004:  34824 1/0/0 A 54.175.222.246 (45)
11:08:18.606563 IP 8.8.4.4.53 > 10.21.0.100.40004:  34824 1/0/0 A 54.175.222.246 (45)
11:08:18.607175 IP 10.21.0.100.40004 > 8.8.4.4.53:  17101+ AAAA? httpbin.org. (29)
11:08:18.607246 IP 10.100.0.21.40004 > 8.8.4.4.53:  17101+ AAAA? httpbin.org. (29)
11:08:18.664119 IP 8.8.4.4.53 > 10.100.0.21.40004:  17101 0/1/0 (110)
11:08:18.664201 IP 8.8.4.4.53 > 10.21.0.100.40004:  17101 0/1/0 (110)
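
Note that in the capture above the A and AAAA queries leave the client
from the same source port (40004) towards the same server and port, so
the two datagrams share one UDP 5-tuple and differ only in payload.
Presumably this is why they end up treated as one flow on the gateway;
that reading is my assumption and is not verified here. The sketch
below only illustrates the comparison. struct flow_tuple, make_tuple
and tuple_equal are invented for this example and are not kernel types.

```c
#include <stdint.h>
#include <arpa/inet.h>

/* A simplified 5-tuple; hypothetical, for illustration only. */
struct flow_tuple {
    uint32_t src_ip;    /* network byte order */
    uint32_t dst_ip;    /* network byte order */
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t  proto;     /* 17 = UDP */
};

/* Fill a tuple from dotted-quad strings. Returns 0 on success. */
int make_tuple(struct flow_tuple *t, const char *src, uint16_t sport,
               const char *dst, uint16_t dport, uint8_t proto)
{
    struct in_addr s, d;
    if (inet_pton(AF_INET, src, &s) != 1 || inet_pton(AF_INET, dst, &d) != 1)
        return -1;
    t->src_ip = s.s_addr;
    t->dst_ip = d.s_addr;
    t->src_port = sport;
    t->dst_port = dport;
    t->proto = proto;
    return 0;
}

/* Compare field by field (not memcmp, to avoid comparing padding). */
int tuple_equal(const struct flow_tuple *a, const struct flow_tuple *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}
```

Building one tuple for the A query and one for the AAAA query from the
first two lines of the capture (10.21.0.100:40004 to 8.8.4.4:53, UDP)
gives two tuples for which tuple_equal returns true.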

----

tcpdump output (patched kernel):

15:39:53.141114 IP 10.21.0.100.58891 > 8.8.4.4.53:  43314+ A? httpbin.org. (29)
15:39:53.141247 IP 10.21.0.100.58891 > 8.8.4.4.53:  25492+ AAAA? httpbin.org. (29)
15:39:53.141362 IP 10.100.0.21.58891 > 8.8.4.4.53:  43314+ A? httpbin.org. (29)
15:39:53.141672 IP 10.100.0.21.58891 > 8.8.4.4.53:  25492+ AAAA? httpbin.org. (29)
15:39:53.166438 IP 8.8.4.4.53 > 10.100.0.21.58891:  25492 0/1/0 (110)
15:39:53.166507 IP 8.8.4.4.53 > 10.21.0.100.58891:  25492 0/1/0 (110)
15:39:53.167052 IP 8.8.4.4.53 > 10.100.0.21.58891:  43314 1/0/0 A 54.175.219.8 (45)
15:39:53.167095 IP 8.8.4.4.53 > 10.21.0.100.58891:  43314 1/0/0 A 54.175.219.8 (45)

-------

Kernel patch(3.16.3):

+               }
+
                /* Seen it before?  This can happen for loopback, retrans,
                 * or local packets.
                 */

-------

Source code of daemon:

#include <stdlib.h>
#include <signal.h>
#include <poll.h>
#include <stdio.h>
#include <errno.h>
#include <string.h>
#include <arpa/inet.h>
#include <sys/socket.h>
#include <linux/netfilter.h>
#include <libnetfilter_queue/libnetfilter_queue.h>
#include <syslog.h>

#define CUSTOM_MARK 0x2000000

/* how long to wait for a new packet */
#define POLL_TIME 10

/* set from the signal handler, read in the main loop */
volatile sig_atomic_t g_shutdown = 0;

int nfq_callback_handler(struct nfq_q_handle *queue_handler,
                         struct nfgenmsg *nfmsg,
                         struct nfq_data *tb, void *arg){
    unsigned char *data;
    int datalen = nfq_get_payload(tb, &data);
    if (datalen > 0)
    {

        struct nfqnl_msg_packet_hdr *hdr = nfq_get_msg_packet_hdr(tb);

        nfq_set_verdict2(queue_handler,
                         hdr ? ntohl(hdr->packet_id) : 0,
                         NF_ACCEPT,
                         CUSTOM_MARK,
                         0,
                         NULL);
    }
    return 0;
}

void initialize_queue() {
    struct nfq_handle *nfqh= NULL;
    struct nfq_q_handle *queue_handler = NULL;
    unsigned int queue_num = 10;
    if ((nfqh = nfq_open()) == 0){
        syslog(LOG_ERR, "nfq_open failed.");
    }
    else
    {
        /* ignore the return code for this since it's inconsistent
         * between kernel versions;
         * see http://www.spinics.net/lists/netfilter/msg42063.html */
        nfq_unbind_pf(nfqh, AF_INET);

        if (nfq_bind_pf(nfqh, AF_INET) < 0){
            syslog(LOG_ERR,"nfq_bind_pf failed.");
        }
        else if ((queue_handler = nfq_create_queue(nfqh, queue_num,
                                  &nfq_callback_handler, NULL)) == 0){
            syslog(LOG_ERR,"nfq_create_queue on %u failed.", queue_num);
        }
        else if (nfq_set_mode(queue_handler, NFQNL_COPY_PACKET, 0xffff) < 0) {
            syslog(LOG_ERR,"failed to set NFQNL_COPY_PACKET.");
        }
        else
        {
            /* get the file descriptor for netlink queue */
            int fd = nfnl_fd(nfq_nfnlh(nfqh));

            /* enlarge the socket receive buffer */
            int rcvbuf_size = 1024 * 1024;
            unsigned int queue_size = 10000;
            if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &rcvbuf_size, sizeof(rcvbuf_size)) == -1 ){
                syslog(LOG_WARNING,"Buffer size could not be set");
            }

            /* set the maximum number of packets held in the kernel queue */
            if((nfq_set_queue_maxlen(queue_handler, queue_size)) == -1){
                syslog(LOG_WARNING,"Queue size could not be set.");
            }

            ssize_t ret;
            char buf[10000];
            struct pollfd pollinfo;
            while (!g_shutdown)
            {
                pollinfo.fd = fd;
                pollinfo.events = POLLIN;

                ret = poll(&pollinfo, 1, POLL_TIME);
                if ((ret < 0) && (errno != EINTR))
                {
                    syslog(LOG_ERR,"poll error nfq fd %d (%d/%s)", fd,
                           errno, strerror(errno));
                    break;
                }

                while ((ret = recv(fd, buf, sizeof(buf), MSG_DONTWAIT)) > 0) {
                    nfq_handle_packet(nfqh, buf, (int)ret);
                }

                if (ret == -1)
                {
                    if (errno == EAGAIN || errno == EINTR || errno == ENOBUFS)
                        ;
                    else
                    {
                        syslog(LOG_ERR, "recv error nfq fd %d (%d/%s)",
                               fd, errno, strerror(errno));
                        break;
                    }
                }
                else if (ret == 0)
                {
                    syslog(LOG_ERR,"nfq socket closed");
                    break;
                }
            }
            nfq_destroy_queue(queue_handler);
            nfq_close( nfqh );
            queue_handler = NULL;
            nfqh = NULL;
        }
    }
}

static void sig_handler(int signum){
    /**
     * Handles caught signals.
     *
     * @param signum : the signal that was sent
     * @return void
     */

    if(signum == SIGINT){
        g_shutdown = 1;
        syslog(LOG_INFO,"Interrupted.");

    }
    else if(signum == SIGTERM){
        g_shutdown = 1;
        syslog(LOG_INFO,"Killed.");

    }
}

int main(int argc, char *argv[]){
    int logOpt = LOG_PID;

    signal(SIGINT, sig_handler);  //sig number 2
    signal(SIGTERM, sig_handler); //sig number 15
    signal(SIGHUP, sig_handler);  //sig number 1
    signal(SIGUSR1, sig_handler);  //sig number 10
    openlog("sniffer", logOpt, LOG_USER);

    syslog(LOG_INFO, "Program is  started.");
    initialize_queue();
    closelog();

    return 0;
}

Message ID	CAMxdDZBwqRxZjywAfHUm-bbe-0veLPqPwAfFpw90cb0As80Dmg@mail.gmail.com
State	Deferred
Delegated to:	Pablo Neira
From:	Tarik Demirci <tarik@tarikdemirci.com>
To:	netfilter-devel@vger.kernel.org
Date:	Fri, 24 Jul 2015 13:34:19 +0300
