[1/3] Kernel interfaces for multiqueue aware socket

From: Fenghua Yu <fenghua.yu@intel.com>

Multiqueue NICs and multicore CPUs make parallel packet processing possible.
The current kernel and network drivers place one queue on each core, but the
higher-level socket layer is not multiqueue aware. A socket can currently only
receive or send packets through a network interface as a whole, not through an
individual queue. In some cases, e.g. tcpdump with multiple BPF filters and
snort, a lot of contention comes from socket operations such as ring buffer
access. Even if the application itself is fully parallelized and runs on a
multicore system, and the NIC handles tx/rx on multiple queues in parallel, the
network layer and the NIC device driver assemble packets into a single,
serialized queue. Thus the application cannot actually run in parallel at high
speed.

To break this serialized packet assembly bottleneck in the kernel, one way is
to make sockets aware of the multiple queues associated with a NIC interface,
so that each socket can handle tx/rx on one queue in parallel.

The kernel provides several interfaces by which sockets can be bound to rx/tx
queues. A user application can open several sockets, each bound to a single
queue, and get data from the kernel in parallel. This removes the contention
mentioned above.

With this patch, the user-space receive rate on an Intel SR1690 server with
a single L5640 6-core processor and a single ixgbe-based NIC goes from 0.73Mpps
to 4.20Mpps, a nearly linear speedup. An Intel SR1625 server with two E5530
4-core processors and a single ixgbe-based NIC goes from 0.80Mpps to 4.6Mpps.
We noticed that the performance penalty on this system comes from NUMA memory
allocation.

This patch set provides kernel ioctl interfaces for user space. User space can
either call the interfaces directly, or libpcap interfaces can be provided on
top of the kernel ioctl interfaces.

The order of tx/rx packets is up to the user application. In some cases, e.g.
network monitors, ordering is not a big problem because they care more about
receiving and analyzing packets in parallel at the highest performance.

This patch set only implements multiqueue interfaces for AF_PACKET and the Intel
ixgbe NIC. Other protocols and NICs can be handled on top of this patch set.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Junchang Wang <junchangwang@gmail.com>
Signed-off-by: Xinan Tang <xinan.tang@intel.com>
---
 include/linux/sockios.h |    7 +++++++
 include/net/sock.h      |   18 ++++++++++++++++++
 net/core/sock.c         |    4 +++-
 3 files changed, 28 insertions(+), 1 deletions(-)

Message ID	46a08278c2ba21737528eb4b77391a7e8bc88000.1292405004.git.fenghua.yu@intel.com
State	Rejected, archived
Delegated to:	David Miller
Headers
Return-Path: <netdev-owner@vger.kernel.org>
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id D75AC1007D6 for <patchwork-incoming@ozlabs.org>; Thu, 16 Dec 2010 07:14:41 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755337Ab0LOUOd (ORCPT <rfc822;patchwork-incoming@ozlabs.org>); Wed, 15 Dec 2010 15:14:33 -0500
Received: from mga14.intel.com ([143.182.124.37]:5800 "EHLO mga14.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755063Ab0LOUOK (ORCPT <rfc822;netdev@vger.kernel.org>); Wed, 15 Dec 2010 15:14:10 -0500
Received: from azsmga001.ch.intel.com ([10.2.17.19]) by azsmga102.ch.intel.com with ESMTP; 15 Dec 2010 12:14:08 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="4.59,350,1288594800"; d="scan'208";a="362217864"
Received: from fenghua-desk.sc.intel.com ([10.3.52.216]) by azsmga001.ch.intel.com with ESMTP; 15 Dec 2010 12:13:54 -0800
From: "Fenghua Yu" <fenghua.yu@intel.com>
To: "David S. Miller" <davem@davemloft.net>, "Eric Dumazet" <eric.dumazet@gmail.com>, "John Fastabend" <john.r.fastabend@intel.com>, "Xinan Tang" <xinan.tang@intel.com>, "Junchang Wang" <junchangwang@gmail.com>
Cc: "netdev" <netdev@vger.kernel.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, Fenghua Yu <fenghua.yu@intel.com>, Junchang Wang <junchangwang@gmail.com>, Xinan Tang <xinan.tang@intel.com>
Subject: [PATCH 1/3] Kernel interfaces for multiqueue aware socket
Date: Wed, 15 Dec 2010 12:02:04 -0800
Message-Id: <46a08278c2ba21737528eb4b77391a7e8bc88000.1292405004.git.fenghua.yu@intel.com>
X-Mailer: git-send-email 1.7.2
In-Reply-To: <cover.1292405004.git.fenghua.yu@intel.com>
References: <cover.1292405004.git.fenghua.yu@intel.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID: <netdev.vger.kernel.org>
X-Mailing-List: netdev@vger.kernel.org
