From patchwork Fri Jan 29 18:18:39 2010
X-Patchwork-Submitter: stephen hemminger
X-Patchwork-Id: 44003
X-Patchwork-Delegate: davem@davemloft.net
Date: Fri, 29 Jan 2010 10:18:39 -0800
From: Stephen Hemminger
To: David Miller
Cc: netdev@vger.kernel.org
Subject: [RFC] NAPI as kobject proposal
Message-ID: <20100129101839.36944ba5@nehalam>
Organization: Vyatta
X-Mailing-List: netdev@vger.kernel.org

The NAPI interface structure in current kernels is managed by the driver.
As part of receive packet steering there is a requirement to add an
additional parameter to it for the CPU map, and this map needs an API
to set it.

The right way to do this in the kernel model is to make NAPI into a kobject
and associate it back with the network device (parent). This isn't wildly
difficult, but it does change some of the API for network device drivers
because:

  1. They need to handle another possible error on setup
  2. The NAPI object needs to be dynamically allocated separately
     (not as part of netdev_priv)
  3. The driver should pass an index that can be used as part of the name
     (easier than scanning)

Eventually, there will be:
  /sys/class/net/eth0/napi0/
      weight
      cpumap

So here is a starting point patch that shows what the API might look like.

---
 include/linux/netdevice.h |   20 ++++++++++++++------
 net/core/dev.c            |   28 ++++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 8 deletions(-)

--- a/include/linux/netdevice.h	2010-01-29 10:00:55.820739116 -0800
+++ b/include/linux/netdevice.h	2010-01-29 10:15:33.098863437 -0800
@@ -378,6 +378,8 @@ struct napi_struct {
 	struct list_head	dev_list;
 	struct sk_buff		*gro_list;
 	struct sk_buff		*skb;
+
+	struct kobject		kobj;
 };
 
 enum {
@@ -1037,25 +1039,31 @@ static inline void *netdev_priv(const st
 #define SET_NETDEV_DEVTYPE(net, devtype)	((net)->dev.type = (devtype))
 
 /**
- *	netif_napi_add - initialize a napi context
+ *	netif_napi_init - initialize a napi context
  *	@dev:  network device
  *	@napi: napi context
+ *	@index: queue number
  *	@poll: polling function
  *	@weight: default weight
  *
- * netif_napi_add() must be used to initialize a napi context prior to calling
+ * netif_napi_init() must be used to create a napi context prior to calling
  * *any* of the other napi related functions.
+ *
+ * in case of error, the context is not left in napi_list so it can
+ * be cleaned up by free_netdev, but is not valid for use.
  */
-void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
-		    int (*poll)(struct napi_struct *, int), int weight);
+extern int netif_napi_init(struct net_device *dev, struct napi_struct *napi,
+			   unsigned index,
+			   int (*poll)(struct napi_struct *, int), int weight);
 
 /**
- *  netif_napi_del - remove a napi context
+ *  netif_napi_del - free a napi context
  *  @napi: napi context
  *
  *  netif_napi_del() removes a napi context from the network device napi list
+ *  and frees it.
  */
-void netif_napi_del(struct napi_struct *napi);
+extern void netif_napi_del(struct napi_struct *napi);
 
 struct napi_gro_cb {
 	/* Virtual address of skb_shinfo(skb)->frags[0].page + offset. */
--- a/net/core/dev.c	2010-01-29 10:00:55.810739850 -0800
+++ b/net/core/dev.c	2010-01-29 10:14:53.388864572 -0800
@@ -2926,9 +2926,24 @@ void napi_complete(struct napi_struct *n
 }
 EXPORT_SYMBOL(napi_complete);
 
-void netif_napi_add(struct net_device *dev, struct napi_struct *napi,
+static void release_napi(struct kobject *kobj)
+{
+	struct napi_struct *napi
+		= container_of(kobj, struct napi_struct, kobj);
+	kfree(napi);
+}
+
+static struct kobj_type napi_ktype = {
+	/* insert future sysfs hooks ... */
+	.release = release_napi,
+};
+
+int netif_napi_init(struct net_device *dev, struct napi_struct *napi,
+		    unsigned index,
 		    int (*poll)(struct napi_struct *, int), int weight)
 {
+	int err;
+
 	INIT_LIST_HEAD(&napi->poll_list);
 	napi->gro_count = 0;
 	napi->gro_list = NULL;
@@ -2941,9 +2956,16 @@ void netif_napi_add(struct net_device *d
 	spin_lock_init(&napi->poll_lock);
 	napi->poll_owner = -1;
 #endif
+
+	err = kobject_init_and_add(&napi->kobj, &napi_ktype,
+				   &dev->dev.kobj, "napi%d", index);
+	if (err)
+		return err;
+
 	set_bit(NAPI_STATE_SCHED, &napi->state);
+	return 0;
 }
-EXPORT_SYMBOL(netif_napi_add);
+EXPORT_SYMBOL(netif_napi_init);
 
 void netif_napi_del(struct napi_struct *napi)
 {
@@ -2960,6 +2982,8 @@ void netif_napi_del(struct napi_struct *
 
 	napi->gro_list = NULL;
 	napi->gro_count = 0;
+
+	kobject_put(&napi->kobj);
 }
 EXPORT_SYMBOL(netif_napi_del);
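
To illustrate the lifetime rules the proposal implies for drivers (separate allocation, an extra error path on setup, and freeing through the release hook rather than directly), here is a rough userspace sketch. It is not kernel code and not part of the patch: every mock_* name is hypothetical, the kobject is reduced to a name, a reference count and a release callback, and the `fail` flag only simulates kobject_init_and_add() returning an error. Dropping the reference on the error path follows the general kobject rule that a failed kobject_init_and_add() is cleaned up with kobject_put(); whether the final API would ask drivers to do this themselves is an open question in the RFC.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, heavily simplified model of struct kobject. */
struct mock_kobject {
	char name[16];
	int refcount;
	void (*release)(struct mock_kobject *kobj);
};

/* Hypothetical model of napi_struct with the embedded kobject. */
struct mock_napi {
	int weight;
	struct mock_kobject kobj;
};

/* Counts release calls so the behaviour can be observed. */
int mock_released;

/* Plays the role of release_napi(): recover the container and free it. */
void mock_release_napi(struct mock_kobject *kobj)
{
	struct mock_napi *napi = container_of(kobj, struct mock_napi, kobj);

	mock_released++;
	free(napi);
}

/* Plays the role of kobject_put(): the last reference triggers release. */
void mock_kobject_put(struct mock_kobject *kobj)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0)
		kobj->release(kobj);
}

/*
 * Plays the role of the proposed netif_napi_init(): it can now fail.
 * 'fail' is an assumption of this sketch, simulating an error from
 * kobject_init_and_add(). The reference is held even on failure.
 */
int mock_napi_init(struct mock_napi *napi, unsigned index, int weight,
		   int fail)
{
	napi->weight = weight;
	napi->kobj.refcount = 1;
	napi->kobj.release = mock_release_napi;
	snprintf(napi->kobj.name, sizeof(napi->kobj.name), "napi%u", index);

	return fail ? -ENOMEM : 0;
}

/* Driver side: allocate separately (not in netdev_priv), check the error. */
struct mock_napi *mock_driver_setup(unsigned index, int fail)
{
	struct mock_napi *napi = calloc(1, sizeof(*napi));

	if (!napi)
		return NULL;

	if (mock_napi_init(napi, index, 64, fail)) {
		/* Never free() directly: drop the reference instead. */
		mock_kobject_put(&napi->kobj);
		return NULL;
	}
	return napi;
}
```

A caller would use mock_driver_setup(0, 0) to obtain a context named "napi0" and later release it with mock_kobject_put(&napi->kobj), which mirrors what the patched netif_napi_del() does with kobject_put().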