Patchwork [RFC,3/3] acct: add input and output interface index

login
register
mail settings
Submitter holger@eitzenberger.org
Date Sept. 26, 2013, 3:31 p.m.
Message ID <20130926154005.592908761@eitzenberger.org>
Download mbox | patch
Permalink /patch/278213/
State RFC
Headers show

Comments

holger@eitzenberger.org - Sept. 26, 2013, 3:31 p.m.
The interface indices are exported as uint32_t, although being
signed integer inside the kernel, which goes in line with
what nfnetlink_queue does.

Both interface indices are wrapped inside CTA_ACCT.

Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>


--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira - Oct. 17, 2013, 11:06 a.m.
Hi Holger,

I like patches 1/3 and 2/3, they are nice cleanups.

Some comments regarding this patch.

On Thu, Sep 26, 2013 at 05:31:53PM +0200, Holger Eitzenberger wrote:
> The interface indices are exported as uint32_t, although being
> signed integer inside the kernel, which goes in line with
> what nfnetlink_queue does.
> 
> Both interface indices are wrapped inside CTA_ACCT.
> 
> Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org>
> 
> Index: net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h
> ===================================================================
> --- net-next-ipfix.orig/include/net/netfilter/nf_conntrack_acct.h
> +++ net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h
> @@ -21,6 +21,8 @@ struct nf_conn_counter {
>  
>  struct nf_conn_acct {
>  	struct nf_conn_counter counter[IP_CT_DIR_MAX];
> +	int indev;
> +	int outdev;
>  };
>  
>  static inline
> Index: net-next-ipfix/net/netfilter/nf_conntrack_core.c
> ===================================================================
> --- net-next-ipfix.orig/net/netfilter/nf_conntrack_core.c
> +++ net-next-ipfix/net/netfilter/nf_conntrack_core.c
> @@ -33,6 +33,7 @@
>  #include <linux/mm.h>
>  #include <linux/nsproxy.h>
>  #include <linux/rculist_nulls.h>
> +#include <net/dst.h>
>  
>  #include <net/netfilter/nf_conntrack.h>
>  #include <net/netfilter/nf_conntrack_l3proto.h>
> @@ -1110,6 +1111,7 @@ void __nf_ct_refresh_acct(struct nf_conn
>  acct:
>  	if (do_acct) {
>  		struct nf_conn_acct *acct;
> +		struct dst_entry *dst;
>  
>  		acct = nf_conn_acct_find(ct);
>  		if (acct) {
> @@ -1117,6 +1119,13 @@ acct:
>  
>  			atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
>  			atomic64_add(skb->len, &counter[CTINFO2DIR(ctinfo)].bytes);
> +
> +			if (acct->indev == 0 && skb->dev)
> +				acct->indev = skb->dev->ifindex;
> +
> +			dst = skb_dst(skb);
> +			if (acct->outdev == 0 && dst && dst->dev)
> +				acct->outdev = dst->dev->ifindex;

If you only set indev/outdev once we can skip the conntrack extension
by passing the skb to nf_ct_deliver_cached_events and include this
information in the conntrack events. That would not allow to dump the
device from conntrack dumps though. I still have concerns with this
approach as this doesn't seem to cover the scenario in which the
in/outdev changes.

Regards.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
holger@eitzenberger.org - Oct. 17, 2013, 11:33 a.m.
Hi Pablo,

> I like patches 1/3 and 2/3, they are nice cleanups.

thanks for looking into this.

> If you only set indev/outdev once we can skip the conntrack extension
> by passing the skb to nf_ct_deliver_cached_events and include this
> information in the conntrack events. That would not allow to dump the
> device from conntrack dumps though. I still have concerns with this
> approach as this doesn't seem to cover the scenario in which the
> in/outdev changes.

I know that doing it this simiple way is only "best effort", as e. g.
with IP multipathing or 802.3ad this information is not % correct
in all cases.

And the question we have to answer is whether this interface
information *has* to be correct in every case, even the less commonly
used cases.

For IPFIX I would answer this question with a 'no'.

And we can later extend this to update the interface information
correctly in every case.  It's only a few patches away.

 /Holger

--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira - Nov. 3, 2013, 8:59 p.m.
On Thu, Oct 17, 2013 at 01:33:45PM +0200, Holger Eitzenberger wrote:
> Hi Pablo,
> 
> > I like patches 1/3 and 2/3, they are nice cleanups.
> 
> thanks for looking into this.

I'm going to apply 1/3 and 2/3 with some small glitches, I would like
not to lose these cleanups.

> > If you only set indev/outdev once we can skip the conntrack extension
> > by passing the skb to nf_ct_deliver_cached_events and include this
> > information in the conntrack events. That would not allow to dump the
> > device from conntrack dumps though. I still have concerns with this
> > approach as this doesn't seem to cover the scenario in which the
> > in/outdev changes.
> 
> I know that doing it this simiple way is only "best effort", as e. g.
> with IP multipathing or 802.3ad this information is not % correct
> in all cases.
> 
> And the question we have to answer is whether this interface
> information *has* to be correct in every case, even the less commonly
> used cases.
> 
> For IPFIX I would answer this question with a 'no'.
> 
> And we can later extend this to update the interface information
> correctly in every case.  It's only a few patches away.

My suggestion is to rework patch 3/3 to pass the interface information
to nf_ct_deliver_cached_events via nf_ct_event struct, then include it
in the event message. Thus, we don't need to increase the size the
conntrack. The downside of this approach is that we cannot retrieve
the interface via dump operation, but I think it should be enough for
IPFIX. This feature should be disabled by default, so please add a
/proc switch to enable/disable it in runtime.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Patch

Index: net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h
===================================================================
--- net-next-ipfix.orig/include/net/netfilter/nf_conntrack_acct.h
+++ net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h
@@ -21,6 +21,8 @@  struct nf_conn_counter {
 
 struct nf_conn_acct {
 	struct nf_conn_counter counter[IP_CT_DIR_MAX];
+	int indev;
+	int outdev;
 };
 
 static inline
Index: net-next-ipfix/net/netfilter/nf_conntrack_core.c
===================================================================
--- net-next-ipfix.orig/net/netfilter/nf_conntrack_core.c
+++ net-next-ipfix/net/netfilter/nf_conntrack_core.c
@@ -33,6 +33,7 @@ 
 #include <linux/mm.h>
 #include <linux/nsproxy.h>
 #include <linux/rculist_nulls.h>
+#include <net/dst.h>
 
 #include <net/netfilter/nf_conntrack.h>
 #include <net/netfilter/nf_conntrack_l3proto.h>
@@ -1110,6 +1111,7 @@  void __nf_ct_refresh_acct(struct nf_conn
 acct:
 	if (do_acct) {
 		struct nf_conn_acct *acct;
+		struct dst_entry *dst;
 
 		acct = nf_conn_acct_find(ct);
 		if (acct) {
@@ -1117,6 +1119,13 @@  acct:
 
 			atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets);
 			atomic64_add(skb->len, &counter[CTINFO2DIR(ctinfo)].bytes);
+
+			if (acct->indev == 0 && skb->dev)
+				acct->indev = skb->dev->ifindex;
+
+			dst = skb_dst(skb);
+			if (acct->outdev == 0 && dst && dst->dev)
+				acct->outdev = dst->dev->ifindex;
 		}
 	}
 }
Index: net-next-ipfix/include/uapi/linux/netfilter/nfnetlink_conntrack.h
===================================================================
--- net-next-ipfix.orig/include/uapi/linux/netfilter/nfnetlink_conntrack.h
+++ net-next-ipfix/include/uapi/linux/netfilter/nfnetlink_conntrack.h
@@ -53,6 +53,7 @@  enum ctattr_type {
 	CTA_MARK_MASK,
 	CTA_LABELS,
 	CTA_LABELS_MASK,
+	CTA_ACCT,
 	__CTA_MAX
 };
 #define CTA_MAX (__CTA_MAX - 1)
@@ -138,6 +139,14 @@  enum ctattr_counters {
 };
 #define CTA_COUNTERS_MAX (__CTA_COUNTERS_MAX - 1)
 
+enum ctattr_acct {
+	CTA_ACCT_UNSPEC,
+	CTA_ACCT_INDEV,
+	CTA_ACCT_OUTDEV,
+	__CTA_ACCT_MAX
+};
+#define CTA_ACCT_MAX (__CTA_ACCT_MAX - 1)
+
 enum ctattr_tstamp {
 	CTA_TIMESTAMP_UNSPEC,
 	CTA_TIMESTAMP_START,
Index: net-next-ipfix/net/netfilter/nf_conntrack_netlink.c
===================================================================
--- net-next-ipfix.orig/net/netfilter/nf_conntrack_netlink.c
+++ net-next-ipfix/net/netfilter/nf_conntrack_netlink.c
@@ -248,15 +248,27 @@  static int
 ctnetlink_dump_acct(struct sk_buff *skb, const struct nf_conn *ct, int type)
 {
 	struct nf_conn_acct *acct = nf_conn_acct_find(ct);
+	struct nlattr *nla_acct;
 
 	if (acct == NULL)
 		return 0;
 
+	/* counters are not nested in CTA_ACCT for backward compatibility */
 	if (dump_counters(skb, acct, IP_CT_DIR_ORIGINAL, type) < 0)
 		return -1;
 	if (dump_counters(skb, acct, IP_CT_DIR_REPLY, type) < 0)
 		return -1;
 
+	nla_acct = nla_nest_start(skb, CTA_ACCT | NLA_F_NESTED);
+	if (!nla_acct)
+		return -1;
+
+	if (nla_put_be32(skb, CTA_ACCT_INDEV, htonl(acct->indev)) ||
+		nla_put_be32(skb, CTA_ACCT_OUTDEV, htonl(acct->outdev)))
+		return -1;
+
+	nla_nest_end(skb, nla_acct);
+
 	return 0;
 }
 
@@ -542,6 +554,8 @@  ctnetlink_acct_size(const struct nf_conn
 	return 2 * nla_total_size(0) /* CTA_COUNTERS_ORIG|REPL */
 	       + 2 * nla_total_size(sizeof(uint64_t)) /* CTA_COUNTERS_PACKETS */
 	       + 2 * nla_total_size(sizeof(uint64_t)) /* CTA_COUNTERS_BYTES */
+		   + nla_total_size(0)					  /* CTA_ACCT */
+	       + 2 * nla_total_size(sizeof(uint32_t)) /* CTA_(IN|OUT)DEV */
 	       ;
 }