Message ID | 20130926154005.592908761@eitzenberger.org |
---|---|
State | RFC |
Headers | show |
Hi Holger, I like patches 1/3 and 2/3, they are nice cleanups. Some comments regarding this patch. On Thu, Sep 26, 2013 at 05:31:53PM +0200, Holger Eitzenberger wrote: > The interface indices are exported as uint32_t, although being > signed integer inside the kernel, which goes in line with > what nfnetlink_queue does. > > Both interface indices are wrapped inside CTA_ACCT. > > Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> > > Index: net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h > =================================================================== > --- net-next-ipfix.orig/include/net/netfilter/nf_conntrack_acct.h > +++ net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h > @@ -21,6 +21,8 @@ struct nf_conn_counter { > > struct nf_conn_acct { > struct nf_conn_counter counter[IP_CT_DIR_MAX]; > + int indev; > + int outdev; > }; > > static inline > Index: net-next-ipfix/net/netfilter/nf_conntrack_core.c > =================================================================== > --- net-next-ipfix.orig/net/netfilter/nf_conntrack_core.c > +++ net-next-ipfix/net/netfilter/nf_conntrack_core.c > @@ -33,6 +33,7 @@ > #include <linux/mm.h> > #include <linux/nsproxy.h> > #include <linux/rculist_nulls.h> > +#include <net/dst.h> > > #include <net/netfilter/nf_conntrack.h> > #include <net/netfilter/nf_conntrack_l3proto.h> > @@ -1110,6 +1111,7 @@ void __nf_ct_refresh_acct(struct nf_conn > acct: > if (do_acct) { > struct nf_conn_acct *acct; > + struct dst_entry *dst; > > acct = nf_conn_acct_find(ct); > if (acct) { > @@ -1117,6 +1119,13 @@ acct: > > atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets); > atomic64_add(skb->len, &counter[CTINFO2DIR(ctinfo)].bytes); > + > + if (acct->indev == 0 && skb->dev) > + acct->indev = skb->dev->ifindex; > + > + dst = skb_dst(skb); > + if (acct->outdev == 0 && dst && dst->dev) > + acct->outdev = dst->dev->ifindex; If you only set indev/outdev once we can skip the conntrack extension by passing the skb to nf_ct_deliver_cached_events and include this information in the conntrack events. That would not allow to dump the device from conntrack dumps though. I still have concerns with this approach as this doesn't seem to cover the scenario in which the in/outdev changes. Regards. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Pablo, > I like patches 1/3 and 2/3, they are nice cleanups. thanks for looking into this. > If you only set indev/outdev once we can skip the conntrack extension > by passing the skb to nf_ct_deliver_cached_events and include this > information in the conntrack events. That would not allow to dump the > device from conntrack dumps though. I still have concerns with this > approach as this doesn't seem to cover the scenario in which the > in/outdev changes. I know that doing it this simiple way is only "best effort", as e. g. with IP multipathing or 802.3ad this information is not % correct in all cases. And the question we have to answer is whether this interface information *has* to be correct in every case, even the less commonly used cases. For IPFIX I would answer this question with a 'no'. And we can later extend this to update the interface information correctly in every case. It's only a few patches away. /Holger -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 17, 2013 at 01:33:45PM +0200, Holger Eitzenberger wrote: > Hi Pablo, > > > I like patches 1/3 and 2/3, they are nice cleanups. > > thanks for looking into this. I'm going to apply 1/3 and 2/3 with some small glitches, I would like not to lose these cleanups. > > If you only set indev/outdev once we can skip the conntrack extension > > by passing the skb to nf_ct_deliver_cached_events and include this > > information in the conntrack events. That would not allow to dump the > > device from conntrack dumps though. I still have concerns with this > > approach as this doesn't seem to cover the scenario in which the > > in/outdev changes. > > I know that doing it this simiple way is only "best effort", as e. g. > with IP multipathing or 802.3ad this information is not % correct > in all cases. > > And the question we have to answer is whether this interface > information *has* to be correct in every case, even the less commonly > used cases. > > For IPFIX I would answer this question with a 'no'. > > And we can later extend this to update the interface information > correctly in every case. It's only a few patches away. My suggestion is to rework patch 3/3 to pass the interface information to nf_ct_deliver_cached_events via nf_ct_event struct, then include it in the event message. Thus, we don't need to increase the size the conntrack. The downside of this approach is that we cannot retrieve the interface via dump operation, but I think it should be enough for IPFIX. This feature should be disabled by default, so please add a /proc switch to enable/disable it in runtime. Thanks. -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h =================================================================== --- net-next-ipfix.orig/include/net/netfilter/nf_conntrack_acct.h +++ net-next-ipfix/include/net/netfilter/nf_conntrack_acct.h @@ -21,6 +21,8 @@ struct nf_conn_counter { struct nf_conn_acct { struct nf_conn_counter counter[IP_CT_DIR_MAX]; + int indev; + int outdev; }; static inline Index: net-next-ipfix/net/netfilter/nf_conntrack_core.c =================================================================== --- net-next-ipfix.orig/net/netfilter/nf_conntrack_core.c +++ net-next-ipfix/net/netfilter/nf_conntrack_core.c @@ -33,6 +33,7 @@ #include <linux/mm.h> #include <linux/nsproxy.h> #include <linux/rculist_nulls.h> +#include <net/dst.h> #include <net/netfilter/nf_conntrack.h> #include <net/netfilter/nf_conntrack_l3proto.h> @@ -1110,6 +1111,7 @@ void __nf_ct_refresh_acct(struct nf_conn acct: if (do_acct) { struct nf_conn_acct *acct; + struct dst_entry *dst; acct = nf_conn_acct_find(ct); if (acct) { @@ -1117,6 +1119,13 @@ acct: atomic64_inc(&counter[CTINFO2DIR(ctinfo)].packets); atomic64_add(skb->len, &counter[CTINFO2DIR(ctinfo)].bytes); + + if (acct->indev == 0 && skb->dev) + acct->indev = skb->dev->ifindex; + + dst = skb_dst(skb); + if (acct->outdev == 0 && dst && dst->dev) + acct->outdev = dst->dev->ifindex; } } } Index: net-next-ipfix/include/uapi/linux/netfilter/nfnetlink_conntrack.h =================================================================== --- net-next-ipfix.orig/include/uapi/linux/netfilter/nfnetlink_conntrack.h +++ net-next-ipfix/include/uapi/linux/netfilter/nfnetlink_conntrack.h @@ -53,6 +53,7 @@ enum ctattr_type { CTA_MARK_MASK, CTA_LABELS, CTA_LABELS_MASK, + CTA_ACCT, __CTA_MAX }; #define CTA_MAX (__CTA_MAX - 1) @@ -138,6 +139,14 @@ enum ctattr_counters { }; #define CTA_COUNTERS_MAX (__CTA_COUNTERS_MAX - 1) +enum ctattr_acct { + CTA_ACCT_UNSPEC, + CTA_ACCT_INDEV, + CTA_ACCT_OUTDEV, + __CTA_ACCT_MAX +}; +#define CTA_ACCT_MAX (__CTA_ACCT_MAX - 1) + enum ctattr_tstamp { CTA_TIMESTAMP_UNSPEC, CTA_TIMESTAMP_START, Index: net-next-ipfix/net/netfilter/nf_conntrack_netlink.c =================================================================== --- net-next-ipfix.orig/net/netfilter/nf_conntrack_netlink.c +++ net-next-ipfix/net/netfilter/nf_conntrack_netlink.c @@ -248,15 +248,27 @@ static int ctnetlink_dump_acct(struct sk_buff *skb, const struct nf_conn *ct, int type) { struct nf_conn_acct *acct = nf_conn_acct_find(ct); + struct nlattr *nla_acct; if (acct == NULL) return 0; + /* counters are not nested in CTA_ACCT for backward compatibility */ if (dump_counters(skb, acct, IP_CT_DIR_ORIGINAL, type) < 0) return -1; if (dump_counters(skb, acct, IP_CT_DIR_REPLY, type) < 0) return -1; + nla_acct = nla_nest_start(skb, CTA_ACCT | NLA_F_NESTED); + if (!nla_acct) + return -1; + + if (nla_put_be32(skb, CTA_ACCT_INDEV, htonl(acct->indev)) || + nla_put_be32(skb, CTA_ACCT_OUTDEV, htonl(acct->outdev))) + return -1; + + nla_nest_end(skb, nla_acct); + return 0; } @@ -542,6 +554,8 @@ ctnetlink_acct_size(const struct nf_conn return 2 * nla_total_size(0) /* CTA_COUNTERS_ORIG|REPL */ + 2 * nla_total_size(sizeof(uint64_t)) /* CTA_COUNTERS_PACKETS */ + 2 * nla_total_size(sizeof(uint64_t)) /* CTA_COUNTERS_BYTES */ + + nla_total_size(0) /* CTA_ACCT */ + + 2 * nla_total_size(sizeof(uint32_t)) /* CTA_(IN|OUT)DEV */ ; }
The interface indices are exported as uint32_t, although being signed integer inside the kernel, which goes in line with what nfnetlink_queue does. Both interface indices are wrapped inside CTA_ACCT. Signed-off-by: Holger Eitzenberger <holger@eitzenberger.org> -- To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html