Message ID | 512BC79F.1070708@googlemail.com |
---|---|
State | Superseded |
Hi Michael,

On Mon, Feb 25, 2013 at 08:20:47PM +0000, Michael Zintakis wrote:
[...]
> I've given up on my initial idea, which was to create this custom
> formatting (as well as object creation) at the point where the first
> iptables statement is created for a particular nfacct object, so I
> adopted a "plan b", where everything is done via the "nfacct"
> executable.

Thanks for the explanation. I think that, for most users, something like:

nfacct list MiB

would be just fine, so all counters will be displayed using the formatting (MiB in the example case) that has been requested.

I'm still missing why different formatting according to the accounting object can be useful.

Regards.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso wrote:
> Thanks for the explanation.

No problem.

> I think that, for most users, something
> like:
>
> nfacct list MiB

I can't speak for other people (it would be very foolish of me to do so on this occasion), but judging from our own needs and experience, traffic differs widely in both type and volume. One cannot simply shoe-horn all traffic under a single denominator and say "that's it" - it doesn't work like that.

> I'm still missing why different formatting according to the accounting
> object can be useful.

OK, I tried to explain this in my previous post, but if it wasn't clear I'll expand a bit further.

Different types of traffic, by their very nature, have different volume requirements. At the "low" end, we have DNS and authentication-type traffic (think RADIUS for example), where the denomination needs to be pretty "low" - in the KiB or even "plain bytes" range.

At the other end of that scale you have a much higher volume of traffic (think HD video streaming, or private customers running their own PBXs and taking video/voice calls in their thousands), where the denomination needs to be much higher - in the GiB or even TiB range in some circumstances. Not to mention that we have our own internal measurements, where we combine the total traffic counters of whole subnets and the denomination goes much, much higher than "GiB".

On top of all that, you have traffic which can be quite unpredictable (think someone running, or connecting to, a private VPN server for example), hence the need for a "dynamic" denomination, depending on the volume of that traffic, which is what I implemented with the "iec" and "si" options.

Not to mention that in your example above, the chosen measurement (MiB) would also apply to packet counters - that isn't very appropriate, since packet counts are much lower (by orders of magnitude!) than the corresponding byte counts.

One cannot simply brush this aside, design a one-size-fits-all measurement and apply it. We've had this problem with the "old" iptables accounting and it is one of the reasons we moved on from that, because it simply wasn't flexible enough. What I did with nfacct provides flexibility - it can be configured to fit quite a variety of scenarios and individual needs.

I hope I've explained myself a bit better this time.

MZ
On Tue, Feb 26, 2013 at 07:23:16PM +0000, Michael Zintakis wrote:
[...]
> Different types of traffic, by their very nature, have different
> volume requirements. At the "low" end, we have DNS and
> authentication-type traffic (think RADIUS for example), where the
> denomination needs to be pretty "low" - in KiB or even "plain bytes"
> range.

I see. Then my new proposal is to add a new automagic function to round the output to the most expressive measure, would be somehow similar to xtables_print_num:

http://git.netfilter.org/cgi-bin/gitweb.cgi?p=iptables.git;a=blob;f=libxtables/xtables.c;h=009ab9115f6fd687a762a2552f89ac0b81ee1a42;hb=HEAD#l1915

Would that fit into your needs?

Regards.
Pablo Neira Ayuso wrote:
> I see. Then my new proposal is to add a new automagic function to
> round the output to the most expressive measure, would be somehow
> similar to xtables_print_num:
>
> http://git.netfilter.org/cgi-bin/gitweb.cgi?p=iptables.git;a=blob;f=libxtables/xtables.c;h=009ab9115f6fd687a762a2552f89ac0b81ee1a42;hb=HEAD#l1915

I might be seeing this wrong, and if so I apologize, but is this the same as (or similar to) the function which exists in the "old" iptables accounting, the one used for the packet/byte counters when iptables -L -vn is executed? If so, that isn't very appropriate, as I indicated in my previous posting.

Pablo, do you think there is something wrong with the "iec" and "si" options already in place? If you think that I've done something wrong, please let me know, because this was one of the reasons for placing the changes (and including the patches) in the code I attached before. I would gladly benefit from feedback on that code.

> Would that fit into your needs?

Short answer: no, not really.

As I already posted, the "iec" and "si" options deal with the two numbering standards (IEC and the "old" SI), have three decimal places of resolution and, most importantly in this case, were put in place to cover traffic of unpredictable or unknown volume. Going from our own experience, this covers about 20% of the traffic we measure.

For the vast majority of all other traffic, we "lock" the denominator and use the appropriate format ("kib", "mib" etc). This is so that if that traffic is different from what we expected to see, this is instantly reflected in the numbers and is immediately flagged for further analysis.

Let me illustrate this with a small example: if we use the "3pl,mib" format options for a specific type of traffic and we start getting byte counts like "140,666,825.688MiB" (in other words, over 134TiB), then this is instantly flagged for analysis, to find out why that traffic exceeded our pre-determined "expectation" of being in the "MiB" range, jumped two ranges and got into "TiB" territory.

The packet count is also a different matter. Even though in the vast majority of cases we use the "3pl" format, this is by no means set in stone, so the packet count format would also need to be configurable for each traffic measure (i.e. nfacct object).

The "old" SI-type options ("kb", "mb" and so on) were also put there for a reason - we still have people who are used to this measure, so it is more convenient for them to have this range of options available.

I hope I have explained this clearly; let me know if this is not the case.

MZ
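The difference between a "locked" denominator and a dynamic one can be sketched in a few lines. This is only an illustration under stated assumptions, not nfacct's code: the function names `format_locked` and `format_dynamic` are hypothetical, and the rounding behaviour of the real tool is not known from this thread.

```python
# Illustrative only: not nfacct's implementation. Unit names follow IEC.
IEC_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def format_locked(nbytes: int, unit: str) -> str:
    """Express a byte count in one fixed IEC unit, with thousands
    separators and three decimals (in the spirit of "3pl,mib")."""
    power = IEC_UNITS.index(unit)
    return f"{nbytes / 1024 ** power:,.3f}{unit}"


def format_dynamic(nbytes: int) -> str:
    """Scale into the first IEC unit in which the value drops below 1024
    (in the spirit of the "iec" option), capped at EiB."""
    value, power = float(nbytes), 0
    while value >= 1024 and power < len(IEC_UNITS) - 1:
        value /= 1024
        power += 1
    return f"{value:.3f}{IEC_UNITS[power]}"


if __name__ == "__main__":
    six_tib = 6 * 1024 ** 4
    print(format_locked(six_tib, "MiB"))  # locked unit: the jump stands out
    print(format_dynamic(six_tib))        # dynamic unit: scales quietly
```

With a locked unit, an unexpectedly large counter shows up as an unusually long number, which is the "instant flag" effect described above; the dynamic variant hides that jump by changing the unit instead.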
Hello Pablo and all,

> Pablo Neira Ayuso wrote:
>> Would that fit into your needs?
> Short answer: no, not really.

In connection with this subject, I wanted to let you know that I have made quite a lot of changes, which I will try to describe below.

We had an internal team gathering almost 3 weeks ago and started planning changes to nfacct in order to make it more useful and more functional. This was also done with a view to a presentation to all major stakeholders of the company, which had been planned previously and finally took place 2 days ago (Thursday). During that presentation we demonstrated our new and improved capability (utilizing the new and improved nfacct was part of that, of course!).

I am glad to let you know that everything was very well received and, after fine-tuning my work, I am going to submit 3 patches to this community very shortly, with the changes I've made to the nfacct system. They are quite extensive, and the nfacct executable in particular was almost completely rewritten. I also found a few bugs, which I fixed. The new changes follow (I will also include printouts to make things clearer).

The only change I have made to the kernel code since my last posting is the introduction of another property called 'bytes threshold' (a 64-bit number). Its main purpose is to let us register 'an expectation' of the traffic passing through a given accounting object; if this threshold is exceeded (in other words, if bytes count > threshold), this is visually indicated by the 'list' and 'get' commands.
In other words:

[root@27_13 ~]# nfacct list
[ pkts = 7.260GiB     bytes = 6.817TiB+   ] = "ALL 27 net"
[ pkts = 296,615,264  bytes = 21.750GiB   ] = " IN web;streaming"
[ pkts = 533,035,424  bytes = 721.382GiB  ] = "OUT web;streaming"
[ pkts = 263,548,272  bytes = 236.012GiB+ ] = "ALL misc"
[ pkts = 12,852,909   bytes = 11.510GiB   ] = "ALL private"
[ pkts = 942,885      bytes = 864.635MiB  ] = "ALL sec;audit"

As we see above, the plus sign (+) next to the bytes count indicates that the registered threshold for this accounting object has been exceeded (enabling such a threshold is, of course, entirely optional). The actual threshold value can be shown with a new option of the 'list' and 'get' commands (called 'show'), with which I can specify what columns to view. In other words:

[root@27_13 shorewall]# nfacct list show bytes
[ bytes = 6.817TiB+   ] = "ALL 27 net"
[ bytes = 21.750GiB   ] = " IN web;streaming"
[ bytes = 721.382GiB  ] = "OUT web;streaming"
[ bytes = 236.012GiB+ ] = "ALL misc"
[ bytes = 11.510GiB   ] = "ALL private"
[ bytes = 864.635MiB  ] = "ALL sec;audit"

As we can see, with the above I am shown only the name and bytes columns.

[root@27_13 ~]# nfacct list show extended
[ pkts = 7.260GiB     bytes = 6.817TiB+   thr = 6.000TiB   ] = "ALL 27 net"
[ pkts = 296,615,264  bytes = 21.750GiB   thr = -          ] = " IN web;streaming"
[ pkts = 533,035,424  bytes = 721.382GiB  thr = -          ] = "OUT web;streaming"
[ pkts = 263,548,272  bytes = 236.012GiB+ thr = 200.000GiB ] = "ALL misc"
[ pkts = 12,852,909   bytes = 11.510GiB   thr = 50.000GiB  ] = "ALL private"
[ pkts = 942,885      bytes = 864.635MiB  thr = -          ] = "ALL sec;audit"

As we can see now, by selecting a different 'show' option ('extended' in this case), different properties are shown (I am now shown all properties - packet and byte counters, as well as the threshold values and threshold-exceeded indicator, plus accounting object names).
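The threshold indicator described above boils down to a single comparison. The following is a minimal sketch, not the patch's code: `threshold_marker` is a hypothetical name, and treating a threshold of 0 as "no threshold registered" is an assumption made here for illustration.

```python
GIB = 1024 ** 3


def threshold_marker(byte_count: int, threshold: int) -> str:
    """Return "+" when an enabled threshold has been exceeded, else "".

    Assumption: a threshold of 0 means no threshold is registered.
    """
    return "+" if threshold > 0 and byte_count > threshold else ""


if __name__ == "__main__":
    # Sample values loosely modelled on the listing above.
    samples = [
        ("ALL misc", 236 * GIB, 200 * GIB),    # over its threshold
        ("ALL private", 11 * GIB, 50 * GIB),   # under its threshold
        (" IN web;streaming", 21 * GIB, 0),    # no threshold registered
    ]
    for name, nbytes, thr in samples:
        print(f'{nbytes}{threshold_marker(nbytes, thr)} = "{name}"')
```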
Another good feature is that all column widths are now adjusted 'automatically' by nfacct (libnetfilter_acct plays a major part in this), so that we don't get an excessive amount of space on the user's screen or numbers displayed like 00000000000000001234, which was a bit ugly to say the least.

Coming back to the 'bytes threshold': from the last example above we can see that for the "ALL 27 net" and "ALL misc" accounting objects, the thresholds of 6TiB and 200GiB respectively have been exceeded, and that is indicated by the "+" sign next to the bytes counter.

You will also notice that all accounting object names, if they contain 'odd' symbols, are now encoded and shown in quotes. This fixes one of many bugs I found while improving nfacct - if a name contained any of these characters, restore failed. With the current improvements, this is all gone. A related bug was that not all data was properly encoded when the 'xml' output parameter was used - characters which are non-conformant to the XML specification (like '>' or '&' for example) were output as they were. But enough about bad bugs...

The formatting of objects can now be overridden by the 'list' and 'get' commands too. The formatting of the numbers of all accounting objects in the above example is 'natural' to the accounting objects themselves, but this can be changed.
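The XML encoding problem mentioned above is the usual one of writing names into markup without escaping them. A small sketch of the general fix, using Python's standard library: the element names below are illustrative and not necessarily the schema nfacct emits, and `object_to_xml` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def object_to_xml(name: str, pkts: int, nbytes: int) -> str:
    """Render one accounting object as XML, escaping the name.

    Element names are illustrative, not necessarily nfacct's schema.
    """
    return (
        f"<obj><name>{escape(name)}</name>"
        f"<pkts>{pkts}</pkts><bytes>{nbytes}</bytes></obj>"
    )


if __name__ == "__main__":
    xml_text = object_to_xml("OUT web>cdn & misc", 1, 2)
    print(xml_text)
    # A conforming parser accepts the output and recovers the name.
    print(ET.fromstring(xml_text).find("name").text)
```

Without the `escape()` call, the bare '&' in the name would make the document unparsable.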
In other words:

[root@27_13 ~]# nfacct list show extended format raw
[ pkts = 7795058176  bytes = 7495370670080+ thr = 6597069766656 ] = "ALL 27 net"
[ pkts = 296615264   bytes = 23353884672    thr = -             ] = " IN web;streaming"
[ pkts = 533035424   bytes = 774578044928   thr = -             ] = "OUT web;streaming"
[ pkts = 263548272   bytes = 253415948288+  thr = 214748364800  ] = "ALL misc"
[ pkts = 12852909    bytes = 12358768640    thr = 53687091200   ] = "ALL private"
[ pkts = 942885      bytes = 906635520      thr = -             ] = "ALL sec;audit"

With the above, I asked the 'list' command to show me unformatted values ('raw' was the format used, but I can select any formatting option I choose - I now have complete freedom).

Perhaps the major administrative issue is resolved by the new 'save' and 'restore' commands. The previous 'restore' command wasn't working, and it took its input from the output of the 'list' command. This was ugly (a bit like trying to do iptables-restore from 'iptables -L'). The new 'save' command now produces output on stdout in a form completely suitable for the new 'restore' command. In other words:

[root@27_13 ~]# nfacct save
"ALL 27 net" iec,tib 7795057933 7495370766549 6597069766656
" IN web;streaming" 3pl,gib 296615255 23353884672 0
"OUT web;streaming" 3pl,gib 533035414 774578024481 0
"ALL misc" 3pl,gib 263548277 253415955366 214748364800
"ALL private" 3pl,gib 12852909 12358768394 53687091200
"ALL sec;audit" 3pl,mib 942885 906635509 0

As we can see, this can be safely redirected to a file and then used with the new 'nfacct restore'.

The 'restore' command also had a lot of changes. The best improvement is that it now allows all accounting objects to be restored regardless of whether they are used by iptables or not. This was not possible before.
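To show why the 'save' output is easier to consume than the 'list' output, here is a sketch of a reader for it. The field order (name, format, packets, bytes, threshold) is inferred from comparing the 'save' printout with the 'list' output above and is an assumption, as is the use of shell-style quoting; how the real tool encodes special characters inside names is not shown in this message. `SavedObject` and `parse_save_line` are hypothetical names.

```python
import shlex
from dataclasses import dataclass


@dataclass
class SavedObject:
    name: str
    fmt: str        # e.g. "3pl,gib"
    pkts: int
    nbytes: int
    threshold: int  # 0 is assumed to mean "no threshold"


def parse_save_line(line: str) -> SavedObject:
    """Split one 'nfacct save' line into its (assumed) five fields."""
    fields = shlex.split(line)
    if len(fields) != 5:
        raise ValueError(f"expected 5 fields, got {len(fields)}: {line!r}")
    name, fmt, pkts, nbytes, threshold = fields
    return SavedObject(name, fmt, int(pkts), int(nbytes), int(threshold))


if __name__ == "__main__":
    line = '"ALL misc" 3pl,gib 263548277 253415955366 214748364800'
    print(parse_save_line(line))
```

Because the name is quoted and every other field is a plain token, leading spaces and characters such as ';' in names survive the round trip, which is what the 'list' output could not guarantee.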
The 'restore' command takes two additional parameters. 'flush' makes sure that the accounting table is flushed first (though objects used by iptables are still not deleted), and 'replace' makes sure that accounting object properties are replaced if the objects already exist in the accounting table. The latter option can modify object properties even if these are in use/locked by iptables. The 'add' and 'get' commands have similar options, allowing accounting object properties to be modified at will. That was not possible before.

So, with the new 'save' and 'restore' nfacct commands, a full and complete restoration of all accounting objects is now possible.

I will list the detailed changes I've made for each nfacct component (kernel, libnetfilter_acct and nfacct) in the patches I will submit shortly. For full information about the new and improved features, there is an almost completely rewritten man page, but I am listing the output of the 'help' command, which shows very briefly all the options currently available.

The nfacct executable now has the following options (from the improved 'nfacct help' command):

nfacct v1.0.1: utility for the Netfilter extended accounting infrastructure
Usage: nfacct command [parameters]...

Commands:
  list LST_PARAMS      List the accounting object table
  add NAME ADD_PARAMS  Add new accounting object NAME to table
  delete NAME          Delete existing accounting object NAME
  get NAME GET_PARAMS  Get and list existing accounting object NAME
  flush                Flush accounting object table
  save                 Dump current accounting object table to stdout
  restore RST_PARAMS   Restore accounting object table from stdin
  version              Display version and disclaimer
  help                 Display this help message

Parameters:
  LST_PARAMS := [ reset ] [ show SHOW_SPEC ] [ format FMT_SPEC ] [ xml ]
  ADD_PARAMS := [ replace ] [ format FMT_SPEC ] [ threshold NUMBER ]
  GET_PARAMS := [ reset ] [ show SHOW_SPEC ] [ format FMT_SPEC ] [ xml ]
  RST_PARAMS := [ flush ] [ replace ]
  SHOW_SPEC := { bytes | extended }
  FMT_SPEC := { [FMT] | [,] | [FMT] ... }
  FMT := { def | raw | 3pl | iec | kib | mib | gib | tib | pib | eib |
           si | kb | mb | gb | tb | pb | eb }

After all this, I do have a question: can the kernel part ever be unable to update the accounting object counters and, if so, in what circumstances and how likely is this to happen? It is important for us to know. This is one question I was asked and I didn't really know the answer. Looking at the kernel code I couldn't find anything which could prevent the update from happening, but I thought I'd ask here anyway.

MZ
Hello Pablo,

Michael Zintakis wrote:
> Pablo Neira Ayuso wrote:
>> I see. Then my new proposal is to add a new automagic function to
>> round the output to the most expressive measure, would be somehow
>> similar to xtables_print_num:
>>
>> http://git.netfilter.org/cgi-bin/gitweb.cgi?p=iptables.git;a=blob;f=libxtables/xtables.c;h=009ab9115f6fd687a762a2552f89ac0b81ee1a42;hb=HEAD#l1915

Something we've discovered with regards to the nfacct match recently. If I have the following iptables statement:

iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>

The above always updates the "nfacct_obj" byte and packet counters, regardless of whether "match2" and "match3" actually match. However, if we have:

iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>

then "nfacct_obj" counters are updated only when "match2" is satisfied, but if we have:

iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>

then "nfacct_obj" counters are updated when both match2 and match3 are matched (which was the initial intention).

This inconsistency stems from the fact that the nfacct match in the kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of how iptables evaluates matches: it does so from left to right.

Since there isn't a callback in the xt_match struct which is called after ALL matches have been satisfied (xt_match.match is called for each registered match in that statement), this causes the nfacct counters to be updated (or not) depending on the position of the nfacct match.

What I have done locally is to add a separate callback (I called it "matched") which is called for all matches after all such matches in a particular statement have been satisfied, but that obviously will break lots of code depending on the old xt_match struct if such an approach is adopted.

My question is: is there a more elegant solution to do this?

Thanks.

MZ
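The position dependence described in this message can be reproduced with a toy model. This is not kernel code: it only assumes what the message states, namely that matches are evaluated left to right, that evaluation stops at the first failing match, and that the nfacct match counts the packet and always returns true. All names below are hypothetical.

```python
from collections import Counter


def nfacct(name, counters):
    """Toy nfacct match: count the packet, then always report a match."""
    def match(packet_len):
        counters[name + ".pkts"] += 1
        counters[name + ".bytes"] += packet_len
        return True
    return match


def rule_matches(matches, packet_len):
    """Evaluate matches left to right, stopping at the first failure."""
    return all(m(packet_len) for m in matches)


def count_after(order, packet_len=100):
    """Run one packet through a rule built from `order` and return the
    packet count seen by the accounting object "acct"."""
    counters = Counter()
    available = {
        "nfacct": nfacct("acct", counters),
        "match2": lambda _len: True,   # assumed to match
        "match3": lambda _len: False,  # assumed NOT to match
    }
    rule_matches([available[n] for n in order], packet_len)
    return counters["acct.pkts"]


if __name__ == "__main__":
    for order in (["nfacct", "match2", "match3"],
                  ["match2", "nfacct", "match3"],
                  ["match2", "match3", "nfacct"]):
        print(order, "->", count_after(order))
```

In this model the rule as a whole never matches (match3 fails), yet the packet is counted in the first two orderings and not in the third, which mirrors the three iptables statements above.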
On Thu, 4 Apr 2013, Michael Zintakis wrote:

> Michael Zintakis wrote:
> > Pablo Neira Ayuso wrote:
> >> I see. Then my new proposal is to add a new automagic function to
> >> round the output to the most expressive measure, would be somehow
> >> similar to xtables_print_num:
> >>
> >> http://git.netfilter.org/cgi-bin/gitweb.cgi?p=iptables.git;a=blob;f=libxtables/xtables.c;h=009ab9115f6fd687a762a2552f89ac0b81ee1a42;hb=HEAD#l1915
> Something we've discovered with regards to the nfacct match recently. If
> I have the following iptables statement:
>
> iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>
>
> The above always updates the "nfacct_obj" byte and packet counters,
> regardless of whether "match2" and "match3" actually match. However,
> if we have:
>
> iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>
>
> then "nfacct_obj" counters are updated only when "match2" is satisfied,
> but if we have:
>
> iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>
>
> then "nfacct_obj" counters are updated when both match2 and match3 are
> matched (which was the initial intention).
>
> This inconsistency stems from the fact that the nfacct match in the
> kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of
> how iptables evaluates matches: it does so from left to right.
>
> Since there isn't a callback in the xt_match struct which is called
> after ALL matches have been satisfied (xt_match.match is called for each
> registered match in that statement), this causes the nfacct counters to
> be updated (or not) depending on the position of the nfacct match.
>
> What I have done locally is to add a separate callback (I called it
> "matched") which is called for all matches after all such matches in a
> particular statement have been satisfied, but that obviously will break
> lots of code depending on the old xt_match struct if such approach is
> adopted. My question is: is there a more elegant solution to do this?

In my opinion this is not inconsistency at all, but the intended behaviour. So I don't see any reason to add such a hack to override it.

What prevents you from entering the matches in the order you want them to be evaluated?

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary
Hello Jozsef,

Jozsef Kadlecsik wrote:
> On Thu, 4 Apr 2013, Michael Zintakis wrote:
>> Something we've discovered with regards to the nfacct match recently. If
>> I have the following iptables statement:
>>
>> iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>
>>
>> The above always updates the "nfacct_obj" byte and packet counters,
>> regardless of whether "match2" and "match3" actually match. However,
>> if we have:
>>
>> iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>
>>
>> then "nfacct_obj" counters are updated only when "match2" is satisfied,
>> but if we have:
>>
>> iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>
>>
>> then "nfacct_obj" counters are updated when both match2 and match3 are
>> matched (which was the initial intention).
>>
>> This inconsistency stems from the fact that the nfacct match in the
>> kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of
>> how iptables evaluates matches: it does so from left to right.
>>
>> Since there isn't a callback in the xt_match struct which is called
>> after ALL matches have been satisfied (xt_match.match is called for each
>> registered match in that statement), this causes the nfacct counters to
>> be updated (or not) depending on the position of the nfacct match.
>>
>> What I have done locally is to add a separate callback (I called it
>> "matched") which is called for all matches after all such matches in a
>> particular statement have been satisfied, but that obviously will break
>> lots of code depending on the old xt_match struct if such approach is
>> adopted. My question is: is there a more elegant solution to do this?
>
> In my opinion this is not inconsistency at all, but the intended
> behaviour. So I don't see any reason to add such a hack to override it.

I meant inconsistent in terms of the end result, which in the example above is packet/bytes counting.

That result is different depending on the order of the conditions (i.e. matches) attached to the iptables rule. With the 'old' accounting we didn't have that. In other words, with the old accounting we've had:

if (match1 && match2 && matchN) {
    do_packet_and_bytes_counting();
}

No matter how we arrange the order of match1, match2 and matchN, the end result is (or should be) the same. With the nfacct match that isn't the case, but that isn't nfacct match's fault, but I guess it is because of the way iptables is examining the matches.

We would have had the consistency (in other words, getting a consistent result regardless of the order of the various conditions/matches) if nfacct was a target, not a match, but I know that would be difficult (I already examined that possibility) since the x_tables target does not provide a 'destroy' method, so there isn't a way to track the 'refcnt' in the nfacct kernel struct, so inventing this method is just as ugly as the hack I did with the nfacct match above, so I thought to ask and see whether there is a better solution.

> What prevents you from entering the matches in the order you want them to
> be evaluated?

Nothing. Again, I am coming from the point of view of the 'old' accounting where I did not have that, so I didn't expect this change.
Hi Michael,

On Fri, 5 Apr 2013, Michael Zintakis wrote:

> Jozsef Kadlecsik wrote:
> > On Thu, 4 Apr 2013, Michael Zintakis wrote:
> >> Something we've discovered with regards to the nfacct match recently. If
> >> I have the following iptables statement:
> >>
> >> iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>
> >>
> >> The above always updates the "nfacct_obj" byte and packet counters,
> >> regardless of whether "match2" and "match3" actually match. However,
> >> if we have:
> >>
> >> iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>
> >>
> >> then "nfacct_obj" counters are updated only when "match2" is satisfied,
> >> but if we have:
> >>
> >> iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>
> >>
> >> then "nfacct_obj" counters are updated when both match2 and match3 are
> >> matched (which was the initial intention).
> >>
> >> This inconsistency stems from the fact that the nfacct match in the
> >> kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of
> >> how iptables evaluates matches: it does so from left to right.
> >>
> >> Since there isn't a callback in the xt_match struct which is called
> >> after ALL matches have been satisfied (xt_match.match is called for each
> >> registered match in that statement), this causes the nfacct counters to
> >> be updated (or not) depending on the position of the nfacct match.
> >>
> >> What I have done locally is to add a separate callback (I called it
> >> "matched") which is called for all matches after all such matches in a
> >> particular statement have been satisfied, but that obviously will break
> >> lots of code depending on the old xt_match struct if such approach is
> >> adopted. My question is: is there a more elegant solution to do this?
> >
> > In my opinion this is not inconsistency at all, but the intended
> > behaviour. So I don't see any reason to add such a hack to override it.
> I meant inconsistent in terms of the end result, which in the example
> above is packet/bytes counting.
>
> That result is different depending on the order of the conditions (i.e.
> matches) attached to the iptables rule. With the 'old' accounting we
> didn't have that. In other words, with the old accounting we've had:
>
> if (match1 && match2 && matchN) {
>     do_packet_and_bytes_counting();
> }
>
> No matter how we arrange the order of match1, match2 and matchN, the end
> result is (or should be) the same. With the nfacct match that isn't the
> case, but that isn't nfacct match's fault, but I guess it is because of
> the way iptables is examining the matches.

Yes, exactly. And actually it supports rules like this:

iptables -A INPUT -m <match0> -m nfacct --nfacct acct0 \
                  -m <match1> -m nfacct --nfacct acct1 \
                  ...

Also, this is a new accounting method, which is just not the same as the old one.

> We would have had the consistency (in other words, getting a consistent
> result regardless of the order of the various conditions/matches) if
> nfacct was a target, not a match, but I know that would be difficult (I
> already examined that possibility) since the x_tables target does not
> provide a 'destroy' method, so there isn't a way to track the 'refcnt'
> in the nfacct kernel struct, so inventing this method is just as
> ugly as the hack I did with the nfacct match above, so I thought to ask
> and see whether there is a better solution.

Targets do have a destroy method.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary
Michael Zintakis wrote:
> We would have had the consistency (in other words, getting a consistent result regardless of the order of the various conditions/matches) if nfacct was a target, not a match, but I know that would be difficult (I already examined that possibility) since the x_tables target does not provide a 'destroy' method, so there isn't a way to track the 'refcnt' in the nfacct kernel struct, so inventing this method is just as ugly as the hack I did with the nfacct match above, so I thought to ask and see whether there is a better solution.
It looks as though I was wrong - I must have been blind when I looked in the x_tables header file!
There is a destroy method as part of mt_target. So if I 'reform' the nfacct match and make it a target, then I guess that whole 'inconsistency' thing will disappear since I could now use something like:
iptables -A INPUT -m match1 -m match2 -j NFACCT --nfacct <nfacct_obj>
and regardless of the order of match1 and match2, the result will be the same. Am I correct, or is there something very wrong?
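The order independence hoped for here can be illustrated with a toy model in which counting happens in a target that runs only after every match has accepted the packet. This is a sketch of the idea, not of the x_tables API: `run_rule`, `bytes_counted` and the two sample matches are hypothetical names, and the NFACCT target in the command above is a proposal, not existing code.

```python
from itertools import permutations


def run_rule(matches, target, packet_len):
    """Run the target only if every match accepts the packet."""
    if all(match(packet_len) for match in matches):
        target(packet_len)
        return True
    return False


def bytes_counted(matches, packets):
    """Total bytes accounted by a counting target for the given packets."""
    total = 0

    def nfacct_target(packet_len):
        nonlocal total
        total += packet_len

    for packet_len in packets:
        run_rule(matches, nfacct_target, packet_len)
    return total


def is_large(packet_len):
    return packet_len >= 500


def is_even(packet_len):
    return packet_len % 2 == 0


if __name__ == "__main__":
    packets = [40, 500, 501, 1500]
    for order in permutations([is_large, is_even]):
        names = [m.__name__ for m in order]
        print(names, "->", bytes_counted(list(order), packets))
```

Because the matches here have no side effects and the counting is done once, after the conjunction has been evaluated, every ordering of the matches yields the same total.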
Hello Jozsef,

Jozsef Kadlecsik wrote:
> Hi Michael,
>
> On Fri, 5 Apr 2013, Michael Zintakis wrote:
>
>> Jozsef Kadlecsik wrote:
>>> On Thu, 4 Apr 2013, Michael Zintakis wrote:
>>>> Something we've discovered with regards to the nfacct match recently. If
>>>> I have the following iptables statement:
>>>>
>>>> iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>
>>>>
>>>> The above always updates the "nfacct_obj" byte and packet counters,
>>>> regardless of whether "match2" and "match3" actually match. However,
>>>> if we have:
>>>>
>>>> iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>
>>>>
>>>> then "nfacct_obj" counters are updated only when "match2" is satisfied,
>>>> but if we have:
>>>>
>>>> iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>
>>>>
>>>> then "nfacct_obj" counters are updated when both match2 and match3 are
>>>> matched (which was the initial intention).
>>>>
>>>> This inconsistency stems from the fact that the nfacct match in the
>>>> kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of
>>>> how iptables evaluates matches: it does so from left to right.
>>>>
>>>> Since there isn't a callback in the xt_match struct which is called
>>>> after ALL matches have been satisfied (xt_match.match is called for each
>>>> registered match in that statement), this causes the nfacct counters to
>>>> be updated (or not) depending on the position of the nfacct match.
>>>>
>>>> What I have done locally is to add a separate callback (I called it
>>>> "matched") which is called for all matches after all such matches in a
>>>> particular statement have been satisfied, but that obviously will break
>>>> lots of code depending on the old xt_match struct if such an approach is
>>>> adopted. My question is: is there a more elegant solution to do this?
>>> In my opinion this is not an inconsistency at all, but the intended
>>> behaviour. So I don't see any reason to add such a hack to override it.
>> I meant inconsistent in terms of the end result, which in the example
>> above is packet/bytes counting.
>>
>> That result is different depending on the order of the conditions (i.e.
>> matches) attached to the iptables rule. With the 'old' accounting we
>> didn't have that. In other words, with the old accounting we had:
>>
>> if (match1 && match2 && matchN) {
>>     do_packet_and_bytes_counting();
>> }
>>
>> No matter how we arrange the order of match1, match2 and matchN, the end
>> result is (or should be) the same. With the nfacct match that isn't the
>> case, but that isn't nfacct match's fault, but I guess it is because of
>> the way iptables is examining the matches.
>
> Yes, exactly. And actually it supports rules like this:
>
> iptables -A INPUT -m <match0> -m nfacct --nfacct acct0 \
>          -m <match1> -m nfacct --nfacct acct1 \
>          ...

Hm, never thought of that, but I guess one learns something new every
day. Thanks Jozsef!

> Also, this is a new accounting method, which is just not the same as the
> old one.

Yes, I know, I wasn't disputing that - it is just that I am used to the
'old' accounting and when you've been using it for years it is not so
easy to 'detach' yourself from that.

>> We would have had the consistency (in other words, getting a consistent
>> result regardless of the order of the various conditions/matches) if
>> nfacct was a target, not a match, but I know that would be difficult (I
>> already examined that possibility) since the x_tables target does not
>> provide a 'destroy' method, so there isn't a way to track the 'refcnt'
>> in the nfacct kernel struct, so inventing this method is just as
>> ugly as the hack I did with the nfacct match above, so I thought to ask
>> and see whether there is a better solution.
>
> Targets do have a destroy method.

Haha, you are far too quick for me!

I just found that out - I don't know how I did not see it when I first
looked at it. I guess if I 'convert' nfacct to a target I could get that
'consistency', but I appreciate the new example you gave above, which I
have to admit is very useful indeed (one can hit two or more birds with
one stone so to speak).

Hi Michael,

On Fri, 5 Apr 2013, Michael Zintakis wrote:

> Jozsef Kadlecsik wrote:
> > On Fri, 5 Apr 2013, Michael Zintakis wrote:
> >
> >> Jozsef Kadlecsik wrote:
> >>> On Thu, 4 Apr 2013, Michael Zintakis wrote:
> >>>> Something we've discovered with regards to the nfacct match recently. If
> >>>> I have the following iptables statement:
> >>>>
> >>>> iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>
> >>>>
> >>>> The above always updates the "nfacct_obj" byte and packet counters,
> >>>> regardless of whether "match2" and "match3" actually match. However,
> >>>> if we have:
> >>>>
> >>>> iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>
> >>>>
> >>>> then "nfacct_obj" counters are updated only when "match2" is satisfied,
> >>>> but if we have:
> >>>>
> >>>> iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>
> >>>>
> >>>> then "nfacct_obj" counters are updated when both match2 and match3 are
> >>>> matched (which was the initial intention).
> >>>>
> >>>> This inconsistency stems from the fact that the nfacct match in the
> >>>> kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of
> >>>> how iptables evaluates matches: it does so from left to right.
> >>>>
> >>>> Since there isn't a callback in the xt_match struct which is called
> >>>> after ALL matches have been satisfied (xt_match.match is called for each
> >>>> registered match in that statement), this causes the nfacct counters to
> >>>> be updated (or not) depending on the position of the nfacct match.
> >>>>
> >>>> What I have done locally is to add a separate callback (I called it
> >>>> "matched") which is called for all matches after all such matches in a
> >>>> particular statement have been satisfied, but that obviously will break
> >>>> lots of code depending on the old xt_match struct if such an approach is
> >>>> adopted. My question is: is there a more elegant solution to do this?
> >>> In my opinion this is not an inconsistency at all, but the intended
> >>> behaviour. So I don't see any reason to add such a hack to override it.
> >> I meant inconsistent in terms of the end result, which in the example
> >> above is packet/bytes counting.
> >>
> >> That result is different depending on the order of the conditions (i.e.
> >> matches) attached to the iptables rule. With the 'old' accounting we
> >> didn't have that. In other words, with the old accounting we had:
> >>
> >> if (match1 && match2 && matchN) {
> >>     do_packet_and_bytes_counting();
> >> }
> >>
> >> No matter how we arrange the order of match1, match2 and matchN, the end
> >> result is (or should be) the same. With the nfacct match that isn't the
> >> case, but that isn't nfacct match's fault, but I guess it is because of
> >> the way iptables is examining the matches.
> >
> > Yes, exactly. And actually it supports rules like this:
> >
> > iptables -A INPUT -m <match0> -m nfacct --nfacct acct0 \
> >          -m <match1> -m nfacct --nfacct acct1 \
> >          ...
> Hm, never thought of that, but I guess one learns something new every
> day. Thanks Jozsef!
>
> > Also, this is a new accounting method, which is just not the same as the
> > old one.
> Yes, I know, I wasn't disputing that - it is just that I am used to the
> 'old' accounting and when you've been using it for years it is not so
> easy to 'detach' yourself from that.
>
> >> We would have had the consistency (in other words, getting a consistent
> >> result regardless of the order of the various conditions/matches) if
> >> nfacct was a target, not a match, but I know that would be difficult (I
> >> already examined that possibility) since the x_tables target does not
> >> provide a 'destroy' method, so there isn't a way to track the 'refcnt'
> >> in the nfacct kernel struct, so inventing this method is just as
> >> ugly as the hack I did with the nfacct match above, so I thought to ask
> >> and see whether there is a better solution.
> >
> > Targets do have a destroy method.
> Haha, you are far too quick for me!
>
> I just found that out - I don't know how I did not see it when I first
> looked at it. I guess if I 'convert' nfacct to a target I could get that
> 'consistency', but I appreciate the new example you gave above, which I
> have to admit is very useful indeed (one can hit two or more birds with
> one stone so to speak).

nfacct can't be converted to a target, because that would break backward
compatibility - it already exists as a match. The module could be
extended to play the role of a target as well, but that seems
unnecessary: there's no need to have a target in a rule, so in userspace
"-j NFACCT" could simply be replaced by "-m nfacct".

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.mta.hu
PGP key : http://www.kfki.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics, Hungarian Academy of Sciences
          H-1525 Budapest 114, POB. 49, Hungary

Hello Jozsef,

Jozsef Kadlecsik wrote:
> Hi Michael,
>
> On Fri, 5 Apr 2013, Michael Zintakis wrote:
>
>> Jozsef Kadlecsik wrote:
>>> On Fri, 5 Apr 2013, Michael Zintakis wrote:
>>>
>>>> Jozsef Kadlecsik wrote:
>>>>> On Thu, 4 Apr 2013, Michael Zintakis wrote:
>>>>>> Something we've discovered with regards to the nfacct match recently. If
>>>>>> I have the following iptables statement:
>>>>>>
>>>>>> iptables -A INPUT -m nfacct --nfacct <nfacct_obj> -m <match2> -m <match3>
>>>>>>
>>>>>> The above always updates the "nfacct_obj" byte and packet counters,
>>>>>> regardless of whether "match2" and "match3" actually match. However,
>>>>>> if we have:
>>>>>>
>>>>>> iptables -A INPUT -m <match2> -m nfacct --nfacct <nfacct_obj> -m <match3>
>>>>>>
>>>>>> then "nfacct_obj" counters are updated only when "match2" is satisfied,
>>>>>> but if we have:
>>>>>>
>>>>>> iptables -A INPUT -m <match2> -m <match3> -m nfacct --nfacct <nfacct_obj>
>>>>>>
>>>>>> then "nfacct_obj" counters are updated when both match2 and match3 are
>>>>>> matched (which was the initial intention).
>>>>>>
>>>>>> This inconsistency stems from the fact that the nfacct match in the
>>>>>> kernel (xt_nfacct.c::nfacct_mt) always returns true, but also because of
>>>>>> how iptables evaluates matches: it does so from left to right.
>>>>>>
>>>>>> Since there isn't a callback in the xt_match struct which is called
>>>>>> after ALL matches have been satisfied (xt_match.match is called for each
>>>>>> registered match in that statement), this causes the nfacct counters to
>>>>>> be updated (or not) depending on the position of the nfacct match.
>>>>>>
>>>>>> What I have done locally is to add a separate callback (I called it
>>>>>> "matched") which is called for all matches after all such matches in a
>>>>>> particular statement have been satisfied, but that obviously will break
>>>>>> lots of code depending on the old xt_match struct if such an approach is
>>>>>> adopted. My question is: is there a more elegant solution to do this?
>>>>> In my opinion this is not an inconsistency at all, but the intended
>>>>> behaviour. So I don't see any reason to add such a hack to override it.
>>>> I meant inconsistent in terms of the end result, which in the example
>>>> above is packet/bytes counting.
>>>>
>>>> That result is different depending on the order of the conditions (i.e.
>>>> matches) attached to the iptables rule. With the 'old' accounting we
>>>> didn't have that. In other words, with the old accounting we had:
>>>>
>>>> if (match1 && match2 && matchN) {
>>>>     do_packet_and_bytes_counting();
>>>> }
>>>>
>>>> No matter how we arrange the order of match1, match2 and matchN, the end
>>>> result is (or should be) the same. With the nfacct match that isn't the
>>>> case, but that isn't nfacct match's fault, but I guess it is because of
>>>> the way iptables is examining the matches.
>>> Yes, exactly. And actually it supports rules like this:
>>>
>>> iptables -A INPUT -m <match0> -m nfacct --nfacct acct0 \
>>>          -m <match1> -m nfacct --nfacct acct1 \
>>>          ...
>> Hm, never thought of that, but I guess one learns something new every
>> day. Thanks Jozsef!

Just as a side note (which wasn't obvious to me at first): even though
acct0 gets updated when match0 returns true, acct1 only gets updated
when both match0 AND match1 return true...

>>> Also, this is a new accounting method, which is just not the same as the
>>> old one.
>> Yes, I know, I wasn't disputing that - it is just that I am used to the
>> 'old' accounting and when you've been using it for years it is not so
>> easy to 'detach' yourself from that.
>>
>>>> We would have had the consistency (in other words, getting a consistent
>>>> result regardless of the order of the various conditions/matches) if
>>>> nfacct was a target, not a match, but I know that would be difficult (I
>>>> already examined that possibility) since the x_tables target does not
>>>> provide a 'destroy' method, so there isn't a way to track the 'refcnt'
>>>> in the nfacct kernel struct, so inventing this method is just as
>>>> ugly as the hack I did with the nfacct match above, so I thought to ask
>>>> and see whether there is a better solution.
>>> Targets do have a destroy method.
>> Haha, you are far too quick for me!
>>
>> I just found that out - I don't know how I did not see it when I first
>> looked at it. I guess if I 'convert' nfacct to a target I could get that
>> 'consistency', but I appreciate the new example you gave above, which I
>> have to admit is very useful indeed (one can hit two or more birds with
>> one stone so to speak).
>
> nfacct can't be converted to a target, because it'd result in backward
> incompatibility - it already exists as a match.

Sorry Jozsef, I meant for nfacct to be added as a target (in addition to
nfacct as a match).

> The module could be
> extended to play the role of target as well, but it seems to be
> unnecessary: there's no need to have a target in a rule,

I agree. nfacct (as a match) has the full functionality of nfacct (as a
target), though one needs to get used to the 'new' matching and be aware
of it. Maybe a note in the man pages to that effect would do.

> so in userspace
> "-j NFACCT" could simply be replaced by "-m nfacct".

I just did a quick hack and implemented nfacct as a target - just out of
curiosity, if nothing else. It works well and I could do something like:

iptables -I INPUT 1 -m nfacct --nfacct-name test -m conntrack --ctstate NEW -j NFACCT --nfacct-name test2

In the above statement the nfacct match on 'test' gets updated
regardless of the state of the connection, while the nfacct target only
gets executed (for 'test2') when ctstate is NEW (this statement even
works with '-j NFACCT --nfacct-name test').

This is all academic though - I agree that the existing nfacct match
covers all the functionality of the nfacct target, even if one needs to
be aware of how this all works...

MZ

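The ordering behaviour discussed in this thread can be modelled in a few lines of C. This is only an illustrative sketch, not the kernel code: `struct acct`, `struct pkt`, `struct match`, `eval_match()` and `eval_rule()` are hypothetical names, and the two toy conditions stand in for arbitrary matches. It assumes what the thread states: matches are evaluated left to right, evaluation stops at the first match that fails, and the nfacct match updates its counters and always returns true.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an nfacct object: just the two counters. */
struct acct {
    uint64_t pkts;
    uint64_t bytes;
};

/* Hypothetical packet: only the fields the toy matches look at. */
struct pkt {
    uint32_t len;
    bool is_new;   /* stands in for "-m conntrack --ctstate NEW" */
    bool is_tcp;   /* stands in for any other match */
};

enum match_kind { MATCH_NEW, MATCH_TCP, MATCH_NFACCT };

struct match {
    enum match_kind kind;
    struct acct *acct;   /* used only by MATCH_NFACCT */
};

/* Evaluate one match; the nfacct-like match counts and always succeeds. */
static bool eval_match(const struct match *m, const struct pkt *p)
{
    switch (m->kind) {
    case MATCH_NEW:
        return p->is_new;
    case MATCH_TCP:
        return p->is_tcp;
    case MATCH_NFACCT:
        m->acct->pkts++;
        m->acct->bytes += p->len;
        return true;
    }
    return false;
}

/* Left-to-right evaluation that stops at the first failing match. */
bool eval_rule(const struct match *matches, size_t n, const struct pkt *p)
{
    for (size_t i = 0; i < n; i++)
        if (!eval_match(&matches[i], p))
            return false;
    return true;
}
```

With a rule laid out as { nfacct acct0, NEW, nfacct acct1 }, a packet that is not NEW increments acct0 only, because evaluation never reaches the second nfacct entry. A NEW packet increments both. Moving an nfacct entry to the end of the array gives the "count only when everything matched" behaviour of the old accounting.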
--- a/net/netfilter/nfnetlink_acct.c
+++ b/net/netfilter/nfnetlink_acct.c
@@ -32,6 +32,7 @@ struct nf_acct {
 	atomic64_t pkts;
 	atomic64_t bytes;
+	atomic_t fmt;
 	struct list_head head;
 	atomic_t refcnt;
 	char name[NFACCT_NAME_MAX];
@@ -63,9 +64,14 @@
 	if (matching) {
 		if (nlh->nlmsg_flags & NLM_F_REPLACE) {
-			/* reset counters if you request a replacement. */
+			/* reset counters if you request a replacement... */
 			atomic64_set(&matching->pkts, 0);
 			atomic64_set(&matching->bytes, 0);
+			/* ... and change the format */
+			if (tb[NFACCT_FMT]) {
+				atomic_set(&matching->fmt,
+					be32_to_cpu(nla_get_be32(tb[NFACCT_FMT])));
+			}
 			return 0;
 		}
 		return -EBUSY;
@@ -85,6 +91,10 @@
 		atomic64_set(&nfacct->pkts,
 			be64_to_cpu(nla_get_be64(tb[NFACCT_PKTS])));
 	}
+	if (tb[NFACCT_FMT]) {
+		atomic_set(&nfacct->fmt,
+			be32_to_cpu(nla_get_be32(tb[NFACCT_FMT])));
+	}
 	atomic_set(&nfacct->refcnt, 1);
 	list_add_tail_rcu(&nfacct->head, &nfnl_acct_list);
 	return 0;
@@ -121,6 +131,7 @@
 	}
 	if (nla_put_be64(skb, NFACCT_PKTS, cpu_to_be64(pkts)) ||
 	    nla_put_be64(skb, NFACCT_BYTES, cpu_to_be64(bytes)) ||
+	    nla_put_be32(skb, NFACCT_FMT, htonl(atomic_read(&acct->fmt))) ||
 	    nla_put_be32(skb, NFACCT_USE, htonl(atomic_read(&acct->refcnt))))
 		goto nla_put_failure;
@@ -265,6 +276,7 @@
 	[NFACCT_NAME] = { .type = NLA_NUL_STRING, .len = NFACCT_NAME_MAX-1 },
 	[NFACCT_BYTES] = { .type = NLA_U64 },
 	[NFACCT_PKTS] = { .type = NLA_U64 },
+	[NFACCT_FMT] = { .type = NLA_U32 },
 };
 static const struct nfnl_callback nfnl_acct_cb[NFNL_MSG_ACCT_MAX] = {
--- a/include/uapi/linux/netfilter/nfnetlink_acct.h
+++ b/include/uapi/linux/netfilter/nfnetlink_acct.h
@@ -18,6 +18,7 @@
 	NFACCT_NAME,
 	NFACCT_PKTS,
 	NFACCT_BYTES,
+	NFACCT_FMT,
 	NFACCT_USE,
 	__NFACCT_MAX
 };
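The patch carries the format value over netlink as a 32-bit big-endian attribute: it is converted with htonl() when dumped and with be32_to_cpu() when received. A minimal userspace sketch of that round trip follows. The helper names `fmt_encode()` and `fmt_decode()` are hypothetical, and the meaning of the format value itself is not defined by the patch, so it is treated here as an opaque number.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a host-order value as 4 network-order bytes. */
void fmt_encode(uint32_t fmt, unsigned char out[4])
{
    uint32_t wire = htonl(fmt);      /* host order -> big-endian */
    memcpy(out, &wire, sizeof(wire));
}

/* Hypothetical helper: read 4 network-order bytes back into host order. */
uint32_t fmt_decode(const unsigned char in[4])
{
    uint32_t wire;
    memcpy(&wire, in, sizeof(wire));
    return ntohl(wire);              /* big-endian -> host order */
}
```

Whatever the host byte order, the most significant byte ends up first in the buffer, so a value written by one side decodes to the same number on the other.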