Message ID | 4F611835.4080904@gmx.de |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
Helge Deller <deller@gmx.de> writes: > When writing to the ip_local_reserved_ports proc file it will currently clear > all previously reserved ports and update the current list with the one given > in the input. > > This behaviour makes it's usage quite hard, for example: > a) The generic proc filesystem limitation of only handle up to PAGE_SIZE-1 > characters at maximum may not be sufficient to provide all your wished-to- > be-reserved ports at once. Fundamentally this need to be fixed first or else you will not be able to display the bitmap through sysctl. > b) There is no easy way to disable specific given ports, you always need to > give the full port list at once. This makes shell scripting hard, since > you need to parse everything yourself. > c) There is no easy way to just add specific ports or port ranges. Again, > this would be useful for shell scripts. Arguably b and c call for user space tools for better tools for dealing with text ranges not a magic parser in /proc. We already have tools like seq, dshbak, and pdsh. What is the difficulty of writing a little shell utility instead of modifying /proc? > The following patch solves this problem by simply extending the parser > in proc_do_large_bitmap() to accept the keywords "add" and "release" in front > of given ports or port ranges and to either add or drop the given ports > from the already existing list. > > Here is an example: > $ echo "1000-2000,5000" > /proc/sys/net/ipv4/ip_local_reserved_ports > $ cat /proc/sys/net/ipv4/ip_local_reserved_ports > 1000-2000,5000 (works as before, current port list is replaced by new one) > > $ echo "add 3000-4000" > /proc/sys/net/ipv4/ip_local_reserved_ports > $ cat /proc/sys/net/ipv4/ip_local_reserved_ports > 1000-2000,3000-4000,5000 (new ports added) > > $ echo "release 1500-3500" > /proc/sys/net/ipv4/ip_local_reserved_ports > $ cat /proc/sys/net/ipv4/ip_local_reserved_ports > 1000-1499,3501-4000,5000 (given ports were dropped from the list) > > My main motivation for this patch is because of a huge commercial application > which by default may use lots of ports. The full port list which I would have > needed to echo to the /proc/sys/net/ipv4/ip_local_reserved_ports file was > around 30K, and in this case all ports were already combined to regions where > possible. With this patch it's now easy to split up the port ranges into > single pieces and to implement everything in simple bootup shell scripts. > Furthermore adding new or removing unneeded ports dynamically at runtime is > now easily possible. You are breaking concept that the bitmap is a single value in /proc/sys which I don't like at all. And ultimately this is a lot of code to avoid fixing a 4K limit of sysctl reads and writes. If I understand this correctly after this patch is applied you can not read the result you write in with your commercial application. That just seems wrong. Can you please attack the fundamental issue first? Eric > Signed-off-by: Helge Deller <deller@gmx.de> > CC: Octavian Purdila <octavian.purdila@intel.com> > CC: netdev@vger.kernel.org > CC: David Miller <davem@davemloft.net> > CC: Cong Wang <amwang@redhat.com> > CC: "Eric W. Biederman" <ebiederm@xmission.com> > > Documentation/networking/ip-sysctl.txt | 10 +++++- > kernel/sysctl.c | 51 ++++++++++++++++++++++++++++----- > 2 files changed, 53 insertions(+), 8 deletions(-) > > > diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt > index ad3e80e..fc52546 100644 > --- a/Documentation/networking/ip-sysctl.txt > +++ b/Documentation/networking/ip-sysctl.txt > @@ -624,7 +624,15 @@ ip_local_reserved_ports - list of comma separated ranges > list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and > 10). Writing to the file will clear all previously reserved > ports and update the current list with the one given in the > - input. > + input unless one of the keywords "add" or "release" is used > + in front of the ports in which case the given ports are added > + to or released from the currently existing port list. > + Example: > + $ echo "1000-2000" > /proc/sys/net/ipv4/ip_local_reserved_ports > + $ echo "add 3000-4000,5000" > /proc/sys/net/ipv4/ip_local_reserved_ports > + $ echo "release 1500-3500" > /proc/sys/net/ipv4/ip_local_reserved_ports > + $ cat /proc/sys/net/ipv4/ip_local_reserved_ports > + 1000-1499,3501-4000,5000 > > Note that ip_local_port_range and ip_local_reserved_ports > settings are independent and both are considered by the kernel > diff --git a/kernel/sysctl.c b/kernel/sysctl.c > index f487f25..f9b1930 100644 > --- a/kernel/sysctl.c > +++ b/kernel/sysctl.c > @@ -2805,6 +2805,8 @@ static int proc_do_cad_pid(struct ctl_table *table, int write, > * We use a range comma separated format (e.g. 1,3-4,10-10) so that > * large bitmaps may be represented in a compact manner. Writing into > * the file will clear the bitmap then update it with the given input. > + * If "add" or "release" is written in front of numbers or number ranges, > + * the given bits will be added to or released from the existing bitmap. > * > * Returns 0 on success. > */ > @@ -2813,11 +2815,13 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, > { > int err = 0; > bool first = 1; > + bool add_or_release = 0, xrelease = 0; > size_t left = *lenp; > unsigned long bitmap_len = table->maxlen; > unsigned long *bitmap = (unsigned long *) table->data; > - unsigned long *tmp_bitmap = NULL; > - char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c; > + unsigned long *tmp_bitmap = NULL, *release_bitmap = NULL; > + char tr_a[] = { '-', ',', ' ', '\n' }, > + tr_b[] = { ',', ' ', '\n', 0 }, c; > > if (!bitmap_len || !left || (*ppos && !write)) { > *lenp = 0; > @@ -2841,8 +2845,9 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, > } > kbuf[left] = 0; > > - tmp_bitmap = kzalloc(BITS_TO_LONGS(bitmap_len) * sizeof(unsigned long), > - GFP_KERNEL); > + tmp_bitmap = kzalloc(2 * BITS_TO_LONGS(bitmap_len) * > + sizeof(unsigned long), GFP_KERNEL); > + release_bitmap = &tmp_bitmap[BITS_TO_LONGS(bitmap_len)]; > if (!tmp_bitmap) { > free_page(page); > return -ENOMEM; > @@ -2852,6 +2857,29 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, > unsigned long val_a, val_b; > bool neg; > > + left -= proc_skip_spaces(&kbuf); > + if (!left) > + continue; > + > + if (first || add_or_release) { > + if (!strnicmp(kbuf, "add ", 4)) { > + xrelease = 0; > + add_or_release = 1; > + kbuf += 4; > + left -= 4; > + } else > + if (!strnicmp(kbuf, "release ", 8)) { > + xrelease = 1; > + add_or_release = 1; > + kbuf += 8; > + left -= 8; > + } > + > + left -= proc_skip_spaces(&kbuf); > + if (!left) > + continue; > + } > + > err = proc_get_long(&kbuf, &left, &val_a, &neg, tr_a, > sizeof(tr_a), &c); > if (err) > @@ -2885,12 +2913,20 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, > } > > while (val_a <= val_b) > - set_bit(val_a++, tmp_bitmap); > + if (xrelease) > + set_bit(val_a++, release_bitmap); > + else > + set_bit(val_a++, tmp_bitmap); > > first = 0; > proc_skip_char(&kbuf, &left, '\n'); > } > free_page(page); > + > + /* Do not allow adding and releasing same bits in one step. */ > + if (!err && add_or_release && > + bitmap_intersects(tmp_bitmap, release_bitmap, bitmap_len)) > + err = -EINVAL; > } else { > unsigned long bit_a, bit_b = 0; > > @@ -2926,9 +2962,10 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, > > if (!err) { > if (write) { > - if (*ppos) > + if (*ppos || add_or_release) { > bitmap_or(bitmap, bitmap, tmp_bitmap, bitmap_len); > - else > + bitmap_andnot(bitmap, bitmap, release_bitmap, bitmap_len); > + } else > memcpy(bitmap, tmp_bitmap, > BITS_TO_LONGS(bitmap_len) * sizeof(unsigned long)); > } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/14/2012 11:34 PM, Eric W. Biederman wrote: > Helge Deller<deller@gmx.de> writes: > >> When writing to the ip_local_reserved_ports proc file it will currently clear >> all previously reserved ports and update the current list with the one given >> in the input. >> >> This behaviour makes it's usage quite hard, for example: >> a) The generic proc filesystem limitation of only handle up to PAGE_SIZE-1 >> characters at maximum may not be sufficient to provide all your wished-to- >> be-reserved ports at once. > > Fundamentally this need to be fixed first or else you will not be able > to display the bitmap through sysctl. Not necessarily. Reading more than 4K from /proc/sys/net/ipv4/ip_local_reserved_ports is possible today already (everything depends on how much chars you read-at-once). Just try using "dd bs=1M". Writing is the major limiting factor. >> b) There is no easy way to disable specific given ports, you always need to >> give the full port list at once. This makes shell scripting hard, since >> you need to parse everything yourself. > >> c) There is no easy way to just add specific ports or port ranges. Again, >> this would be useful for shell scripts. > > Arguably b and c call for user space tools for better tools for dealing > with text ranges not a magic parser in /proc. We already have tools > like seq, dshbak, and pdsh. What is the difficulty of writing a little > shell utility instead of modifying /proc? I'm not modifying /proc. I'm just extending it to provide an alternative method to modify values for the large bitmap case only. The basic existing behaviour to just read/write values is still functional. If for example the current value of ip_local_reserved_ports is: 300,400-500,600.... then I like the idea of just echo "release 420-480" > ... much more than cat ip_local_reserved_ports | do_complicated_shell_string_scripting_to_remove_420-480_without_getting_above_4K > ip_local_reserved_ports BTW, in this case the range 400-500 gets split up and the output should become 300,400-419,481-500,600.... >> The following patch solves this problem by simply extending the parser >> in proc_do_large_bitmap() to accept the keywords "add" and "release" in front >> of given ports or port ranges and to either add or drop the given ports >> from the already existing list. >> >> Here is an example: >> $ echo "1000-2000,5000"> /proc/sys/net/ipv4/ip_local_reserved_ports >> $ cat /proc/sys/net/ipv4/ip_local_reserved_ports >> 1000-2000,5000 (works as before, current port list is replaced by new one) >> >> $ echo "add 3000-4000"> /proc/sys/net/ipv4/ip_local_reserved_ports >> $ cat /proc/sys/net/ipv4/ip_local_reserved_ports >> 1000-2000,3000-4000,5000 (new ports added) >> >> $ echo "release 1500-3500"> /proc/sys/net/ipv4/ip_local_reserved_ports >> $ cat /proc/sys/net/ipv4/ip_local_reserved_ports >> 1000-1499,3501-4000,5000 (given ports were dropped from the list) >> >> My main motivation for this patch is because of a huge commercial application >> which by default may use lots of ports. The full port list which I would have >> needed to echo to the /proc/sys/net/ipv4/ip_local_reserved_ports file was >> around 30K, and in this case all ports were already combined to regions where >> possible. With this patch it's now easy to split up the port ranges into >> single pieces and to implement everything in simple bootup shell scripts. >> Furthermore adding new or removing unneeded ports dynamically at runtime is >> now easily possible. > > You are breaking concept that the bitmap is a single value in /proc/sys > which I don't like at all. Please keep in mind that this is a 65535 bit (port) bitmap, not just a "simple" single value. And the basic concept is still there. > And ultimately this is a lot of code to avoid fixing a 4K limit of > sysctl reads and writes. I would love if the 4K problem gets fixed. But it seems that this will lead to a major redesign and rewrite of the whole /proc interface which I won't be able to drive. I'm not even sure if this could be done in short time without breaking existing compatibility. Other people here on the list can surely comment on that better than me. So, my patch is at least a "balanced" solution which can make the current ip_local_reserved_ports interface useable to existing applications. In it's current form, the usage of ip_local_reserved_ports isfull of limitations. >If I understand this correctly after this > patch is applied you can not read the result you write in with your > commercial application. That just seems wrong. As mentioned above: reading is possible. And with my patch applied you usually probably won't need to read the values since you don't care which other ports have been configured and just add/release the ones you are interested in. Regarding the commercial application: The idea is to provide an additional shell-based system initscript which fills the ip_local_reserved_ports list with initial values based on the product which the customer installed. After the values have been set during bootup, further (dynamic) changes of this port list won't happen unless the administrator manually wants to add/release other ports. So, no magic in here... > Can you please attack the fundamental issue first? You mean the 4K/PAGE_SIZE limitation? As said above, that's probably above my possibilities / time capatibilities. Sorry. But of course I'm open for all kind of other ideas how and API for the large bitmap case could be done better. Sadly the seq_file interface doesn't seem to support reading yet...? Helge -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/networking/ip-sysctl.txt b/Documentation/networking/ip-sysctl.txt index ad3e80e..fc52546 100644 --- a/Documentation/networking/ip-sysctl.txt +++ b/Documentation/networking/ip-sysctl.txt @@ -624,7 +624,15 @@ ip_local_reserved_ports - list of comma separated ranges list of ranges (e.g. "1,2-4,10-10" for ports 1, 2, 3, 4 and 10). Writing to the file will clear all previously reserved ports and update the current list with the one given in the - input. + input unless one of the keywords "add" or "release" is used + in front of the ports in which case the given ports are added + to or released from the currently existing port list. + Example: + $ echo "1000-2000" > /proc/sys/net/ipv4/ip_local_reserved_ports + $ echo "add 3000-4000,5000" > /proc/sys/net/ipv4/ip_local_reserved_ports + $ echo "release 1500-3500" > /proc/sys/net/ipv4/ip_local_reserved_ports + $ cat /proc/sys/net/ipv4/ip_local_reserved_ports + 1000-1499,3501-4000,5000 Note that ip_local_port_range and ip_local_reserved_ports settings are independent and both are considered by the kernel diff --git a/kernel/sysctl.c b/kernel/sysctl.c index f487f25..f9b1930 100644 --- a/kernel/sysctl.c +++ b/kernel/sysctl.c @@ -2805,6 +2805,8 @@ static int proc_do_cad_pid(struct ctl_table *table, int write, * We use a range comma separated format (e.g. 1,3-4,10-10) so that * large bitmaps may be represented in a compact manner. Writing into * the file will clear the bitmap then update it with the given input. + * If "add" or "release" is written in front of numbers or number ranges, + * the given bits will be added to or released from the existing bitmap. * * Returns 0 on success. */ @@ -2813,11 +2815,13 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, { int err = 0; bool first = 1; + bool add_or_release = 0, xrelease = 0; size_t left = *lenp; unsigned long bitmap_len = table->maxlen; unsigned long *bitmap = (unsigned long *) table->data; - unsigned long *tmp_bitmap = NULL; - char tr_a[] = { '-', ',', '\n' }, tr_b[] = { ',', '\n', 0 }, c; + unsigned long *tmp_bitmap = NULL, *release_bitmap = NULL; + char tr_a[] = { '-', ',', ' ', '\n' }, + tr_b[] = { ',', ' ', '\n', 0 }, c; if (!bitmap_len || !left || (*ppos && !write)) { *lenp = 0; @@ -2841,8 +2845,9 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, } kbuf[left] = 0; - tmp_bitmap = kzalloc(BITS_TO_LONGS(bitmap_len) * sizeof(unsigned long), - GFP_KERNEL); + tmp_bitmap = kzalloc(2 * BITS_TO_LONGS(bitmap_len) * + sizeof(unsigned long), GFP_KERNEL); + release_bitmap = &tmp_bitmap[BITS_TO_LONGS(bitmap_len)]; if (!tmp_bitmap) { free_page(page); return -ENOMEM; @@ -2852,6 +2857,29 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, unsigned long val_a, val_b; bool neg; + left -= proc_skip_spaces(&kbuf); + if (!left) + continue; + + if (first || add_or_release) { + if (!strnicmp(kbuf, "add ", 4)) { + xrelease = 0; + add_or_release = 1; + kbuf += 4; + left -= 4; + } else + if (!strnicmp(kbuf, "release ", 8)) { + xrelease = 1; + add_or_release = 1; + kbuf += 8; + left -= 8; + } + + left -= proc_skip_spaces(&kbuf); + if (!left) + continue; + } + err = proc_get_long(&kbuf, &left, &val_a, &neg, tr_a, sizeof(tr_a), &c); if (err) @@ -2885,12 +2913,20 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, } while (val_a <= val_b) - set_bit(val_a++, tmp_bitmap); + if (xrelease) + set_bit(val_a++, release_bitmap); + else + set_bit(val_a++, tmp_bitmap); first = 0; proc_skip_char(&kbuf, &left, '\n'); } free_page(page); + + /* Do not allow adding and releasing same bits in one step. */ + if (!err && add_or_release && + bitmap_intersects(tmp_bitmap, release_bitmap, bitmap_len)) + err = -EINVAL; } else { unsigned long bit_a, bit_b = 0; @@ -2926,9 +2962,10 @@ int proc_do_large_bitmap(struct ctl_table *table, int write, if (!err) { if (write) { - if (*ppos) + if (*ppos || add_or_release) { bitmap_or(bitmap, bitmap, tmp_bitmap, bitmap_len); - else + bitmap_andnot(bitmap, bitmap, release_bitmap, bitmap_len); + } else memcpy(bitmap, tmp_bitmap, BITS_TO_LONGS(bitmap_len) * sizeof(unsigned long)); }
When writing to the ip_local_reserved_ports proc file it will currently clear all previously reserved ports and update the current list with the one given in the input. This behaviour makes it's usage quite hard, for example: a) The generic proc filesystem limitation of only handle up to PAGE_SIZE-1 characters at maximum may not be sufficient to provide all your wished-to- be-reserved ports at once. b) There is no easy way to disable specific given ports, you always need to give the full port list at once. This makes shell scripting hard, since you need to parse everything yourself. c) There is no easy way to just add specific ports or port ranges. Again, this would be useful for shell scripts. The following patch solves this problem by simply extending the parser in proc_do_large_bitmap() to accept the keywords "add" and "release" in front of given ports or port ranges and to either add or drop the given ports from the already existing list. Here is an example: $ echo "1000-2000,5000" > /proc/sys/net/ipv4/ip_local_reserved_ports $ cat /proc/sys/net/ipv4/ip_local_reserved_ports 1000-2000,5000 (works as before, current port list is replaced by new one) $ echo "add 3000-4000" > /proc/sys/net/ipv4/ip_local_reserved_ports $ cat /proc/sys/net/ipv4/ip_local_reserved_ports 1000-2000,3000-4000,5000 (new ports added) $ echo "release 1500-3500" > /proc/sys/net/ipv4/ip_local_reserved_ports $ cat /proc/sys/net/ipv4/ip_local_reserved_ports 1000-1499,3501-4000,5000 (given ports were dropped from the list) My main motivation for this patch is because of a huge commercial application which by default may use lots of ports. The full port list which I would have needed to echo to the /proc/sys/net/ipv4/ip_local_reserved_ports file was around 30K, and in this case all ports were already combined to regions where possible. With this patch it's now easy to split up the port ranges into single pieces and to implement everything in simple bootup shell scripts. Furthermore adding new or removing unneeded ports dynamically at runtime is now easily possible. Signed-off-by: Helge Deller <deller@gmx.de> CC: Octavian Purdila <octavian.purdila@intel.com> CC: netdev@vger.kernel.org CC: David Miller <davem@davemloft.net> CC: Cong Wang <amwang@redhat.com> CC: "Eric W. Biederman" <ebiederm@xmission.com> Documentation/networking/ip-sysctl.txt | 10 +++++- kernel/sysctl.c | 51 ++++++++++++++++++++++++++++----- 2 files changed, 53 insertions(+), 8 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html