Message ID | 20200713134930.7807-1-mkroken@gmail.com |
---|---|
State | Rejected |
Delegated to: | Petr Štetiar |
Headers | show |
Series | busybox: tr: enable options required by POSIX | expand |
Magnus Kroken <mkroken@gmail.com> [2020-07-13 15:49:30]: Hi, > Support for character classes (e.g. [:upper:] and [:lower:]) and > equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. > This change increases package size by approx. 500 bytes. where does OpenWrt claims, that it's fully POSIX compliant? Some deviations are expected from the standards in exchange for lower flash usage. Maybe it could be considered as `default y if !SMALL_FLASH` for devices with more flash space, but then we would probably get inconsistent behaviour across various targets and scripts wouldn't use this classes anyway. So I don't see anything in favor for this patch inclusion. -- ynezz
On 2020-07-13 08:36, Petr Štetiar wrote: > Magnus Kroken <mkroken@gmail.com> [2020-07-13 15:49:30]: > > Hi, > >> Support for character classes (e.g. [:upper:] and [:lower:]) and >> equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. >> This change increases package size by approx. 500 bytes. > where does OpenWrt claims, that it's fully POSIX compliant? Some deviations > are expected from the standards in exchange for lower flash usage. Maybe it > could be considered as `default y if !SMALL_FLASH` for devices with more flash > space, but then we would probably get inconsistent behaviour across various > targets and scripts wouldn't use this classes anyway. > > So I don't see anything in favor for this patch inclusion. > > -- ynezz Hi Petr, Not sure if you've had a chance to read through the earlier discussion about this, so I will reiterate my point a bit below On OpenWRT 'tr' is configured to silently ignore character classes and treat all characters literally, which is the most dangerous kind of deviation from norm, as it is does something non-standard without informing the user. That alone seems to strongly put this in favour of being included. Even if it is decided to deviate from the standard and ignore character classes, there should at the very least be an error/warning printed. The question being asked is, is saving 500 bytes worth a tremendous deviation from the norm, and rendering a standard tool essentially useless (with a built-in foot gun to boot!) Regards, Jordan
On Mon, Jul 13, 2020 at 12:14 PM Jordan Geoghegan <jordan@geoghegan.ca> wrote: > > > > On 2020-07-13 08:36, Petr Štetiar wrote: > > Magnus Kroken <mkroken@gmail.com> [2020-07-13 15:49:30]: > > > > Hi, > > > >> Support for character classes (e.g. [:upper:] and [:lower:]) and > >> equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. > >> This change increases package size by approx. 500 bytes. > > where does OpenWrt claims, that it's fully POSIX compliant? Some deviations > > are expected from the standards in exchange for lower flash usage. Maybe it > > could be considered as `default y if !SMALL_FLASH` for devices with more flash > > space, but then we would probably get inconsistent behaviour across various > > targets and scripts wouldn't use this classes anyway. > > > > So I don't see anything in favor for this patch inclusion. > > > > -- ynezz > > Hi Petr, > > Not sure if you've had a chance to read through the earlier discussion > about this, so I will reiterate my point a bit below > > On OpenWRT 'tr' is configured to silently ignore character classes and > treat all characters literally, which is the most dangerous kind of > deviation from norm, as it is does something non-standard without > informing the user. That alone seems to strongly put this in favour of > being included. Even if it is decided to deviate from the standard and > ignore character classes, there should at the very least be an > error/warning printed. Got any example of this being problematic currently? > > The question being asked is, is saving 500 bytes worth a tremendous > deviation from the norm, and rendering a standard tool essentially > useless (with a built-in foot gun to boot!) tr is used in the tree for more than this. > > Regards, > > Jordan > > > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel
On 2020-07-13 22:17, Rosen Penev wrote: > On Mon, Jul 13, 2020 at 12:14 PM Jordan Geoghegan <jordan@geoghegan.ca> wrote: >> >> >> On 2020-07-13 08:36, Petr Štetiar wrote: >>> Magnus Kroken <mkroken@gmail.com> [2020-07-13 15:49:30]: >>> >>> Hi, >>> >>>> Support for character classes (e.g. [:upper:] and [:lower:]) and >>>> equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. >>>> This change increases package size by approx. 500 bytes. >>> where does OpenWrt claims, that it's fully POSIX compliant? Some deviations >>> are expected from the standards in exchange for lower flash usage. Maybe it >>> could be considered as `default y if !SMALL_FLASH` for devices with more flash >>> space, but then we would probably get inconsistent behaviour across various >>> targets and scripts wouldn't use this classes anyway. >>> >>> So I don't see anything in favor for this patch inclusion. >>> >>> -- ynezz >> Hi Petr, >> >> Not sure if you've had a chance to read through the earlier discussion >> about this, so I will reiterate my point a bit below >> >> On OpenWRT 'tr' is configured to silently ignore character classes and >> treat all characters literally, which is the most dangerous kind of >> deviation from norm, as it is does something non-standard without >> informing the user. That alone seems to strongly put this in favour of >> being included. Even if it is decided to deviate from the standard and >> ignore character classes, there should at the very least be an >> error/warning printed. > Got any example of this being problematic currently? A quick grep of the source tree shows there's already things relying on classes that aren't actually working correctly: ryzen$ rg "tr '\[:" scripts/mkits.sh 59:ARCH_UPPER=$(echo "$ARCH" | tr '[:lower:]' '[:upper:]') I also grepped the package/ports tree and found a number of issues, namely, using double brackets "[[" is a no-no with tr, as the extra brackets are treated literally, as well as '[A-Z]' is also a bug, as the brackets are unnecessary and are treated literally. ryzen$ rg "tr '\[" utils/lxc/patches/010-Remove-distro-check.patch 43:-with_distro=`echo ${with_distro} | tr '[[:upper:]]' '[[:lower:]]'` sound/shairport-sync/patches/010-no-cxx.patch 28:@@ -19,7 +19,6 @@ with_os=`echo ${with_os} | tr '[[:upper:]]' '[[:lower:]]' ` utils/pciutils/patches/105-fix-host.patch 7:-host=`echo $HOST | sed -e 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\2/' -e 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` 8:+host=`echo $HOST | sed -e 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` net/ser2net/files/ser2net.init 28: [ "$uc" -eq 1 ] && key=`echo "$key" | tr '[a-z]' '[A-Z]'` 120: parity=`echo "$parity" | tr '[a-z]' '[A-Z]'` >> The question being asked is, is saving 500 bytes worth a tremendous >> deviation from the norm, and rendering a standard tool essentially >> useless (with a built-in foot gun to boot!) > tr is used in the tree for more than this. My point still stands. >> Regards, >> >> Jordan >> >> >> >> _______________________________________________ >> openwrt-devel mailing list >> openwrt-devel@lists.openwrt.org >> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
On Mon, Jul 13, 2020 at 10:44 PM Jordan Geoghegan <jordan@geoghegan.ca> wrote: > > > > On 2020-07-13 22:17, Rosen Penev wrote: > > On Mon, Jul 13, 2020 at 12:14 PM Jordan Geoghegan <jordan@geoghegan.ca> wrote: > >> > >> > >> On 2020-07-13 08:36, Petr Štetiar wrote: > >>> Magnus Kroken <mkroken@gmail.com> [2020-07-13 15:49:30]: > >>> > >>> Hi, > >>> > >>>> Support for character classes (e.g. [:upper:] and [:lower:]) and > >>>> equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. > >>>> This change increases package size by approx. 500 bytes. > >>> where does OpenWrt claims, that it's fully POSIX compliant? Some deviations > >>> are expected from the standards in exchange for lower flash usage. Maybe it > >>> could be considered as `default y if !SMALL_FLASH` for devices with more flash > >>> space, but then we would probably get inconsistent behaviour across various > >>> targets and scripts wouldn't use this classes anyway. > >>> > >>> So I don't see anything in favor for this patch inclusion. > >>> > >>> -- ynezz > >> Hi Petr, > >> > >> Not sure if you've had a chance to read through the earlier discussion > >> about this, so I will reiterate my point a bit below > >> > >> On OpenWRT 'tr' is configured to silently ignore character classes and > >> treat all characters literally, which is the most dangerous kind of > >> deviation from norm, as it is does something non-standard without > >> informing the user. That alone seems to strongly put this in favour of > >> being included. Even if it is decided to deviate from the standard and > >> ignore character classes, there should at the very least be an > >> error/warning printed. > > Got any example of this being problematic currently? > A quick grep of the source tree shows there's already things relying on > classes that aren't actually working correctly: > > ryzen$ rg "tr '\[:" > scripts/mkits.sh > 59:ARCH_UPPER=$(echo "$ARCH" | tr '[:lower:]' '[:upper:]') > > I also grepped the package/ports tree and found a number of issues, > namely, using double brackets "[[" is a no-no with tr, as the extra > brackets are treated literally, as well as '[A-Z]' is also a bug, as the > brackets are unnecessary and are treated literally. > > ryzen$ rg "tr '\[" > utils/lxc/patches/010-Remove-distro-check.patch > 43:-with_distro=`echo ${with_distro} | tr '[[:upper:]]' '[[:lower:]]'` > > sound/shairport-sync/patches/010-no-cxx.patch > 28:@@ -19,7 +19,6 @@ with_os=`echo ${with_os} | tr '[[:upper:]]' > '[[:lower:]]' ` > > utils/pciutils/patches/105-fix-host.patch > 7:-host=`echo $HOST | sed -e > 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e > 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\2/' -e > 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` > 8:+host=`echo $HOST | sed -e > 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e > 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` > > net/ser2net/files/ser2net.init > 28: [ "$uc" -eq 1 ] && key=`echo "$key" | tr '[a-z]' '[A-Z]'` > 120: parity=`echo "$parity" | tr '[a-z]' '[A-Z]'` All of those examples except for the last one are for the host. > > >> The question being asked is, is saving 500 bytes worth a tremendous > >> deviation from the norm, and rendering a standard tool essentially > >> useless (with a built-in foot gun to boot!) > > tr is used in the tree for more than this. > My point still stands. The issue is that it does not solve an issue that is currently present. > >> Regards, > >> > >> Jordan > >> > >> > >> > >> _______________________________________________ > >> openwrt-devel mailing list > >> openwrt-devel@lists.openwrt.org > >> https://lists.openwrt.org/mailman/listinfo/openwrt-devel >
On 2020-07-13 22:56, Rosen Penev wrote: > On Mon, Jul 13, 2020 at 10:44 PM Jordan Geoghegan <jordan@geoghegan.ca> wrote: >> >> >> On 2020-07-13 22:17, Rosen Penev wrote: >>> On Mon, Jul 13, 2020 at 12:14 PM Jordan Geoghegan <jordan@geoghegan.ca> wrote: >>>> >>>> On 2020-07-13 08:36, Petr Štetiar wrote: >>>>> Magnus Kroken <mkroken@gmail.com> [2020-07-13 15:49:30]: >>>>> >>>>> Hi, >>>>> >>>>>> Support for character classes (e.g. [:upper:] and [:lower:]) and >>>>>> equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. >>>>>> This change increases package size by approx. 500 bytes. >>>>> where does OpenWrt claims, that it's fully POSIX compliant? Some deviations >>>>> are expected from the standards in exchange for lower flash usage. Maybe it >>>>> could be considered as `default y if !SMALL_FLASH` for devices with more flash >>>>> space, but then we would probably get inconsistent behaviour across various >>>>> targets and scripts wouldn't use this classes anyway. >>>>> >>>>> So I don't see anything in favor for this patch inclusion. >>>>> >>>>> -- ynezz >>>> Hi Petr, >>>> >>>> Not sure if you've had a chance to read through the earlier discussion >>>> about this, so I will reiterate my point a bit below >>>> >>>> On OpenWRT 'tr' is configured to silently ignore character classes and >>>> treat all characters literally, which is the most dangerous kind of >>>> deviation from norm, as it is does something non-standard without >>>> informing the user. That alone seems to strongly put this in favour of >>>> being included. Even if it is decided to deviate from the standard and >>>> ignore character classes, there should at the very least be an >>>> error/warning printed. >>> Got any example of this being problematic currently? >> A quick grep of the source tree shows there's already things relying on >> classes that aren't actually working correctly: >> >> ryzen$ rg "tr '\[:" >> scripts/mkits.sh >> 59:ARCH_UPPER=$(echo "$ARCH" | tr '[:lower:]' '[:upper:]') >> >> I also grepped the package/ports tree and found a number of issues, >> namely, using double brackets "[[" is a no-no with tr, as the extra >> brackets are treated literally, as well as '[A-Z]' is also a bug, as the >> brackets are unnecessary and are treated literally. >> >> ryzen$ rg "tr '\[" >> utils/lxc/patches/010-Remove-distro-check.patch >> 43:-with_distro=`echo ${with_distro} | tr '[[:upper:]]' '[[:lower:]]'` >> >> sound/shairport-sync/patches/010-no-cxx.patch >> 28:@@ -19,7 +19,6 @@ with_os=`echo ${with_os} | tr '[[:upper:]]' >> '[[:lower:]]' ` >> >> utils/pciutils/patches/105-fix-host.patch >> 7:-host=`echo $HOST | sed -e >> 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e >> 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\2/' -e >> 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` >> 8:+host=`echo $HOST | sed -e >> 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e >> 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` >> >> net/ser2net/files/ser2net.init >> 28: [ "$uc" -eq 1 ] && key=`echo "$key" | tr '[a-z]' '[A-Z]'` >> 120: parity=`echo "$parity" | tr '[a-z]' '[A-Z]'` > All of those examples except for the last one are for the host. Either way, those bugs I mentioned should be addressed, as they are indeed errors. Also, the use of "tr 'a-z'..." is unsafe for exotic locales as mentioned earlier in the thread. What makes all this so dangerous is that no error is printed, it silently and sneakily does the exact opposite of what you would expect. >>>> The question being asked is, is saving 500 bytes worth a tremendous >>>> deviation from the norm, and rendering a standard tool essentially >>>> useless (with a built-in foot gun to boot!) >>> tr is used in the tree for more than this. >> My point still stands. > The issue is that it does not solve an issue that is currently present. It does solve an issue that is currently present, what do you think brought me here? This all started because a couple of my scripts blew up (scripts that are highly portable and run on a multitude of systems, for example Linux, MacOS, OpenBSD, FreeBSD, NetBSD, DragonflyBSD etc. Everything worked on OpenWRT, except for 'tr' working unlike any other 'tr' I've encountered (other than ancient 20th century Unixen). >>>> Regards, >>>> >>>> Jordan >>>> >>>> >>>> >>>> _______________________________________________ >>>> openwrt-devel mailing list >>>> openwrt-devel@lists.openwrt.org >>>> https://lists.openwrt.org/mailman/listinfo/openwrt-devel
Jordan Geoghegan <jordan@geoghegan.ca> [2020-07-13 22:44:05]: > scripts/mkits.sh > 59:ARCH_UPPER=$(echo "$ARCH" | tr '[:lower:]' '[:upper:]') What's a problem here? It's running on host, so this classes are supported. > ryzen$ rg "tr '\[" > utils/lxc/patches/010-Remove-distro-check.patch > 43:-with_distro=`echo ${with_distro} | tr '[[:upper:]]' '[[:lower:]]'` Upstream issue, it's removed, so not actually used, but still host related. Should be fixed upstream anyway not OpenWrt related. Unlikely that this value would ever contain square brackets. > sound/shairport-sync/patches/010-no-cxx.patch > 28:@@ -19,7 +19,6 @@ with_os=`echo ${with_os} | tr '[[:upper:]]' > '[[:lower:]]' ` Upstream issue, this is context in the patch not added by OpenWrt, running on host. Unlikely that this value would ever contain square brackets. > utils/pciutils/patches/105-fix-host.patch > 7:-host=`echo $HOST | sed -e > 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e > 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\2/' -e > 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` > 8:+host=`echo $HOST | sed -e > 's/^\([^-]*\)-\([^-]*\)-\([^-]*\)-\([^-]*\)$/\1-\3/' -e > 's/^\([^-]*\)-\([^-]*\)$/\1--\2/' | tr '[A-Z]' '[a-z]'` Upstream even provided you the comment few lines above: # CAVEAT: tr on Solaris is a bit weird and the extra [] is otherwise harmless. > net/ser2net/files/ser2net.init > 28: [ "$uc" -eq 1 ] && key=`echo "$key" | tr '[a-z]' '[A-Z]'` > 120: parity=`echo "$parity" | tr '[a-z]' '[A-Z]'` If you look at the context, not an issue. Those values would never contain square brackets. Anyway feel free to submit PR with a square brackets removal. So actually no issues. -- ynezz
Jordan Geoghegan <jordan@geoghegan.ca> [2020-07-13 23:06:30]:
> Also, the use of "tr 'a-z'..." is unsafe for exotic locales
Buildsystem sets LC_ALL=C explicitly, so whats the issue here?
On 2020-07-14 02:08, Petr Štetiar wrote: > Jordan Geoghegan <jordan@geoghegan.ca> [2020-07-13 23:06:30]: > >> Also, the use of "tr 'a-z'..." is unsafe for exotic locales > Buildsystem sets LC_ALL=C explicitly, so whats the issue here? I've reported that 'tr' behaves abnormally, that's about all I can do. If you guys want to be edgelords and ship a broken 'tr' in the name of saving 500 bytes, then that's your prerogative. 'tr' is a standard system utility, I wasn't expecting such pushback against making it behave as every other modern implementation does. If there's somebody here on the mailing list with commit access that wants to do something about this, then I guess now is the time to have your say. Regards, Jordan Geoghegan
Jordan Geoghegan <jordan@geoghegan.ca> writes: > 'tr' is a standard system utility, I wasn't expecting such pushback > against making it behave as every other modern implementation does. OpenWrt exists because it is different from every other distribution. https://openwrt.org/about describes it as "all the features you need with none of the bloat". There is no promise of all the features you want, or all the featues of any other distro, or even all POSIX features. What you "need" is obviously something you can discuss, but I believe it is reasonable to limit it to the software included with the distro. OpenWrt specific porting might be required for any 3rd party scripts or applications, even those running unmodified on other distros. OpenWrt provides GNU coreutils as an optional alternative for the 'tr' application. This is full featured++, like GNU utils often are. Just my 2 peanuts. Bjørn
On 2020-07-14 02:54, Bjørn Mork wrote: > Jordan Geoghegan <jordan@geoghegan.ca> writes: > >> 'tr' is a standard system utility, I wasn't expecting such pushback >> against making it behave as every other modern implementation does. > OpenWrt exists because it is different from every other distribution. > > https://openwrt.org/about describes it as "all the features you need > with none of the bloat". There is no promise of all the features you > want, or all the featues of any other distro, or even all POSIX > features. What you "need" is obviously something you can discuss, but I > believe it is reasonable to limit it to the software included with the > distro. OpenWrt specific porting might be required for any 3rd party > scripts or applications, even those running unmodified on other distros. > > OpenWrt provides GNU coreutils as an optional alternative for the 'tr' > application. This is full featured++, like GNU utils often are. > > Just my 2 peanuts. > > > Bjørn > > _______________________________________________ > openwrt-devel mailing list > openwrt-devel@lists.openwrt.org > https://lists.openwrt.org/mailman/listinfo/openwrt-devel This isn't about 500 bytes of space anymore is it? I think at this point people just want to be "different" special snowflakes and not have a working tr even though nobody has presented any potential upside to leaving this unfixed. The folks that have responded to this bug report (not even sure any of them have commit access lol) seem more interested in bikeshedding than actually fixing problems. I'm now bored of debating this, so if you want to ship broken trash, I'll leave you alone. Regards, Jordan
Jordan Geoghegan <jordan@geoghegan.ca> writes:
> This isn't about 500 bytes of space anymore is it?
FWIW, I believe it is. It is about the sum of all the 500 bytes of
space contributed by each feature which doesn't qualify as strictly
necessary.
The upside is that OpenWrt runs on devices with almost no RAM and flash.
I thought that was pretty obvious. This is the result of a rather
tedious work over a long time, defining features as "bloat". It makes
OpenWrt unique.
There is obviously also a downside, as you have pointed out. All those
removed features are useful in some context. And it is definitely
frustrating every time you miss one of them.
But I don't think you help your cause by describing the current state as
"not have a working tr". It might not work as you expected. That's
understandable. But that is by design and not a bug, as you've been
explained.
Bjørn
diff --git a/package/utils/busybox/Config-defaults.in b/package/utils/busybox/Config-defaults.in index 29724041f4..76c51cf7e9 100644 --- a/package/utils/busybox/Config-defaults.in +++ b/package/utils/busybox/Config-defaults.in @@ -837,10 +837,10 @@ config BUSYBOX_DEFAULT_TR default y config BUSYBOX_DEFAULT_FEATURE_TR_CLASSES bool - default n + default y config BUSYBOX_DEFAULT_FEATURE_TR_EQUIV bool - default n + default y config BUSYBOX_DEFAULT_TRUE bool default y
Support for character classes (e.g. [:upper:] and [:lower:]) and equivalence classes (e.g. [=a=]) in the tr utility are required by POSIX. This change increases package size by approx. 500 bytes. Size before: 208372 busybox_1.31.1-1_mips_24kc.ipk Size after: 208895 busybox_1.31.1-1_mips_24kc.ipk Signed-off-by: Magnus Kroken <mkroken@gmail.com> Reported-by: Jordan Geoghegan <jordan@geoghegan.ca> --- This was discussed a few days ago [1], but the patch wasn't caught by Patchwork. Resending. Jordan: I was not able to apply your patch, but it was easy enough to fix. Please consider using git-send-email for future patches, as it ensures it is formatted so Patchwork [2] will keep track of it, and solves most formatting issues that may occur between author and committer. 1: https://lists.openwrt.org/pipermail/openwrt-devel/2020-July/030021.html 2: https://patchwork.ozlabs.org/project/openwrt/list/ package/utils/busybox/Config-defaults.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)