Patchwork [05/11] dialog: Patch incorrect use of toupper()

login
register
mail settings
Submitter Paul Cercueil
Date April 3, 2014, 9:01 p.m.
Message ID <1396558881-29631-5-git-send-email-paul@crapouillou.net>
Download mbox | patch
Permalink /patch/336778/
State Not Applicable
Headers show

Comments

Paul Cercueil - April 3, 2014, 9:01 p.m.
On some platforms (e.g. MIPS), the "char" type is signed by default.
The problem is that toupper() takes an int as argument: a signed
char then gets sign-extended to 32bit, which causes an assertion
failure as toupper() verifies that its argument fits in 8 bits.

Signed-Off-By: Paul Cercueil <paul@crapouillou.net>
Acked-By: Maarten ter Huurne <maarten@treewalker.org>
---
 package/dialog/dialog-toupper.patch | 13 +++++++++++++
 1 file changed, 13 insertions(+)
 create mode 100644 package/dialog/dialog-toupper.patch
Yann E. MORIN - April 3, 2014, 9:32 p.m.
Paul, All,

On 2014-04-03 23:01 +0200, Paul Cercueil spake thusly:
> On some platforms (e.g. MIPS), the "char" type is signed by default.
> The problem is that toupper() takes an int as argument: a signed
> char then gets sign-extended to 32bit, which causes an assertion
> failure as toupper() verifies that its argument fits in 8 bits.

I could not vefrify the assertion that MIPS has signed chars.

However, this change is still corect, since it makes it explicit that we
want an unsigned char.

> Signed-Off-By: Paul Cercueil <paul@crapouillou.net>
> Acked-By: Maarten ter Huurne <maarten@treewalker.org>

Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>

Regards,
Yann E. MORIN.

> ---
>  package/dialog/dialog-toupper.patch | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>  create mode 100644 package/dialog/dialog-toupper.patch
> 
> diff --git a/package/dialog/dialog-toupper.patch b/package/dialog/dialog-toupper.patch
> new file mode 100644
> index 0000000..3fe0e19
> --- /dev/null
> +++ b/package/dialog/dialog-toupper.patch
> @@ -0,0 +1,13 @@
> +diff --git a/dlg_keys.h b/dlg_keys.h
> +index 6a96c0f..b7b42d9 100644
> +--- a/dlg_keys.h
> ++++ b/dlg_keys.h
> +@@ -31,7 +31,7 @@
> + #define dlg_toupper(ch) towupper((wint_t)ch)
> + #define dlg_isupper(ch) iswupper((wint_t)ch)
> + #else
> +-#define dlg_toupper(ch) toupper(ch)
> ++#define dlg_toupper(ch) toupper((unsigned char)(ch))
> + #define dlg_isupper(ch) (isalpha(ch) && isupper(ch))
> + #endif
> + 
> -- 
> 1.9.0
> 
> _______________________________________________
> buildroot mailing list
> buildroot@busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot
Arnout Vandecappelle - April 4, 2014, 6:45 a.m.
On 03/04/14 23:32, Yann E. MORIN wrote:
> Paul, All,
> 
> On 2014-04-03 23:01 +0200, Paul Cercueil spake thusly:
>> On some platforms (e.g. MIPS), the "char" type is signed by default.
>> The problem is that toupper() takes an int as argument: a signed
>> char then gets sign-extended to 32bit, which causes an assertion
>> failure as toupper() verifies that its argument fits in 8 bits.
> 
> I could not vefrify the assertion that MIPS has signed chars.

 I think char is signed on all platforms. At least, it is on all
platforms I regularly use.

> 
> However, this change is still corect, since it makes it explicit that we
> want an unsigned char.

 But then we should change all the packages that use toupper and
tolower... (and also isdigit and isalnum and ...).

 Basically, if toupper() asserts in this case, it's a bug in the
toolchain, not in the package.


 Regards,
 Arnout

> 
>> Signed-Off-By: Paul Cercueil <paul@crapouillou.net>
>> Acked-By: Maarten ter Huurne <maarten@treewalker.org>
> 
> Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
> 
> Regards,
> Yann E. MORIN.
> 
>> ---
>>  package/dialog/dialog-toupper.patch | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>  create mode 100644 package/dialog/dialog-toupper.patch
>>
>> diff --git a/package/dialog/dialog-toupper.patch b/package/dialog/dialog-toupper.patch
>> new file mode 100644
>> index 0000000..3fe0e19
>> --- /dev/null
>> +++ b/package/dialog/dialog-toupper.patch
>> @@ -0,0 +1,13 @@
>> +diff --git a/dlg_keys.h b/dlg_keys.h
>> +index 6a96c0f..b7b42d9 100644
>> +--- a/dlg_keys.h
>> ++++ b/dlg_keys.h
>> +@@ -31,7 +31,7 @@
>> + #define dlg_toupper(ch) towupper((wint_t)ch)
>> + #define dlg_isupper(ch) iswupper((wint_t)ch)
>> + #else
>> +-#define dlg_toupper(ch) toupper(ch)
>> ++#define dlg_toupper(ch) toupper((unsigned char)(ch))
>> + #define dlg_isupper(ch) (isalpha(ch) && isupper(ch))
>> + #endif
>> + 
>> -- 
>> 1.9.0
>>
>> _______________________________________________
>> buildroot mailing list
>> buildroot@busybox.net
>> http://lists.busybox.net/mailman/listinfo/buildroot
>
Paul Cercueil - April 8, 2014, 4:56 p.m.
Hi,

On 04/04/2014 08:45, Arnout Vandecappelle wrote:
> On 03/04/14 23:32, Yann E. MORIN wrote:
>> Paul, All,
>>
>> On 2014-04-03 23:01 +0200, Paul Cercueil spake thusly:
>>> On some platforms (e.g. MIPS), the "char" type is signed by default.
>>> The problem is that toupper() takes an int as argument: a signed
>>> char then gets sign-extended to 32bit, which causes an assertion
>>> failure as toupper() verifies that its argument fits in 8 bits.
>> I could not vefrify the assertion that MIPS has signed chars.
>   I think char is signed on all platforms. At least, it is on all
> platforms I regularly use.
>

Well I read that ARM has unsigned chars, but I can't seem to find a real 
list.

>> However, this change is still corect, since it makes it explicit that we
>> want an unsigned char.
>   But then we should change all the packages that use toupper and
> tolower... (and also isdigit and isalnum and ...).
>
>   Basically, if toupper() asserts in this case, it's a bug in the
> toolchain, not in the package.

So you suggest to patch uClibc instead?

>
>
>   Regards,
>   Arnout
>
>>> Signed-Off-By: Paul Cercueil <paul@crapouillou.net>
>>> Acked-By: Maarten ter Huurne <maarten@treewalker.org>
>> Reviewed-by: "Yann E. MORIN" <yann.morin.1998@free.fr>
>>
>> Regards,
>> Yann E. MORIN.
>>
>>> ---
>>>   package/dialog/dialog-toupper.patch | 13 +++++++++++++
>>>   1 file changed, 13 insertions(+)
>>>   create mode 100644 package/dialog/dialog-toupper.patch
>>>
>>> diff --git a/package/dialog/dialog-toupper.patch b/package/dialog/dialog-toupper.patch
>>> new file mode 100644
>>> index 0000000..3fe0e19
>>> --- /dev/null
>>> +++ b/package/dialog/dialog-toupper.patch
>>> @@ -0,0 +1,13 @@
>>> +diff --git a/dlg_keys.h b/dlg_keys.h
>>> +index 6a96c0f..b7b42d9 100644
>>> +--- a/dlg_keys.h
>>> ++++ b/dlg_keys.h
>>> +@@ -31,7 +31,7 @@
>>> + #define dlg_toupper(ch) towupper((wint_t)ch)
>>> + #define dlg_isupper(ch) iswupper((wint_t)ch)
>>> + #else
>>> +-#define dlg_toupper(ch) toupper(ch)
>>> ++#define dlg_toupper(ch) toupper((unsigned char)(ch))
>>> + #define dlg_isupper(ch) (isalpha(ch) && isupper(ch))
>>> + #endif
>>> +
>>> -- 
>>> 1.9.0
>>>
>>> _______________________________________________
>>> buildroot mailing list
>>> buildroot@busybox.net
>>> http://lists.busybox.net/mailman/listinfo/buildroot
>
Thomas De Schampheleire - April 10, 2014, 8:40 a.m.
Hi,

On Fri, Apr 4, 2014 at 8:45 AM, Arnout Vandecappelle <arnout@mind.be> wrote:
> On 03/04/14 23:32, Yann E. MORIN wrote:
>> Paul, All,
>>
>> On 2014-04-03 23:01 +0200, Paul Cercueil spake thusly:
>>> On some platforms (e.g. MIPS), the "char" type is signed by default.
>>> The problem is that toupper() takes an int as argument: a signed
>>> char then gets sign-extended to 32bit, which causes an assertion
>>> failure as toupper() verifies that its argument fits in 8 bits.
>>
>> I could not vefrify the assertion that MIPS has signed chars.
>
>  I think char is signed on all platforms. At least, it is on all
> platforms I regularly use.

This is not correct.
MIPS and x86 have signed char as default, while ARM and PPC have
unsigned char by default.
Take into account that the compiler can be told to switch the default,
using -fsigned-char or -funsigned-char.


>>
>> However, this change is still corect, since it makes it explicit that we
>> want an unsigned char.
>
>  But then we should change all the packages that use toupper and
> tolower... (and also isdigit and isalnum and ...).
>
>  Basically, if toupper() asserts in this case, it's a bug in the
> toolchain, not in the package.

I wouldn't classify it as a bug in the toolchain, but rather as an
unfortunate API choice in toupper and friends, or alternatively as an
unfortunate failing of the C specification to explicitly define 'char'
as being signed or unsigned.

Best regards,
Thomas
Arnout Vandecappelle - April 10, 2014, 9:44 a.m.
On 10/04/14 10:40, Thomas De Schampheleire wrote:
> Hi,
> 
> On Fri, Apr 4, 2014 at 8:45 AM, Arnout Vandecappelle <arnout@mind.be> wrote:
>> On 03/04/14 23:32, Yann E. MORIN wrote:
>>> Paul, All,
>>>
>>> On 2014-04-03 23:01 +0200, Paul Cercueil spake thusly:
[snip]
>>> However, this change is still corect, since it makes it explicit that we
>>> want an unsigned char.
>>
>>  But then we should change all the packages that use toupper and
>> tolower... (and also isdigit and isalnum and ...).
>>
>>  Basically, if toupper() asserts in this case, it's a bug in the
>> toolchain, not in the package.
> 
> I wouldn't classify it as a bug in the toolchain, but rather as an
> unfortunate API choice in toupper and friends, or alternatively as an
> unfortunate failing of the C specification to explicitly define 'char'
> as being signed or unsigned.

 Yeah, the bane of backward compatibility... Some of these functions
existed before there were prototypes in C.


 Still, I'd say that not allowing a negative signed char value for these
functions is incorrect. uClibc at least _tries_ to handle it:

[libc/misc/ctype/ctype.c]
#ifdef __UCLIBC_HAS_CTYPE_SIGNED__

#if EOF >= CHAR_MIN
#define CTYPE_DOMAIN_CHECK(C) \
        (((unsigned int)((C) - CHAR_MIN)) <= (UCHAR_MAX - CHAR_MIN))
#else
#define CTYPE_DOMAIN_CHECK(C) \
        ((((unsigned int)((C) - CHAR_MIN)) <= (UCHAR_MAX - CHAR_MIN)) ||
((C) == EOF))
#endif

#else  /* __UCLIBC_HAS_CTYPE_SIGNED__ */

#if EOF == -1
#define CTYPE_DOMAIN_CHECK(C) \
        (((unsigned int)((C) - EOF)) <= (UCHAR_MAX - EOF))
#else
#define CTYPE_DOMAIN_CHECK(C) \
        ((((unsigned int)(C)) <= UCHAR_MAX) || ((C) == EOF))
#endif

#endif /* __UCLIBC_HAS_CTYPE_SIGNED__ */



 Regards,
 Arnout
Thomas De Schampheleire - July 19, 2014, 2:06 p.m.
Hi,

On Thu, Apr 10, 2014 at 11:44 AM, Arnout Vandecappelle <arnout@mind.be> wrote:
> On 10/04/14 10:40, Thomas De Schampheleire wrote:
>> Hi,
>>
>> On Fri, Apr 4, 2014 at 8:45 AM, Arnout Vandecappelle <arnout@mind.be> wrote:
>>> On 03/04/14 23:32, Yann E. MORIN wrote:
>>>> Paul, All,
>>>>
>>>> On 2014-04-03 23:01 +0200, Paul Cercueil spake thusly:
> [snip]
>>>> However, this change is still corect, since it makes it explicit that we
>>>> want an unsigned char.
>>>
>>>  But then we should change all the packages that use toupper and
>>> tolower... (and also isdigit and isalnum and ...).
>>>
>>>  Basically, if toupper() asserts in this case, it's a bug in the
>>> toolchain, not in the package.
>>
>> I wouldn't classify it as a bug in the toolchain, but rather as an
>> unfortunate API choice in toupper and friends, or alternatively as an
>> unfortunate failing of the C specification to explicitly define 'char'
>> as being signed or unsigned.
>
>  Yeah, the bane of backward compatibility... Some of these functions
> existed before there were prototypes in C.
>
>
>  Still, I'd say that not allowing a negative signed char value for these
> functions is incorrect. uClibc at least _tries_ to handle it:

The man page of toupper says: (Linux man-pages)

 "If c is not an unsigned char value, or EOF, the behavior of these
functions is undefined."

According to me, the patch (http://patchwork.ozlabs.org/patch/336778/)
is a valid one. The dialog sources should only call toupper with an
unsigned char. There are different ways to fixing this, but the one
proposed in the patch seems fine.

As mentioned before, the statements made by the submitter are correct:
MIPS does indeed have a signed-char by default, it will do
sign-extension on the implicit cast from char to int.

Hence:

Reviewed-by: Thomas De Schampheleire <thomas.de.schampheleire@gmail.com>

Best regards,
Thomas
Paul Cercueil - Aug. 15, 2014, 11 p.m.
Hi,

I remove that patch from patchwork. Dialog has been bumped to version 
1.2-20140219 which doesn't require this fix anymore.

Paul Cercueil

Le 03/04/2014 23:01, Paul Cercueil a écrit :
> On some platforms (e.g. MIPS), the "char" type is signed by default.
> The problem is that toupper() takes an int as argument: a signed
> char then gets sign-extended to 32bit, which causes an assertion
> failure as toupper() verifies that its argument fits in 8 bits.
>
> Signed-Off-By: Paul Cercueil <paul@crapouillou.net>
> Acked-By: Maarten ter Huurne <maarten@treewalker.org>
> ---
>   package/dialog/dialog-toupper.patch | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
>   create mode 100644 package/dialog/dialog-toupper.patch
>
> diff --git a/package/dialog/dialog-toupper.patch b/package/dialog/dialog-toupper.patch
> new file mode 100644
> index 0000000..3fe0e19
> --- /dev/null
> +++ b/package/dialog/dialog-toupper.patch
> @@ -0,0 +1,13 @@
> +diff --git a/dlg_keys.h b/dlg_keys.h
> +index 6a96c0f..b7b42d9 100644
> +--- a/dlg_keys.h
> ++++ b/dlg_keys.h
> +@@ -31,7 +31,7 @@
> + #define dlg_toupper(ch) towupper((wint_t)ch)
> + #define dlg_isupper(ch) iswupper((wint_t)ch)
> + #else
> +-#define dlg_toupper(ch) toupper(ch)
> ++#define dlg_toupper(ch) toupper((unsigned char)(ch))
> + #define dlg_isupper(ch) (isalpha(ch) && isupper(ch))
> + #endif
> +
>

Patch

diff --git a/package/dialog/dialog-toupper.patch b/package/dialog/dialog-toupper.patch
new file mode 100644
index 0000000..3fe0e19
--- /dev/null
+++ b/package/dialog/dialog-toupper.patch
@@ -0,0 +1,13 @@ 
+diff --git a/dlg_keys.h b/dlg_keys.h
+index 6a96c0f..b7b42d9 100644
+--- a/dlg_keys.h
++++ b/dlg_keys.h
+@@ -31,7 +31,7 @@
+ #define dlg_toupper(ch) towupper((wint_t)ch)
+ #define dlg_isupper(ch) iswupper((wint_t)ch)
+ #else
+-#define dlg_toupper(ch) toupper(ch)
++#define dlg_toupper(ch) toupper((unsigned char)(ch))
+ #define dlg_isupper(ch) (isalpha(ch) && isupper(ch))
+ #endif
+