Message ID | m3oao06pj3.fsf-ueno@gnu.org |
---|---|
State | New |
Headers | show |
On 03/11/2015 02:01 AM, Daiki Ueno wrote: > It is surprising that there are no checks of lengths/offsets read from > MO files. Currently, I'm thinking of the attached patch (to gettext), > which is a bit complicated. If anyone could suggest a cleaner approach, > I'd appreciate it. Why does it surprise you? The MO files are writable only by root, so it's not a security issue because if you could write to them you'd be root, and you'd have full access to the system anyway. The other alternative is that the filesystem is corrupted and loading the MO file crashes your application. This is expected since the filesystem is corrupted. You are suggesting we add some rather complex checking for the possibly low probability case of a corrupted filesystem. If the filesystem is corrupted other things will be failing and you need to fix the corruption. What strong technical reasons do you have for propsing these additional checks? Cheers, Carlos.
On 11 Mar 2015 02:39, Carlos O'Donell wrote: > On 03/11/2015 02:01 AM, Daiki Ueno wrote: > > It is surprising that there are no checks of lengths/offsets read from > > MO files. Currently, I'm thinking of the attached patch (to gettext), > > which is a bit complicated. If anyone could suggest a cleaner approach, > > I'd appreciate it. > > Why does it surprise you? > > The MO files are writable only by root, so it's not a security issue > because if you could write to them you'd be root, and you'd have > full access to the system anyway. > > The other alternative is that the filesystem is corrupted and loading > the MO file crashes your application. This is expected since the > filesystem is corrupted. You are suggesting we add some rather complex > checking for the possibly low probability case of a corrupted > filesystem. If the filesystem is corrupted other things will be failing > and you need to fix the corruption. > > What strong technical reasons do you have for propsing these additional > checks? i thought you could control things via $TEXTDOMAIN/$TEXTDOMAINDIR, but it looks like just `bash` and `gettext` respect those ? so if you have a shell script that either directly supports translated messages (e.g. bash's $"..."), or indirectly (e.g. manually calling `gettext`), and it doesn't lock down the TEXTDOMAINDIR envvar properly, you could get them to load untrusted data and crash due to the omitted range checks in glibc ? i'm not really familiar with how much gettext relies on glibc though or if it just entirely uses its own copy of code. using Debian's code search [1], it looks like git provides GIT_TEXTDOMAINDIR to override the default TEXTDOMAINDIR. i stopped at page ~6 ;). -mike [1] http://codesearch.debian.net/perpackage-results/TEXTDOMAINDIR/
Mike Frysinger <vapier@gentoo.org> writes: >> What strong technical reasons do you have for propsing these additional >> checks? > > i thought you could control things via $TEXTDOMAIN/$TEXTDOMAINDIR, but it looks > like just `bash` and `gettext` respect those ? so if you have a shell script > that either directly supports translated messages (e.g. bash's $"..."), or > indirectly (e.g. manually calling `gettext`), and it doesn't lock down the > TEXTDOMAINDIR envvar properly, you could get them to load untrusted data and > crash due to the omitted range checks in glibc ? bindtextdomain is the only place to configure the location, and it seems to be the design: http://thread.gmane.org/gmane.comp.lib.glibc.alpha/575 However, I too observed a few programs which use the location obtained from environment variable. Perhaps it would be nice to suggest using the fixed location in the documentation. Regards, -- Daiki Ueno
Carlos O'Donell wrote: > The MO files are writable only by root, so it's not a security issue > because if you could write to them you'd be root, and you'd have > full access to the system anyway. Your argument is similar to Ulrich Drepper's argument: MO files are part of the distribution of a package, like executables and shared libraries. glibc does not check against invalid offsets in shared libraries either, and the kernel does not check against illegal instructions that happen to exist in executables and shared libraries. But these arguments don't consider the LANGUAGE variable. The original intent of LANGUAGE was that it contains colon-separated language or locale identifiers. But in fact, you can specify relative files names that start with "../", and thus you can make the _nl_load_domain function in glibc access files anywhere in the file system. For example: $ LANGUAGE=../../../../../../../../../../../../../../tmp/hack/crashing-mos strace cp . . ... open("/usr/share/locale/../../../../../../../../../../../../../../tmp/hack/crashing-mos/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) open("/usr/share/locale-langpack/../../../../../../../../../../../../../../tmp/hack/crashing-mos/LC_MESSAGES/coreutils.mo", O_RDONLY) = -1 ENOENT (No such file or directory) ... If I had put a hacked .mo file at /tmp/hack/crashing-mos/LC_MESSAGES/coreutils.mo I would have crashed the 'cp' program from coreutils. Likewise with any program from any package. Bruno
On 03/12/2015 02:04 AM, Bruno Haible wrote: > But these arguments don't consider the LANGUAGE variable. The original > intent of LANGUAGE was that it contains colon-separated language or locale > identifiers. But in fact, you can specify relative files names that start > with "../", and thus you can make the _nl_load_domain function in glibc > access files anywhere in the file system. Yes, this is bug 17142.
On 03/13/2015 10:29 AM, Florian Weimer wrote: > On 03/12/2015 02:04 AM, Bruno Haible wrote: > >> But these arguments don't consider the LANGUAGE variable. The original >> intent of LANGUAGE was that it contains colon-separated language or locale >> identifiers. But in fact, you can specify relative files names that start >> with "../", and thus you can make the _nl_load_domain function in glibc >> access files anywhere in the file system. > > Yes, this is bug 17142. In my opinion we need to restrict LANGUAGE just like we restricted all other the other variables in CVE-2014-0475. Cheers, Carlos.
"Carlos O'Donell" <carlos@redhat.com> writes: > On 03/13/2015 10:29 AM, Florian Weimer wrote: >> On 03/12/2015 02:04 AM, Bruno Haible wrote: >> >>> But these arguments don't consider the LANGUAGE variable. The original >>> intent of LANGUAGE was that it contains colon-separated language or locale >>> identifiers. But in fact, you can specify relative files names that start >>> with "../", and thus you can make the _nl_load_domain function in glibc >>> access files anywhere in the file system. >> >> Yes, this is bug 17142. > > In my opinion we need to restrict LANGUAGE just like we restricted all > other the other variables in CVE-2014-0475. I agree. Now that intl/ is almost[1] synchronized with gettext, what's blocking this? I'm happy to include the patch in the upcoming gettext release so non-glibc consumers also benefit from it. Footnotes: [1] except a minor adjustment for non-glibc systems: https://lists.gnu.org/archive/html/bug-gettext/2015-01/msg00029.html and absolute pathname handling, which I guess could be reverted because of not much use: http://git.savannah.gnu.org/cgit/gettext.git/commit/?id=279b57fc Regards,
On 03/20/2015 02:06 AM, Daiki Ueno wrote: > I agree. Now that intl/ is almost[1] synchronized with gettext, what's > blocking this? I'm happy to include the patch in the upcoming gettext > release so non-glibc consumers also benefit from it. The patch will use getauxval(AT_SECURE) or __libc_enable_secure (or issetuugid on other systems, but which I cannot test). It is not going to be very portable.
Florian Weimer <fweimer@redhat.com> writes: > The patch will use getauxval(AT_SECURE) or __libc_enable_secure (or > issetuugid on other systems, but which I cannot test). It is not going > to be very portable. I see (though I'm a bit confused that you removed the use of __libc_enable_secure in CVE-2014-0475). Can't you use secure_getenv, which Gnulib provides a replacement, compare the result with the normal getenv, and apply the pathname check if needed? Regards,
On 03/21/2015 04:17 AM, Daiki Ueno wrote: > Florian Weimer <fweimer@redhat.com> writes: > >> The patch will use getauxval(AT_SECURE) or __libc_enable_secure (or >> issetuugid on other systems, but which I cannot test). It is not going >> to be very portable. > > I see (though I'm a bit confused that you removed the use of > __libc_enable_secure in CVE-2014-0475). Can't you use secure_getenv, > which Gnulib provides a replacement, compare the result with > the normal getenv, and apply the pathname check if needed? Hmm, I was under the impression that absolute paths for LANGUAGE were a supported feature. If that's not the case, we can just reject directory traversal and confine lookups to the system locale directory, like we did for the other locale files.
From 8909a24b692e38472d7c05911f4b65eb24292c66 Mon Sep 17 00:00:00 2001 From: Daiki Ueno <ueno@gnu.org> Date: Wed, 11 Mar 2015 14:43:34 +0900 Subject: [PATCH] intl: Proof against invalid offset/length * gettext-runtime/intl/loadmsgcat.c (_nl_load_domain): Check the upper bound of offset/length values read from MO file. * gettext-tools/tests/gettext-9: New file. * gettext-tools/tests/Makefile.am (TESTS): Add new test. --- gettext-runtime/intl/ChangeLog | 6 ++ gettext-runtime/intl/loadmsgcat.c | 132 ++++++++++++++++++++++++++++++-------- gettext-tools/tests/ChangeLog | 5 ++ gettext-tools/tests/Makefile.am | 2 +- gettext-tools/tests/gettext-9 | 24 +++++++ 5 files changed, 142 insertions(+), 27 deletions(-) create mode 100755 gettext-tools/tests/gettext-9 diff --git a/gettext-runtime/intl/ChangeLog b/gettext-runtime/intl/ChangeLog index d32774c..ed0a14d 100644 --- a/gettext-runtime/intl/ChangeLog +++ b/gettext-runtime/intl/ChangeLog @@ -1,3 +1,9 @@ +2015-03-11 Daiki Ueno <ueno@gnu.org> + + intl: Proof against invalid offset/length + * loadmsgcat.c (_nl_load_domain): Check the upper bound of + offset/length values read from MO file. + 2015-01-22 Daiki Ueno <ueno@gnu.org> intl: Update from gnulib diff --git a/gettext-runtime/intl/loadmsgcat.c b/gettext-runtime/intl/loadmsgcat.c index 8eb77d8..b97b316 100644 --- a/gettext-runtime/intl/loadmsgcat.c +++ b/gettext-runtime/intl/loadmsgcat.c @@ -799,6 +799,8 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, int revision; const char *nullentry; size_t nullentrylen; + size_t orig_tab_offset; + size_t trans_tab_offset; __libc_lock_lock_recursive (lock); if (domain_file->decided != 0) @@ -921,6 +923,7 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, domain->mmap_size = size; domain->must_swap = data->magic != _MAGIC; domain->malloced = NULL; + domain->hash_tab = NULL; /* Fill in the information about the available tables. */ revision = W (domain->must_swap, data->revision); @@ -930,16 +933,32 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, case 0: case 1: domain->nstrings = W (domain->must_swap, data->nstrings); + orig_tab_offset = W (domain->must_swap, data->orig_tab_offset); + if (__builtin_expect (orig_tab_offset >= size, 0)) + goto invalid; domain->orig_tab = (const struct string_desc *) - ((char *) data + W (domain->must_swap, data->orig_tab_offset)); + ((char *) data + orig_tab_offset); + trans_tab_offset = W (domain->must_swap, data->trans_tab_offset); + if (__builtin_expect (trans_tab_offset >= size, 0)) + goto invalid; domain->trans_tab = (const struct string_desc *) - ((char *) data + W (domain->must_swap, data->trans_tab_offset)); + ((char *) data + trans_tab_offset); domain->hash_size = W (domain->must_swap, data->hash_tab_size); - domain->hash_tab = - (domain->hash_size > 2 - ? (const nls_uint32 *) - ((char *) data + W (domain->must_swap, data->hash_tab_offset)) - : NULL); + if (__builtin_expect (domain->hash_size + >= SIZE_MAX / sizeof (nls_uint32), 0)) + goto invalid; + if (domain->hash_size > 2) + { + size_t hash_tab_byte_size = domain->hash_size * sizeof (nls_uint32); + size_t hash_tab_offset = W (domain->must_swap, data->hash_tab_offset); + if (__builtin_expect (hash_tab_offset + >= SIZE_MAX - hash_tab_byte_size, 0) + || __builtin_expect (hash_tab_offset + hash_tab_byte_size + >= size, 0)) + goto invalid; + domain->hash_tab = (const nls_uint32 *) + ((char *) data + hash_tab_offset); + } domain->must_swap_hash_tab = domain->must_swap; /* Now dispatch on the minor revision. */ @@ -968,6 +987,9 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, const char **sysdep_segment_values; const nls_uint32 *orig_sysdep_tab; const nls_uint32 *trans_sysdep_tab; + size_t sysdep_segments_offset; + size_t orig_sysdep_tab_offset; + size_t trans_sysdep_tab_offset; nls_uint32 n_inmem_sysdep_strings; size_t memneed; char *mem; @@ -979,20 +1001,30 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, /* Get the values of the system dependent segments. */ n_sysdep_segments = W (domain->must_swap, data->n_sysdep_segments); + sysdep_segments_offset = + W (domain->must_swap, data->sysdep_segments_offset); + if (__builtin_expect (sysdep_segments_offset >= size, 0)) + goto invalid; sysdep_segments = (const struct sysdep_segment *) - ((char *) data - + W (domain->must_swap, data->sysdep_segments_offset)); + ((char *) data + sysdep_segments_offset); sysdep_segment_values = (const char **) alloca (n_sysdep_segments * sizeof (const char *)); for (i = 0; i < n_sysdep_segments; i++) { - const char *name = - (char *) data - + W (domain->must_swap, sysdep_segments[i].offset); + const char *name; + size_t nameoff = + W (domain->must_swap, sysdep_segments[i].offset); nls_uint32 namelen = W (domain->must_swap, sysdep_segments[i].length); + if (__builtin_expect (namelen >= SIZE_MAX - nameoff, 0) + || __builtin_expect (nameoff + namelen >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + name = (char *) data + nameoff; if (!(namelen > 0 && name[namelen - 1] == '\0')) { freea (sysdep_segment_values); @@ -1002,12 +1034,24 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, sysdep_segment_values[i] = get_sysdep_segment_value (name); } + orig_sysdep_tab_offset = + W (domain->must_swap, data->orig_sysdep_tab_offset); + if (__builtin_expect (orig_sysdep_tab_offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } orig_sysdep_tab = (const nls_uint32 *) - ((char *) data - + W (domain->must_swap, data->orig_sysdep_tab_offset)); + ((char *) data + orig_sysdep_tab_offset); + trans_sysdep_tab_offset = + W (domain->must_swap, data->trans_sysdep_tab_offset); + if (__builtin_expect (trans_sysdep_tab_offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } trans_sysdep_tab = (const nls_uint32 *) - ((char *) data - + W (domain->must_swap, data->trans_sysdep_tab_offset)); + ((char *) data + trans_sysdep_tab_offset); /* Compute the amount of additional memory needed for the system dependent strings and the augmented hash table. @@ -1022,22 +1066,52 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, for (j = 0; j < 2; j++) { - const struct sysdep_string *sysdep_string = - (const struct sysdep_string *) - ((char *) data - + W (domain->must_swap, - j == 0 - ? orig_sysdep_tab[i] - : trans_sysdep_tab[i])); + size_t sysdep_string_offset; + const struct sysdep_string *sysdep_string; + size_t offset; size_t need = 0; - const struct segment_pair *p = sysdep_string->segments; + const struct segment_pair *p; + sysdep_string_offset = W (domain->must_swap, + j == 0 + ? orig_sysdep_tab[i] + : trans_sysdep_tab[i]); + if (__builtin_expect (sysdep_string_offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + sysdep_string = (const struct sysdep_string *) + ((char *) data + sysdep_string_offset); + + offset = W (domain->must_swap, sysdep_string->offset); + if (__builtin_expect (offset >= size, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + + p = sysdep_string->segments; if (W (domain->must_swap, p->sysdepref) != SEGMENTS_END) for (p = sysdep_string->segments;; p++) { + nls_uint32 segsize = W (domain->must_swap, + p->segsize); nls_uint32 sysdepref; + size_t n; - need += W (domain->must_swap, p->segsize); + if (__builtin_expect (segsize + >= SIZE_MAX - offset, 0) + || __builtin_expect (offset + segsize + >= size, 0) + || __builtin_expect (segsize + >= SIZE_MAX - need, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + + need += segsize; sysdepref = W (domain->must_swap, p->sysdepref); if (sysdepref == SEGMENTS_END) @@ -1057,7 +1131,13 @@ _nl_load_domain (struct loaded_l10nfile *domain_file, break; } - need += strlen (sysdep_segment_values[sysdepref]); + n = strlen (sysdep_segment_values[sysdepref]); + if (__builtin_expect (n >= SIZE_MAX - need, 0)) + { + freea (sysdep_segment_values); + goto invalid; + } + need += n; } needs[j] = need; diff --git a/gettext-tools/tests/ChangeLog b/gettext-tools/tests/ChangeLog index 72a46d4..9983a14 100644 --- a/gettext-tools/tests/ChangeLog +++ b/gettext-tools/tests/ChangeLog @@ -1,3 +1,8 @@ +2015-03-11 Daiki Ueno <ueno@gnu.org> + + * gettext-9: New file. + * Makefile.am (TESTS): Add new test. + 2015-03-05 Daiki Ueno <ueno@gnu.org> * format-kde-kuit-1: New file. diff --git a/gettext-tools/tests/Makefile.am b/gettext-tools/tests/Makefile.am index a4decaf..7c995e3 100644 --- a/gettext-tools/tests/Makefile.am +++ b/gettext-tools/tests/Makefile.am @@ -21,7 +21,7 @@ EXTRA_DIST = MOSTLYCLEANFILES = core *.stackdump TESTS = gettext-1 gettext-2 gettext-3 gettext-4 gettext-5 gettext-6 gettext-7 \ - gettext-8 \ + gettext-8 gettext-9 \ msgattrib-1 msgattrib-2 msgattrib-3 msgattrib-4 msgattrib-5 \ msgattrib-6 msgattrib-7 msgattrib-8 msgattrib-9 msgattrib-10 \ msgattrib-11 msgattrib-12 msgattrib-13 msgattrib-14 msgattrib-15 \ diff --git a/gettext-tools/tests/gettext-9 b/gettext-tools/tests/gettext-9 new file mode 100755 index 0000000..7d6d4c8 --- /dev/null +++ b/gettext-tools/tests/gettext-9 @@ -0,0 +1,24 @@ +#! /bin/sh +. "${srcdir=.}/init.sh"; path_prepend_ . ../src + +# Test invalid or incomplete input + +: ${LOCALE_FR=fr_FR} +{ test $LOCALE_FR != none && LC_ALL=$LOCALE_FR ../testlocale; } || { + if test -f /usr/bin/localedef; then + echo "Skipping test: no traditional french locale is installed" + else + echo "Skipping test: no traditional french locale is supported" + fi + exit 77 +} + +: ${GETTEXT=gettext} + +test -d fr || mkdir fr +test -d fr/LC_MESSAGES || mkdir fr/LC_MESSAGES + +for n in 1 2 3 4 5 6; do + cp "$abs_srcdir"/overflow-$n.mo fr/LC_MESSAGES + echo LANGUAGE= LC_ALL=${LOCALE_FR} TEXTDOMAINDIR=. ${GETTEXT} -d overflow-$n foo +done -- 2.1.0