Message ID | 8ac62fe2-e4bf-0922-4947-fca9567a0561@gmail.com |
---|---|
Headers | show |
Series | detect quoting and punctuation problems in diagnostics | expand |
On 5/14/19 11:31 PM, Martin Sebor wrote: > Near the end of every release a bunch of problem reports are > raised for various punctuation, quoting, and spelling issues > in GCC. In GCC 9 a total 28 such issues were submitted. Hi Martin. Great that you prepared the patchset. > A fair number of them are discovered as new or changed > diagnostics are being translated. A checker was developed > to help find some of these as well many others by scanning > the gcc.pot file: contrib/check-internal-format-escaping.py. > The checker automates the process of finding these issues but > it doesn't prevent them. Being external to GCC the checker > cannot easily distinguish between message strings that are > expected to be translated and those that aren't. > > To help avoid introducing the most glaring subset of these > problems as early as possible while making it possible to > differentiate messages that are expected to conform from those > that need not, the attached patch adds a simple checker to GCC: > -Wformat-diag. My goal is to improve consistency of messages > and relieve the burden at the end of each release both on > translators and those who then have to fix the problems. I like the idea! One limitation compared to the linter is that one needs to have a test-suite coverage for errors and warnings. Moreover, one needs to have that for all target supported by GCC. > > The warning detects some of the following: > > * unquoted operators consisting of two or more characters > (e.g., == or ?:) > * unquoted keywords, types, GCC built-ins and command line > options, or identifiers with underscores > * unquoted apostrophes and contractions like in can't > * informal abbreviations (arg or reg) > * unbalanced paired tokens (brackets, parentheses, or quotes) > * duplicate or trailing spaces (it allows multiple spaces at > the beginning of a diagnostic since that's used by the C++ > front-end as an "indentation") > * unintended control characters > * inconsistent sentence capitalization You implemented very complex checks! Nice. > > To avoid diagnosing direct calls to pretty-printer functions > that often format parts of the same diagnostic in multiple steps, > the solution introduces the "_raw__" suffix to the __gcc_diag__ > attributes that the warning then uses to avoid checking those > calls. Functions with the raw attributes aren't checked. > > I tested the patch kit as a whole in an x86_64-linux build and > fixed all the issues it pointed out except for 43 instances of > the new warning in the Go front-end that I wasn't sure how best > to handle (I will follow up on that with Ian). I also adjusted > the affected tests. > > There were just a few issues specific to the i386 back-end but they > had an impact on 9 tests. The impact of fixing the same kinds of > issues in other back-ends is likely to be similar. Rather than > trying to do the clean up across all supported targets I think it's > better to handle the rest of the issues over time and with the help > of back end maintainers who can more easily verify that tests still > pass. With than in mind, I have prevented the warning from causing > bootstrap failures. Once the cleanup is complete I expect to remove > this and include the new warning in -Werror. The approach works form me. Martin > > Since a few files in GCC "misuse" the diagnostic machinery for > debugging output leading up to internal errors (instead of > calling a pp_xxx function) I suppressed the new warning in those > files via #pragma GCC diagnostic ignored "-Wformat-diag". If we > agree that calling pp_format or some such is the way to go I can > work replacing those calls as a subsequent step. > > The subset should cover the following bug reports (some already > resolved): > > 90176 - diagnostics should generally contain underscore only > inside quotes > 90162 - exclamation mark in diagnostic!!!!!1111!!!! > 90158 - aarch64: wrong quotation in diagnostics > 90157 - aarch64: unnecessary abbreviation in diagnostic > 90156 - add linter check suggesting to replace %<%s%> with %qs > 90149 - diagnostics containing BIT_FIELD_REF don't conform to > diagnostics guideline > 90121 - extra space in error message > 90117 - Replace %<%s%> with %qs > 90011 - [9 Regression] trailing space in diagnostic > 79477 - Please write code more translator-friendly (unquoted options) > 89936 - wrong punctuation in tree-profile.c (%<%s%>) > > Although the changes are mechanical, since I made them all by hand > I broke up the patch into a series to make it easier to review: > > [PATCH 1/12] implement -Wformat-diag to detect quoting and spelling > issues in GCC diagnostics > [PATCH 2/12] fix diagnostic quoting/spelling in ada > [PATCH 3/12] fix diagnostic quoting/spelling in Brig > [PATCH 4/12] fix diagnostic quoting/spelling in the C front-end > [PATCH 5/12] fix diagnostic quoting/spelling in c-family > [PATCH 6/12] fix diagnostic quoting/spelling in C++ > [PATCH 7/12] fix diagnostic quoting/spelling in libgcc > [PATCH 8/12] fix diagnostic quoting/spelling in the middle-end > [PATCH 9/12] adjust tests to quoting/spelling diagnostics fixes > [PATCH 10/12] fix diagnostic quoting/spelling in D > [PATCH 11/12] fix diagnostic quoting/spelling issues in i386 back-end > [PATCH 12/12] fix diagnostic quoting/spelling issues in ObjC > > Martin
Hi. Maybe I've install the patches wrongly, but I see following error on ppc64le during bootstrap in stage2: /home/marxin/Programming/gcc/objdir/./prev-gcc/xg++ -B/home/marxin/Programming/gcc/objdir/./prev-gcc/ -B/usr/local/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++ -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknn-linux-gnu/libstdc++-v3/src/.libs -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/homearxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/home/marxin/Programming/gcc/libstdc++-v3/libsupc++ -L/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home/marxin/Programming/gcc/objr/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-tribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../../gcc/../libcnumber/dpd -I../libdecnumber -I../../gcc/../libbacktrace -o tree.o -MT tree.o -MMD -MP -MF ./.deps/tree.TPo ../../gcc/tree.c ../../gcc/tree.c: In function ‘void omp_clause_check_failed(const_tree, const char*, int, const char*, omp_clause_code)’: ../../gcc/tree.c:10012:67: error: unknown conversion type character ‘k’ in format [-Werror=format=] 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " | ^ ../../gcc/tree.c:10013:10: error: format ‘%s’ expects argument of type ‘char*’, but argument 3 has type ‘int’ [-Werror=format=] 10013 | "in %s, at %s:%d", | ~^ | | | char* | %d ../../gcc/tree.c:10013:20: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘const char*’ [-Werror=format=] 10013 | "in %s, at %s:%d", | ~^ | | | int | %s 10014 | omp_clause_code_name[code], TREE_CODE (node), 10015 | function, trim_filename (file), line); | ~~~~~~~~~~~~~~~~~~~~ | | | const char* ../../gcc/tree.c:10012:19: error: too many arguments for format [-Werror=format-extra-args] 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 10013 | "in %s, at %s:%d", | ~~~~~~~~~~~~~~~~~ Martin
On 5/16/19 4:02 AM, Martin Liška wrote: > Hi. > > Maybe I've install the patches wrongly, but I see following error on ppc64le > during bootstrap in stage2: I also noticed it yesterday on x86_64. The %qk was vestige of an earlier attempt to use the pretty-printer to format TREE_CODEs. I have this in my tree that fixes it but let me post an updated patch. --- a/gcc/tree.c +++ b/gcc/tree.c @@ -10009,8 +10009,10 @@ void omp_clause_check_failed (const_tree node, const char *file, int line, const char *function, enum omp_clause_code code) { - internal_error ("tree check: expected omp_clause %s, have %s in %s, at %s:%d", - omp_clause_code_name[code], get_tree_code_name (TREE_CODE (node)), + internal_error ("tree check: expected %<omp_clause %s%>, have %qs " + "in %s, at %s:%d", + omp_clause_code_name[code], + get_tree_code_name (TREE_CODE (node)), function, trim_filename (file), line); } As a heads up, my latest log still shows a few testsuite failures that I need to clean up. Those I've looked at are all missing adjustments to expected dg-warning output. ! FAIL: 20_util/any/misc/any_cast_neg.cc (3: +3) ! FAIL: gcc.dg/gcc_diag-11.c (1: +1) ! FAIL: g++.dg/ubsan/pr63956.C (21: +21) ! FAIL: gnat.dg/inline3.adb (2: +2) ! FAIL: gnat.dg/inline5.adb (2: +2) ! FAIL: gnat.dg/inline7.adb (2: +2) ! FAIL: gnat.dg/inline9.adb (2: +2) ! FAIL: objc.dg/method-19.m (2: +2) ! FAIL: objc.dg/protocol-qualifier-2.m (2: +2) ! FAIL: obj-c++.dg/protocol-qualifier-2.mm (2: +2) Martin > > /home/marxin/Programming/gcc/objdir/./prev-gcc/xg++ -B/home/marxin/Programming/gcc/objdir/./prev-gcc/ -B/usr/local/powerpc64le-unknown-linux-gnu/bin/ -nostdinc++ -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknn-linux-gnu/libstdc++-v3/src/.libs -B/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/homearxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/home/marxin/Programming/gcc/libstdc++-v3/libsupc++ -L/home/marxin/Programming/gcc/objdir/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/src/.libs -L/home/marxin/Programming/gcc/objr/prev-powerpc64le-unknown-linux-gnu/libstdc++-v3/libsupc++/.libs -fno-PIE -c -g -O2 -fno-checking -gtoggle -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wno-error=format-diag -Wmissing-format-tribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -I../../gcc/../libdecnumber -I../../gcc/../libcnumber/dpd -I../libdecnumber -I../../gcc/../libbacktrace -o tree.o -MT tree.o -MMD -MP -MF ./.deps/tree.TPo ../../gcc/tree.c > ../../gcc/tree.c: In function ‘void omp_clause_check_failed(const_tree, const char*, int, const char*, omp_clause_code)’: > ../../gcc/tree.c:10012:67: error: unknown conversion type character ‘k’ in format [-Werror=format=] > 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " > | ^ > ../../gcc/tree.c:10013:10: error: format ‘%s’ expects argument of type ‘char*’, but argument 3 has type ‘int’ [-Werror=format=] > 10013 | "in %s, at %s:%d", > | ~^ > | | > | char* > | %d > ../../gcc/tree.c:10013:20: error: format ‘%d’ expects argument of type ‘int’, but argument 5 has type ‘const char*’ [-Werror=format=] > 10013 | "in %s, at %s:%d", > | ~^ > | | > | int > | %s > 10014 | omp_clause_code_name[code], TREE_CODE (node), > 10015 | function, trim_filename (file), line); > | ~~~~~~~~~~~~~~~~~~~~ > | | > | const char* > ../../gcc/tree.c:10012:19: error: too many arguments for format [-Werror=format-extra-args] > 10012 | internal_error ("tree check: expected %<omp_clause %s%>, have %qk " > | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > 10013 | "in %s, at %s:%d", > | ~~~~~~~~~~~~~~~~~ > > Martin >