diff mbox

[RFC] Introduce RUN_UNDER_VALGRIND in test-suite

Message ID 5652DD92.2030202@suse.cz
State New
Headers show

Commit Message

Martin Liška Nov. 23, 2015, 9:34 a.m. UTC
On 11/21/2015 05:26 AM, Hans-Peter Nilsson wrote:
> On Thu, 19 Nov 2015, Martin Li?ka wrote:
>> Hello.
>>
>> In last two weeks I've removed couple of memory leaks, mainly tight to middle-end.
>> Currently, a user of the GCC compiler can pass '--enable-checking=valgrind' configure option
>> that will run all commands within valgrind environment, but as the valgrind runs just with '-q' option,
>> the result is not very helpful.
>>
>> I would like to start with another approach, where we can run all tests in test-suite
>> within the valgrind sandbox and return an exit code if there's an error seen by the tool.
>> That unfortunately leads to many latent (maybe false positives, FE issues, ...) that can
>> be efficiently ignored by valgrind suppressions file (the file is part of suggested patch).
>>
>> The first version of the valgrind.supp can survive running compilation of tramp3d with -O2
>> and majority of tests in test-suite can successfully finish. Most of memory leaks
>> mentioned in the file can be eventually fixed.
> 
> I didn't quite understand the need for the suppression files.
> Is it like Markus said, only because valgrind annotations are
> not on by default?  Then let's change it so that's the default
> during DEV-PHASE = experimental (the development phase) or
> prerelease.  I really thought that was the case by now.
> (The suppression files are IMHO a useful addition to contrib/
> either way.)

Hi.

Well, the original motivation was to basically to fill up the file with all common
errors (known issues) and to fix all newly introduced issues. That can minimize
the number of errors reported by the tool.
	
However, as I run complete test-suite for all default languages, I've seen:

== Statistics ==
Total number of errors: 249615
Number of different errors: 5848

Where two errors are different if they produce either different message or back-backtrace.
For complete list of errors (sorted by # of occurrences), download:

https://docs.google.com/uc?authuser=0&id=0B0pisUJ80pO1MENrWXBzak5naFk&export=download

> 
>> As I noticed in results log files, most of remaining issues are connected to gcc.c and
>> lto-wrapper.c files. gcc.c heavily manipulates with strings and it would probably require
>> usage of a string pool, that can easily eventually removed (just in case of --enable-valgrind-annotations).
>> The second source file tends to produce memory leaks because of fork/exec constructs. However both
>> can be improved during next stage1.
>>
>> Apart from aforementioned issues, the compiler does not contain so many issues and I think it's
>> doable to prune them and rely on reported valgrind errors.
>>
>> Patch touches many .exp files, but basically does just couple of modifications:
>>
>> 1) gcc-defs.exp introduces new global variable run_under_valgrind
>> 2) new procedure dg-run-valgrind distinguishes between just passing options to 'gd-test',
>>    or runs 'dg-test' with additional flags that enable valgrind (using -wrapper)
>> 3) new procedure dg-runtest-valgrind does the similar
>> 4) many changes in corresponding *.exp files that utilize these procedures
>>
>> The patch should be definitely part of next stage1, but I would appreciate any thoughts
>> about the described approach?
> 
> IIRC you can replace the actual dg-runtest proc with your own
> (implementing a wrapper).  Grep aroung, I think we do that
> already.  That's certainly preferable instead of touching all
> callers.

You are right, the suggested patch was over-kill, wrapper should be fine for that.
Currently I've been playing with a bit different approach (suggested by Markus),
where I would like to enable valgrind in gcc.c using an environmental variable.

Question is if it should replace existing ENABLE_VALGRIND_CHECKING and how to
integrate it with a valgrind suppressions file?

Ideas are highly welcomed.

Thanks,
Martin

> 
>>
>> Thank you,
>> Martin
> 
> brgds, H-P
>

Comments

Hans-Peter Nilsson Nov. 24, 2015, 9:16 a.m. UTC | #1
On Mon, 23 Nov 2015, Martin Li?ka wrote:
> On 11/21/2015 05:26 AM, Hans-Peter Nilsson wrote:
> > On Thu, 19 Nov 2015, Martin Li?ka wrote:
> >> Hello.
> >>
> >> In last two weeks I've removed couple of memory leaks, mainly tight to middle-end.
> >> Currently, a user of the GCC compiler can pass '--enable-checking=valgrind' configure option
> >> that will run all commands within valgrind environment, but as the valgrind runs just with '-q' option,
> >> the result is not very helpful.

I disagree.  Bugs due to uninitialized memory are diagnosed and
pin-pointed, as intended.  If that has regressed, that's a bug.

> >> I would like to start with another approach, where we can run all tests in test-suite
> >> within the valgrind sandbox and return an exit code if there's an error seen by the tool.

That's exactly what --enable-checking=valgrind does (well, is
supposed to do.  (Leaks not being diagnosed as errors.)

> >> That unfortunately leads to many latent (maybe false positives, FE issues, ...) that can
> >> be efficiently ignored by valgrind suppressions file (the file is part of suggested patch).

How can you be sure about the false positives?

It sounds like you're papering over bugs, perhaps bit-rot due to
missing annotations?

> >> The first version of the valgrind.supp can survive running compilation of tramp3d with -O2
> >> and majority of tests in test-suite can successfully finish. Most of memory leaks
> >> mentioned in the file can be eventually fixed.

Then better not suppress them.  If they aren't visible with
valgrind -q perhaps due to being leaks rather than use of
uninitialized memory, then make that a separate mode.

> > I didn't quite understand the need for the suppression files.
> > Is it like Markus said, only because valgrind annotations are
> > not on by default?  Then let's change it so that's the default
> > during DEV-PHASE = experimental (the development phase) or
> > prerelease.  I really thought that was the case by now.
> > (The suppression files are IMHO a useful addition to contrib/
> > either way.)
>
> Hi.
>
> Well, the original motivation was to basically to fill up the file with all common
> errors (known issues) and to fix all newly introduced issues. That can minimize
> the number of errors reported by the tool.

There *are* lots of bugs seen with valgrind -q and those should
be fixed, not suppressed.  If you want a suppression mode, make
that separate.

> However, as I run complete test-suite for all default languages, I've seen:
>
> == Statistics ==
> Total number of errors: 249615
> Number of different errors: 5848
>
> Where two errors are different if they produce either different message or back-backtrace.
> For complete list of errors (sorted by # of occurrences), download:
>
> https://docs.google.com/uc?authuser=0&id=0B0pisUJ80pO1MENrWXBzak5naFk&export=download
>
> >
> >> As I noticed in results log files, most of remaining issues are connected to gcc.c and
> >> lto-wrapper.c files. gcc.c heavily manipulates with strings and it would probably require
> >> usage of a string pool, that can easily eventually removed (just in case of --enable-valgrind-annotations).
> >> The second source file tends to produce memory leaks because of fork/exec constructs. However both
> >> can be improved during next stage1.
> >>
> >> Apart from aforementioned issues, the compiler does not contain so many issues and I think it's
> >> doable to prune them and rely on reported valgrind errors.
> >>
> >> Patch touches many .exp files, but basically does just couple of modifications:
> >>
> >> 1) gcc-defs.exp introduces new global variable run_under_valgrind
> >> 2) new procedure dg-run-valgrind distinguishes between just passing options to 'gd-test',
> >>    or runs 'dg-test' with additional flags that enable valgrind (using -wrapper)
> >> 3) new procedure dg-runtest-valgrind does the similar
> >> 4) many changes in corresponding *.exp files that utilize these procedures
> >>
> >> The patch should be definitely part of next stage1, but I would appreciate any thoughts
> >> about the described approach?
> >
> > IIRC you can replace the actual dg-runtest proc with your own
> > (implementing a wrapper).  Grep aroung, I think we do that
> > already.  That's certainly preferable instead of touching all
> > callers.
>
> You are right, the suggested patch was over-kill, wrapper should be fine for that.
> Currently I've been playing with a bit different approach (suggested by Markus),
> where I would like to enable valgrind in gcc.c using an environmental variable.
>
> Question is if it should replace existing ENABLE_VALGRIND_CHECKING and how to
> integrate it with a valgrind suppressions file?

My answer is "no": by default nothing should be suppressed with
--enable-checking=valgrind and the effect should be that
annotations are active and valgrind -q is the wrapper.  Add a
suppression mode for your needs.

brgds, H-P
Bernd Schmidt Dec. 3, 2015, 2:15 p.m. UTC | #2
On 11/23/2015 10:34 AM, Martin Liška wrote:
> On 11/21/2015 05:26 AM, Hans-Peter Nilsson wrote:
>> IIRC you can replace the actual dg-runtest proc with your own
>> (implementing a wrapper).  Grep aroung, I think we do that
>> already.  That's certainly preferable instead of touching all
>> callers.
>
> You are right, the suggested patch was over-kill, wrapper should be fine for that.
> Currently I've been playing with a bit different approach (suggested by Markus),
> where I would like to enable valgrind in gcc.c using an environmental variable.
>
> Question is if it should replace existing ENABLE_VALGRIND_CHECKING and how to
> integrate it with a valgrind suppressions file?

This patch still seems to be in the queue. I've been looking at it every 
now and then, without really forming an opinion. In any case, I think 
we'll need to postpone this to stage1 at this point.

Wouldn't it be better to fix issues first and only then enable running 
the testsuite with valgrind, rather than make a suppression file?

Your latest patch seems to add the option of running the compiler 
without ENABLE_CHECKING_VALGRIND being defined. Doesn't this run into 
problems when the support in ggc isn't compiled in?


Bernd
Martin Liška Dec. 8, 2015, 3:28 p.m. UTC | #3
On 12/03/2015 03:15 PM, Bernd Schmidt wrote:
> On 11/23/2015 10:34 AM, Martin Liška wrote:
>> On 11/21/2015 05:26 AM, Hans-Peter Nilsson wrote:
>>> IIRC you can replace the actual dg-runtest proc with your own
>>> (implementing a wrapper).  Grep aroung, I think we do that
>>> already.  That's certainly preferable instead of touching all
>>> callers.
>>
>> You are right, the suggested patch was over-kill, wrapper should be fine for that.
>> Currently I've been playing with a bit different approach (suggested by Markus),
>> where I would like to enable valgrind in gcc.c using an environmental variable.
>>
>> Question is if it should replace existing ENABLE_VALGRIND_CHECKING and how to
>> integrate it with a valgrind suppressions file?
> 
> This patch still seems to be in the queue. I've been looking at it every now and then, without really forming an opinion. In any case, I think we'll need to postpone this to stage1 at this point.
> 
> Wouldn't it be better to fix issues first and only then enable running the testsuite with valgrind, rather than make a suppression file?
> 
> Your latest patch seems to add the option of running the compiler without ENABLE_CHECKING_VALGRIND being defined. Doesn't this run into problems when the support in ggc isn't compiled in?
> 
> 
> Bernd

Hi.

Right, the patch is in queue and can wait for next stage1. I must agree with Hans-Peter Nilsson that we should
mainly focus on removal of memory leaks (and other invalid operations) rather that maintaining a list of suppressions.
After that, integration with existing configure machine should be easily doable, I guess.

I've just run the test-suite (with default languages) and report file was post-processed with my script [1] that
groups same back-traces together.

Currently we have ~200000 errors, in ~4000 different back-traces.

Majority of them (~2600 BTs) are in fortran FE (BT contains 'gfc_'): [2].
The rest contains some issues in CP FE, many GGC invalid read/write operations ([4]) and many
memory leaks in gcc.c (for instance option handling).

My question is if a bug should be created for all fortran issues and whether it's realistic that
they can be eventually fixed in next stage1?

Thanks,
Martin

[1] https://github.com/marxin/script-misc/blob/master/valgrind-grep.py
[2] https://drive.google.com/file/d/0B0pisUJ80pO1ZjdCVlZoeGZQNjg/view?usp=sharing
[3] https://drive.google.com/file/d/0B0pisUJ80pO1aFZTWk5sVTBlcHc/view?usp=sharing
[4] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68758
Bernd Schmidt Dec. 8, 2015, 3:33 p.m. UTC | #4
On 12/08/2015 04:28 PM, Martin Liška wrote:
>
> Majority of them (~2600 BTs) are in fortran FE (BT contains 'gfc_'): [2].
> The rest contains some issues in CP FE, many GGC invalid read/write operations ([4]) and many
> memory leaks in gcc.c (for instance option handling).
>
> My question is if a bug should be created for all fortran issues and whether it's realistic that
> they can be eventually fixed in next stage1?

It hardly seems worthwhile to file a bug for every issue, that's just 
unnecessary bureaucracy. Fix them as you go along would be my 
recommendation, that probably takes a similar amount of time.

Can't say whether it's realistic that they'll be fixed. I'm not at home 
in the Fortran frontend.


Bernd
Jeff Law Dec. 8, 2015, 4:04 p.m. UTC | #5
On 12/08/2015 08:33 AM, Bernd Schmidt wrote:
> On 12/08/2015 04:28 PM, Martin Liška wrote:
>>
>> Majority of them (~2600 BTs) are in fortran FE (BT contains 'gfc_'): [2].
>> The rest contains some issues in CP FE, many GGC invalid read/write
>> operations ([4]) and many
>> memory leaks in gcc.c (for instance option handling).
>>
>> My question is if a bug should be created for all fortran issues and
>> whether it's realistic that
>> they can be eventually fixed in next stage1?
>
> It hardly seems worthwhile to file a bug for every issue, that's just
> unnecessary bureaucracy. Fix them as you go along would be my
> recommendation, that probably takes a similar amount of time.
Agreed, let's just fix them as we go along.

Suppressions should be a last resort in those cases where the code is 
operating correctly.  It does happen here and there.

jeff
Martin Liška May 13, 2016, 1:04 p.m. UTC | #6
Hello.

I've tried to apply the same patch for the current trunk and tried to separate
reported errors to a different categories by a simple script ([1]).

There are number (complete report: [2]):

SECTION: gfortran 
  error types: 3534, total errors: 113282
  error types: 90.15%, total errors: 27.11%
SECTION: c++ 
  error types: 161, total errors: 7260
  error types: 4.11%, total errors: 1.74%
SECTION: c 
  error types: 90, total errors: 6320
  error types: 2.30%, total errors: 1.51%
SECTION: c-common 
  error types: 39, total errors: 205033
  error types: 0.99%, total errors: 49.06%
SECTION: Other 
  error types: 96, total errors: 86037
  error types: 2.45%, total errors: 20.59%

Type in the dump means a back trace, while 'total errors' represent total # of errors.
The second line shows a percentage ratio compared to all errors seen.

As seen fortran has very big variety of back traces. Well, it shows that we have multiple PRs reported,
they are references in the following PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68800

C++ FE looks quite good, same as C. c-common contains very interesting group:

are possibly lost: 204032 occurences
  malloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
  xmalloc (xmalloc.c:148)
  new_buff (lex.c:3158)
  _cpp_get_buff (lex.c:3191)
  cpp_create_reader(c_lang, ht*, line_maps*) (init.c:251)
  c_common_init_options(unsigned int, cl_decoded_option*) (c-opts.c:219)
  toplev::main(int, char**) (toplev.c:2070)
  main (main.c:39)

(which is in fact majority of errors in the section). That's something I'm going to look at.

The last category (called Other) is dominated by

are definitely lost: 20740 occurences
  calloc (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
  xcalloc (xmalloc.c:163)
  main (collect2.c:975)

which is probably option handling allocation stuff. Apart from that, there are just ~90 types of different
back traces. It's doable to remove majority of these in this stage1. OTOH as I'm not familiar with a FE (mainly Fortran),
solving Fortran FE would be not trivial.

Note: The dump comes from one week old build.

Martin

[1] https://github.com/marxin/script-misc/blob/master/valgrind-grep.py
[2] https://drive.google.com/open?id=0B0pisUJ80pO1M1VOd2pObUVXSEU
diff mbox

Patch

From f0b211e4194e11e5ad52fa3b295a62f67b4060b8 Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Fri, 20 Nov 2015 09:46:09 +0100
Subject: [PATCH] Initial version of valgrind wrapper

---
 contrib/gcc.supp | 108 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/gcc.c        |  51 +++++++++++++++-----------
 2 files changed, 139 insertions(+), 20 deletions(-)
 create mode 100644 contrib/gcc.supp

diff --git a/contrib/gcc.supp b/contrib/gcc.supp
new file mode 100644
index 0000000..deefb28
--- /dev/null
+++ b/contrib/gcc.supp
@@ -0,0 +1,108 @@ 
+{
+   cpp_get_buff
+   Memcheck:Leak
+   match-leak-kinds: possible
+   fun:malloc
+   fun:xmalloc
+   fun:new_buff
+   fun:_cpp_get_buff
+   ...
+}
+{
+   gnu-as
+   Memcheck:Leak
+   match-leak-kinds: definite,possible
+   fun:malloc
+   ...
+   obj:/usr/bin/as
+   ...
+}
+{
+   gnu-as
+   Memcheck:Leak
+   match-leak-kinds: definite,possible
+   fun:calloc
+   fun:xcalloc
+   ...
+   obj:/usr/bin/as
+   ...
+}
+{
+   todo-fix-mpfr
+   Memcheck:Leak
+   match-leak-kinds: possible
+   fun:malloc
+   fun:__gmp_default_allocate
+   fun:mpfr_init2
+   fun:mpfr_cache
+   fun:mpfr_log
+   ...
+}
+{
+   cpp-front-end
+   Memcheck:Leak
+   match-leak-kinds: definite
+   fun:malloc
+   fun:xmalloc
+   fun:_ZL22cp_literal_operator_idPKc
+   fun:cp_parser_template_name
+   ...
+}
+{
+   todo-fix-options1
+   Memcheck:Leak
+   match-leak-kinds: definite
+   fun:calloc
+   fun:xcalloc
+   fun:_Z20lang_specific_driverPP17cl_decoded_optionPjPi
+   fun:_ZL15process_commandjP17cl_decoded_option
+   fun:_ZNK6driver12set_up_specsEv
+   fun:_ZN6driver4mainEiPPc
+   fun:main
+}
+{
+   todo-fix-options2
+   Memcheck:Leak
+   match-leak-kinds: definite
+   fun:malloc
+   fun:xmalloc
+   fun:_Z20lang_specific_driverPP17cl_decoded_optionPjPi
+   fun:_ZL15process_commandjP17cl_decoded_option
+   fun:_ZNK6driver12set_up_specsEv
+   fun:_ZN6driver4mainEiPPc
+   fun:main
+}
+{
+   ld.bfd-alloc
+   Memcheck:Leak
+   match-leak-kinds: definite,possible
+   fun:malloc
+   ...
+   obj:/usr/bin/ld.bfd
+   ...
+}
+{
+   ld.bfd-realloc
+   Memcheck:Leak
+   match-leak-kinds: definite,possible
+   fun:realloc
+   ...
+   obj:/usr/bin/ld.bfd
+   ...
+}
+{
+   <insert_a_suppression_name_here>
+   Memcheck:Leak
+   match-leak-kinds: definite
+   fun:calloc
+   fun:xcalloc
+   fun:main
+}
+{
+   collect2
+   Memcheck:Leak
+   match-leak-kinds: definite
+   fun:calloc
+   fun:xcalloc
+   fun:main
+}
diff --git a/gcc/gcc.c b/gcc/gcc.c
index cc0597d..2207913 100644
--- a/gcc/gcc.c
+++ b/gcc/gcc.c
@@ -3024,32 +3024,43 @@  execute (void)
 #endif /* DEBUG */
     }
 
-#ifdef ENABLE_VALGRIND_CHECKING
-  /* Run the each command through valgrind.  To simplify prepending the
-     path to valgrind and the option "-q" (for quiet operation unless
-     something triggers), we allocate a separate argv array.  */
-
-  for (i = 0; i < n_commands; i++)
+  if (env.get ("RUN_UNDER_VALGRIND"))
     {
-      const char **argv;
-      int argc;
-      int j;
+      /* Run the each command through valgrind.  To simplify prepending the
+	 path to valgrind and the option "-q" (for quiet operation unless
+	 something triggers), we allocate a separate argv array.  */
 
-      for (argc = 0; commands[i].argv[argc] != NULL; argc++)
-	;
+      auto_vec<const char *> valgrind_options;
+      valgrind_options.safe_push ("valgrind");
+      valgrind_options.safe_push ("-q");
+      valgrind_options.safe_push ("--leak-check=yes");
+      valgrind_options.safe_push ("--error-exitcode=111");
+      valgrind_options.safe_push ("--suppressions=/tmp/gcc.supp");
+      valgrind_options.safe_push ("--num-callers=30");
+
+      for (i = 0; i < n_commands; i++)
+	{
+	  const char **argv;
+	  unsigned argc;
 
-      argv = XALLOCAVEC (const char *, argc + 3);
+	  for (argc = 0; commands[i].argv[argc] != NULL; argc++)
+	    ;
+
+	  unsigned l = argc + valgrind_options.length ();
+	  argv = XALLOCAVEC (const char *, l + 1);
+
+	  for (unsigned j = 0; j < valgrind_options.length (); j++)
+	    argv[j] = valgrind_options[j];
 
-      argv[0] = VALGRIND_PATH;
-      argv[1] = "-q";
-      for (j = 2; j < argc + 2; j++)
-	argv[j] = commands[i].argv[j - 2];
-      argv[j] = NULL;
+	  for (unsigned j = 0; j < argc; j++)
+	    argv[j + valgrind_options.length ()] = commands[i].argv[j];
 
-      commands[i].argv = argv;
-      commands[i].prog = argv[0];
+	  argv[l] = NULL;
+
+	  commands[i].argv = argv;
+	  commands[i].prog = argv[0];
+	}
     }
-#endif
 
   /* Run each piped subprocess.  */
 
-- 
2.6.3