[1/2] Port Doxygen support script from Perl to Python; add unittests

Message ID 1489715500-63153-1-git-send-email-dmalcolm@redhat.com
State New

Commit Message

David Malcolm March 17, 2017, 1:51 a.m. UTC
It's possible to run GCC's sources through Doxygen by setting
	INPUT_FILTER           = contrib/filter_gcc_for_doxygen
within contrib/gcc.doxy and invoking doxygen on the latter file.

The script filters out various preprocessor constructs from GCC
sources before Doxygen tries to parse them (e.g. GTY tags).
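
As a rough illustration of what this filtering amounts to, the sketch below
applies two of the substitutions from the patch to a small made-up snippet.
The helper name `strip_markers` and the sample input are hypothetical; only
the two regexes are taken from the script.

```python
import re

def strip_markers(text):
    """Hypothetical helper: drop GTY(()) tags and ATTRIBUTE_UNUSED."""
    # Same patterns as the patch uses for these two constructs.
    text = re.sub(r'GTY[ \t]*\(\(.*\)\)', '', text)
    text = re.sub(r'[ \t]ATTRIBUTE_UNUSED', '', text)
    return text

# Made-up input resembling GCC source.
sample = ('struct GTY(()) alias_pair {\n'
          '  tree decl;\n'
          '};\n'
          'static void f (void *data ATTRIBUTE_UNUSED);\n')

filtered = strip_markers(sample)
print(filtered)
```

Note that removing the GTY tag leaves two adjacent spaces behind, which
matches what the test_GTY case in the patch expects.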

As-is, the script has some limitations, so as enabling work for
fixing them, this patch reimplements the support script
contrib/filter_params.pl in Python, effectively using the same
regexes, but porting them from Perl to Python syntax, adding comments,
and a unit-test suite.
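
One detail of such a port can be sketched as follows. Perl's `while (<>)`
loop applies `s/^.../` to each line in turn, so `^` anchors at every line
start. The Python version works on the whole file as one string, where `^`
only has that meaning if `re.MULTILINE` is passed. The function name
`open_comments` and the input are made up for this illustration.

```python
import re

def open_comments(text, flags=0):
    """Hypothetical helper: rewrite '/* ' at line start as a Doxygen opener."""
    return re.sub(r'^/\* ', r'/** @verbatim ', text, flags=flags)

src = 'int x;\n/* A comment.  */\n'

# Without MULTILINE, '^' only matches at the very start of the string,
# so the comment on the second line is left alone.
without_flag = open_comments(src)

# With MULTILINE, '^' also matches after each newline, which mirrors
# the behavior of Perl's per-line loop.
with_flag = open_comments(src, flags=re.MULTILINE)

print(repr(without_flag))
print(repr(with_flag))
```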

This is a revised version of a patch I posted ~3.5 years ago:
  https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02728.html
with the difference that in this patch I'm attempting to
faithfully reimplement the behavior of the Perl script, leaving
bugfixes to followups (in the earlier version I combined the
port with some behavior changes).

I've tested it by running some source files through both scripts
and manually verifying that the output was identical for both
implementations, apart from the Python implementation adding a
harmless trailing newline at the end of the file.
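
The extra newline is what one would expect from `print`, which appends a
newline after its argument even when the text already ends with one. A small
sketch (Python 3 only, since it relies on `contextlib.redirect_stdout`, and
using made-up file contents) that captures stdout to show the difference from
`sys.stdout.write`:

```python
import io
import sys
from contextlib import redirect_stdout

text = 'int x;\n'  # made-up file contents, already newline-terminated

buf_print = io.StringIO()
with redirect_stdout(buf_print):
    print(text)               # print appends one more newline

buf_write = io.StringIO()
with redirect_stdout(buf_write):
    sys.stdout.write(text)    # write emits the text unchanged

printed = buf_print.getvalue()
written = buf_write.getvalue()
```

Here `printed` ends with two newlines while `written` is identical to the
input, so switching to `sys.stdout.write` would be one way to make the output
byte-for-byte identical, if that ever mattered.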

The unit tests pass for both Python 2 and Python 3 (tested
with 2.7.5 and 3.3.2).

OK for trunk?

contrib/
	* filter_gcc_for_doxygen: Use filter_params.py rather than
	filter_params.pl.
	* filter_params.pl: Delete in favor of...
	* filter_params.py: New, porting the perl script to python,
	adding a test suite.
---
 contrib/filter_gcc_for_doxygen |   2 +-
 contrib/filter_params.pl       |  14 -----
 contrib/filter_params.py       | 130 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 131 insertions(+), 15 deletions(-)
 delete mode 100755 contrib/filter_params.pl
 create mode 100644 contrib/filter_params.py

Comments

Martin Liška April 28, 2017, 12:03 p.m. UTC | #1
On 03/17/2017 02:51 AM, David Malcolm wrote:

Hello.

I've just tested your patches and I can confirm that they work :)
I've got a couple of related questions:

> It's possible to run GCC's sources through Doxygen by setting
> 	INPUT_FILTER           = contrib/filter_gcc_for_doxygen
> within contrib/gcc.doxy and invoking doxygen on the latter file.

Why don't we provide defaults for OUTPUT_DIRECTORY and INPUT_FILTER?
I would expect people to run doxygen from the GCC root folder.

> 
> The script filters out various preprocessor constructs from GCC
> sources before Doxygen tries to parse them (e.g. GTY tags).
> 
> As-is, the script has some limitations, so as enabling work for
> fixing them, this patch reimplements the support script
> contrib/filter_params.pl in Python, effectively using the same
> regexes, but porting them from Perl to Python syntax, adding comments,
> and a unit-test suite.

You were not brave enough to port the remaining pattern in
contrib/filter_knr2ansi.pl, right? :)

Thanks for that, I've got 2 follow-up patches that I'll link to this thread.

Martin

> 
> This is a revised version of a patch I posted ~3.5 years ago:
>   https://gcc.gnu.org/ml/gcc-patches/2013-10/msg02728.html
> with the difference that in this patch I'm attempting to
> faithfully reimplement the behavior of the Perl script, leaving
> bugfixes to followups (in the earlier version I combined the
> port with some behavior changes).
> 
> I've tested it by running some source files through both scripts
> and manually verifying that the output was identical for both
> implementations, apart from the Python implementation adding a
> harmless trailing newline at the end of the file.
> 
> The unit tests pass for both Python 2 and Python 3 (tested
> with 2.7.5 and 3.3.2).
> 
> OK for trunk?
David Malcolm April 28, 2017, 10:09 p.m. UTC | #2
Ping for these two patches:
  - https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00909.html
  - https://gcc.gnu.org/ml/gcc-patches/2017-03/msg00910.html

Martin Liška May 19, 2017, 9:12 a.m. UTC | #3
PING^2.

Thanks,
Martin
David Malcolm May 26, 2017, 7:32 p.m. UTC | #4
Ping

Patch

diff --git a/contrib/filter_gcc_for_doxygen b/contrib/filter_gcc_for_doxygen
index 3787eeb..ca1db31 100755
--- a/contrib/filter_gcc_for_doxygen
+++ b/contrib/filter_gcc_for_doxygen
@@ -8,5 +8,5 @@ 
 # process is put on stdout.
 
 dir=`dirname $0`
-perl $dir/filter_params.pl < $1 | perl $dir/filter_knr2ansi.pl 
+python $dir/filter_params.py $1 | perl $dir/filter_knr2ansi.pl
 exit 0
diff --git a/contrib/filter_params.pl b/contrib/filter_params.pl
deleted file mode 100755
index 22dae6c..0000000
--- a/contrib/filter_params.pl
+++ /dev/null
@@ -1,14 +0,0 @@ 
-#!/usr/bin/perl
-
-# Filters out some of the #defines used throughout the GCC sources:
-# - GTY(()) marks declarations for gengtype.c
-# - PARAMS(()) is used for K&R compatibility. See ansidecl.h.
-
-while (<>) {
-    s/^\/\* /\/\*\* \@verbatim /;
-    s/\*\// \@endverbatim \*\//;
-    s/GTY[ \t]*\(\(.*\)\)//g;
-    s/[ \t]ATTRIBUTE_UNUSED//g;
-    s/PARAMS[ \t]*\(\((.*?)\)\)/\($1\)/sg;
-    print;
-}
diff --git a/contrib/filter_params.py b/contrib/filter_params.py
new file mode 100644
index 0000000..3c14121
--- /dev/null
+++ b/contrib/filter_params.py
@@ -0,0 +1,130 @@ 
+#!/usr/bin/python
+"""
+Filters out some of the #defines used throughout the GCC sources:
+- GTY(()) marks declarations for gengtype.c
+- PARAMS(()) is used for K&R compatibility. See ansidecl.h.
+
+When passed one or more filenames, acts on those files and prints the
+results to stdout.
+
+When run without a filename, runs a unit-testing suite.
+"""
+import re
+import sys
+import unittest
+
+def filter_src(text):
+    """
+    str -> str.  We operate on the whole of the source file at once
+    (rather than individual lines) so that we can have multiline
+    regexes.
+    """
+
+    # Convert C comments from GNU coding convention of:
+    #    /* FIRST_LINE
+    #       NEXT_LINE
+    #       FINAL_LINE.  */
+    # to:
+    #    /** @verbatim FIRST_LINE
+    #       NEXT_LINE
+    #       FINAL_LINE.  @endverbatim */
+    # so that doxygen will parse them.
+    #
+    # Only comments that begin on the left-most column are converted.
+    text = re.sub(r'^/\* ',
+                  r'/** @verbatim ',
+                  text,
+                  flags=re.MULTILINE)
+    text = re.sub(r'\*/',
+                  r' @endverbatim */',
+                  text)
+
+    # Remove GTY markings:
+    text = re.sub(r'GTY[ \t]*\(\(.*\)\)',
+                  '',
+                  text)
+
+    # Strip out 'ATTRIBUTE_UNUSED'
+    text = re.sub('[ \t]ATTRIBUTE_UNUSED',
+                  '',
+                  text)
+
+    # PARAMS(()) is used for K&R compatibility. See ansidecl.h.
+    text = re.sub(r'PARAMS[ \t]*\(\((.*?)\)\)',
+                  r'(\1)',
+                  text)
+
+    return text
+
+class FilteringTests(unittest.TestCase):
+    '''
+    Unit tests for filter_src.
+    '''
+    def assert_filters_to(self, src_input, expected_result):
+        # assertMultiLineEqual was added to unittest in 2.7/3.1
+        if hasattr(self, 'assertMultiLineEqual'):
+            assertion = self.assertMultiLineEqual
+        else:
+            assertion = self.assertEqual
+        assertion(expected_result, filter_src(src_input))
+
+    def test_comment_example(self):
+        self.assert_filters_to(
+            ('/* FIRST_LINE\n'
+             '   NEXT_LINE\n'
+             '   FINAL_LINE.  */\n'),
+            ('/** @verbatim FIRST_LINE\n'
+             '   NEXT_LINE\n'
+             '   FINAL_LINE.   @endverbatim */\n'))
+
+    def test_oneliner_comment(self):
+        self.assert_filters_to(
+            '/* Returns the string representing CLASS.  */\n',
+            ('/** @verbatim Returns the string representing CLASS.   @endverbatim */\n'))
+
+    def test_multiline_comment(self):
+        self.assert_filters_to(
+            ('/* The thread-local storage model associated with a given VAR_DECL\n'
+             "   or SYMBOL_REF.  This isn't used much, but both trees and RTL refer\n"
+             "   to it, so it's here.  */\n"),
+            ('/** @verbatim The thread-local storage model associated with a given VAR_DECL\n'
+             "   or SYMBOL_REF.  This isn't used much, but both trees and RTL refer\n"
+             "   to it, so it's here.   @endverbatim */\n"))
+
+    def test_GTY(self):
+        self.assert_filters_to(
+            ('typedef struct GTY(()) alias_pair {\n'
+             '  tree decl;\n'
+             '  tree target;\n'
+             '} alias_pair;\n'),
+            ('typedef struct  alias_pair {\n'
+             '  tree decl;\n'
+             '  tree target;\n'
+             '} alias_pair;\n'))
+
+    def test_ATTRIBUTE_UNUSED(self):
+        # Ensure that ATTRIBUTE_UNUSED is filtered out.
+        self.assert_filters_to(
+            ('static void\n'
+             'record_set (rtx dest, const_rtx set, void *data ATTRIBUTE_UNUSED)\n'
+             '{\n'),
+            ('static void\n'
+             'record_set (rtx dest, const_rtx set, void *data)\n'
+             '{\n'))
+
+    def test_PARAMS(self):
+        self.assert_filters_to(
+            'char *strcpy PARAMS ((char *dest, char *source));\n',
+            'char *strcpy (char *dest, char *source);\n')
+
+def act_on_files(argv):
+    for filename in argv[1:]:
+        with open(filename) as f:
+            text = f.read()
+            print(filter_src(text))
+
+if __name__ == '__main__':
+    if len(sys.argv) > 1:
+        act_on_files(sys.argv)
+    else:
+        unittest.main()
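
For readers experimenting with the regexes in this patch: the GTY pattern
uses a greedy `.*`, while the PARAMS pattern uses a non-greedy `.*?`. The
sketch below, on a contrived one-line input, shows how that choice affects
what gets removed when `))` occurs more than once on a line. The non-greedy
GTY variant is a hypothetical alternative shown only for comparison; it is
not part of the patch, and the thread does not say whether this behavior is
one of the limitations mentioned in the commit message.

```python
import re

# Contrived input with two '))' sequences on the same line.
line = 'struct GTY((tag)) a; int f PARAMS ((int));\n'

# Pattern as used in the patch: greedy, so the match runs to the last '))'.
greedy = re.sub(r'GTY[ \t]*\(\(.*\)\)', '', line)

# Hypothetical non-greedy variant: the match stops at the first '))'.
lazy = re.sub(r'GTY[ \t]*\(\(.*?\)\)', '', line)

print(repr(greedy))
print(repr(lazy))
```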