[2/2] Introduce Python testcases to check DWARF output

Message ID	20170726160040.6516-3-derodat@adacore.com
State	New
Headers

DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender:from
	:to:cc:subject:date:message-id:in-reply-to:references; q=dns; s=
	default; b=nx8mF6BMqC7ebhqIZKL9ySHARguoB71N+cNonjYSwntOTp5Dsc5Bd
	frzBB0fbHRR84BSMhZHL73Rdo/b1AXQ77ur+CqH3ytdKYB0BbY+0Yw251h33arPZ
	psyXGI3zkJECdGSPJcrsHZLCRLGQB5/Bx0iMAl5a+9OA7FcPFzIXUc=
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
Sender: gcc-patches-owner@gcc.gnu.org
From: Pierre-Marie de Rodat <derodat@adacore.com>
To: gcc-patches@gcc.gnu.org
Cc: Pierre-Marie de Rodat <derodat@adacore.com>
Subject: [PATCH 2/2] Introduce Python testcases to check DWARF output
Date: Wed, 26 Jul 2017 18:00:40 +0200
Message-Id: <20170726160040.6516-3-derodat@adacore.com>
In-Reply-To: <20170726160040.6516-1-derodat@adacore.com>
References: <20170726160040.6516-1-derodat@adacore.com>

Commit Message

Pierre-Marie de Rodat July 26, 2017, 4 p.m. UTC

For now, this supports only platforms that have an objdump available for
the corresponding target. There are several things that would be nice to
have in the future:

  * add support for more DWARF dumping tools, such as otool on Darwin;

  * have a DWARF location expression decoder, to be able to parse and
    pattern match expressions that objdump does not decode itself;

  * complete the set of decoders for DIE attributes.

gcc/testsuite/

	* lib/gcc-dwarf.exp: New helper file.
	* python/dwarfutils/__init__.py,
	python/dwarfutils/data.py,
	python/dwarfutils/helpers.py,
	python/dwarfutils/objdump.py: New Python helpers.
	* gcc.dg/debug/dwarf2-py/dwarf2-py.exp,
	gnat.dg/dwarf/dwarf.exp: New test drivers.
	* gcc.dg/debug/dwarf2-py/sso.c,
	gcc.dg/debug/dwarf2-py/sso.py,
	gcc.dg/debug/dwarf2-py/var2.c,
	gcc.dg/debug/dwarf2-py/var2.py,
	gnat.dg/dwarf/debug9.adb,
	gnat.dg/dwarf/debug9.py,
	gnat.dg/dwarf/debug11.adb,
	gnat.dg/dwarf/debug11.py,
	gnat.dg/dwarf/debug12.adb,
	gnat.dg/dwarf/debug12.ads,
	gnat.dg/dwarf/debug12.py: New tests.
---
 gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp |  52 ++
 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c         |  19 +
 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py        |  52 ++
 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c        |  13 +
 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py       |  11 +
 gcc/testsuite/gnat.dg/dg.exp                       |   1 +
 gcc/testsuite/gnat.dg/dwarf/debug11.adb            |  19 +
 gcc/testsuite/gnat.dg/dwarf/debug11.py             |  51 ++
 gcc/testsuite/gnat.dg/dwarf/debug12.adb            |  10 +
 gcc/testsuite/gnat.dg/dwarf/debug12.ads            |   8 +
 gcc/testsuite/gnat.dg/dwarf/debug12.py             |   9 +
 gcc/testsuite/gnat.dg/dwarf/debug9.adb             |  45 ++
 gcc/testsuite/gnat.dg/dwarf/debug9.py              |  22 +
 gcc/testsuite/gnat.dg/dwarf/dwarf.exp              |  39 ++
 gcc/testsuite/lib/gcc-dwarf.exp                    |  41 ++
 gcc/testsuite/python/dwarfutils/__init__.py        |  70 +++
 gcc/testsuite/python/dwarfutils/data.py            | 597 +++++++++++++++++++++
 gcc/testsuite/python/dwarfutils/helpers.py         |  11 +
 gcc/testsuite/python/dwarfutils/objdump.py         | 338 ++++++++++++
 19 files changed, 1408 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug11.adb
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug11.py
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.adb
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.ads
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.py
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug9.adb
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug9.py
 create mode 100644 gcc/testsuite/gnat.dg/dwarf/dwarf.exp
 create mode 100644 gcc/testsuite/lib/gcc-dwarf.exp
 create mode 100644 gcc/testsuite/python/dwarfutils/__init__.py
 create mode 100644 gcc/testsuite/python/dwarfutils/data.py
 create mode 100644 gcc/testsuite/python/dwarfutils/helpers.py
 create mode 100644 gcc/testsuite/python/dwarfutils/objdump.py

Comments

David Malcolm July 26, 2017, 5:09 p.m. UTC | #1

On Wed, 2017-07-26 at 18:00 +0200, Pierre-Marie de Rodat wrote:
[...]
> diff --git a/gcc/testsuite/python/dwarfutils/__init__.py b/gcc/testsuite/python/dwarfutils/__init__.py
> new file mode 100644
> index 00000000000..246fbbd15be
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/__init__.py
[...]
> +def parse_dwarf(object_file=None, single_cu=True):
> +    """
> +    Fetch and decode DWARF compilation units in `object_file`.
> +
> +    If `single_cu` is True, make sure there is exactly one compilation unit and

"is True" -> "is true"

[...]

> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/data.py

> +
> +    def get_attr(self, name, single=True, or_error=True):
> +        """Look for an attribute in this DIE.
> +
> +        :param str|int name: Attribute name, or number if name is unknown.
> +        :param bool single: If true, this will raise a KeyError for
> +            zero/multiple matches and return an Attribute instance when found.
> +            Otherwise, return a potentially empty list of attributes.
> +        :param bool or_error: When True, if `single` is True and no attribute

"True" -> "true" in two places

[...]

> +    def find(self, predicate=None, tag=None, name=None, recursive=True,
> +             single=True):
> +        """Look for a DIE that satisfies the given expectations.
> +
> +        :param None|(DIE) -> bool predicate: If provided, function that filters
> +            out DIEs when it returns False.
> +        :param str|int|None tag: If provided, filter out DIEs whose tag does
> +            not match.
> +        :param str|None name: If provided, filter out DIEs whose name (see
> +            the `name` property) does not match.
> +        :param bool recursive: If True, perform the search recursively in
> +            self's children.
> +        :param bool single: If True, look for a single DIE and raise a

"True" -> "true", I suppose

[...]

> +class MatchResult(object):
> +    """Holder for the result of a DIE tree pattern match."""
> +
> +    def __init__(self):
> +        self.dict = {}
> +
> +        self.mismatch_reason = None
> +        """
> +        If left to None, the match succeded. Otherwise, must be set


"succeded" -> "succeeded"

> +
> +    def capture(self, name):
> +        """Return what has been captured by the `name` capture.
> +
> +        This is valid iff the match succeded.

here again.

[...]


> diff --git a/gcc/testsuite/python/dwarfutils/helpers.py b/gcc/testsuite/python/dwarfutils/helpers.py
> new file mode 100644
> index 00000000000..f5e77896ae6
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/helpers.py
> @@ -0,0 +1,11 @@
> +import sys
> +
> +
> +def as_ascii(str_or_byte):
> +    """
> +    Python 2/3 compatibility helper.
> +
> +    In Python 2, just return the input. In Python 3, decode the input as ASCII.
> +    """
> +    return (str_or_byte if sys.version_info.major < 3 else
> +            str_or_byte.decode('ascii'))

Aha!  Python 2 and Python 3.


Presumably this all runs with LANG=C so that there's no danger of any
non-ASCII bytes?  (bytes.decode('ascii') will raise a UnicodeDecodeError
if any byte >=128).
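
To make the concern concrete, here is a minimal sketch of the decoding step on
its own. The name `decode_ascii` and the sample byte strings are made up for
this illustration; only `bytes.decode` and `UnicodeDecodeError` are standard
Python.

```python
def decode_ascii(raw):
    """Decode tool output as ASCII, as as_ascii does under Python 3."""
    return raw.decode('ascii')


# Pure ASCII output decodes without trouble.
print(decode_ascii(b'DW_TAG_variable'))

# A single byte >= 128 (0xe9 here) is enough to make the decoding fail.
try:
    decode_ascii(b'caf\xe9')
except UnicodeDecodeError as exc:
    print('decode failed: {}'.format(exc.reason))
```

If the dump tool can emit localized messages or non-ASCII symbol names, one
option would be to force the C locale when spawning it, so that this path is
never hit.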


> diff --git a/gcc/testsuite/python/dwarfutils/objdump.py b/gcc/testsuite/python/dwarfutils/objdump.py
> new file mode 100644
> index 00000000000..52cfc06c03b
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/objdump.py

[...]

There's a fair amount of non-trivial parsing going on here.
I wonder if it would be helpful to add a "unittest" suite for the
parsing?
(e.g. to have some precanned fragments of objdump output as strings,
and to verify that they're parsed as expected).
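
A rough sketch of what such a test could look like follows. It does not use
the real parser from objdump.py: `parse_die_header` and `DIE_HEADER_RE` are
hypothetical stand-ins written for this example, and the sample lines only
assume the usual shape of `objdump --dwarf=info` output.

```python
import re
import unittest

# Assumed shape of a DIE header line in "objdump --dwarf=info" output.
DIE_HEADER_RE = re.compile(
    r'^\s*<(?P<level>\d+)><(?P<offset>[0-9a-f]+)>: '
    r'Abbrev Number: (?P<abbrev>\d+) \((?P<tag>DW_TAG_\w+)\)$'
)


def parse_die_header(line):
    """Return (level, offset, tag) for a DIE header line, or None."""
    m = DIE_HEADER_RE.match(line)
    if m is None:
        return None
    return int(m.group('level')), int(m.group('offset'), 16), m.group('tag')


class ParseDieHeaderTest(unittest.TestCase):
    """Feed precanned objdump fragments to the parser."""

    def test_header(self):
        line = ' <1><2d>: Abbrev Number: 2 (DW_TAG_structure_type)'
        self.assertEqual(parse_die_header(line),
                         (1, 0x2d, 'DW_TAG_structure_type'))

    def test_attribute_line_is_not_a_header(self):
        line = '    <2e>   DW_AT_name        : S0'
        self.assertIsNone(parse_die_header(line))
```

Such a module could be run with `python -m unittest`, independently of
DejaGnu and without needing a compiler or an objdump for the target.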

Note that I'm not a reviewer for the testsuite, so this is just a
suggestion.

Hope this is constructive
Dave

Richard Biener July 27, 2017, 8:36 a.m. UTC | #2

On Wed, Jul 26, 2017 at 6:00 PM, Pierre-Marie de Rodat
<derodat@adacore.com> wrote:
> For now, this supports only platforms that have an objdump available for
> the corresponding target. There are several things that would be nice to
> have in the future:
>
>   * add support for more DWARF dumping tools, such as otool on Darwin;
>
>   * have a DWARF location expression decoder, to be able to parse and
>     pattern match expressions that objdump does not decode itself;
>
>   * complete the set of decoders for DIE attributes.

Just some random thoughts.

Given that gdb can decode dwarf and we rely on gdb for guality and
gdb has python scripting can we somehow walk its dwarf tree from
within a python script?  That is, not need the dwarf decoding or
objdump requirement?

On IRC I suggested to use pre-existing python DWARF decoders
which we might be able to import into the tree.  We'd still need them
to handle non-ELF object formats or somehow extract DWARF from
other containers to an ELF file (objcopy to the rescue...).

That said, not needing to write a DWARF / object file decoder
would be nice.

I see your testcases have associated .py files.  There are a few
existing "simple" dwarf testcases that would benefit from being
able to embed matching into the testcase source file itself?  Thus
have TCL autogenerate a .py file for the testing from, say

/* { dg-final { scan-dwarf { "Matcher('DW_TAG_member', 'i',
                      attrs={'DW_AT_type': Capture('s0_i_type')})" } } } */

do you think that's feasible or doesn't it make much sense because
it would essentially match anywhere?  Or we'd end up with a
gazillion of scan-dwarf variants?

I think a separate .py for checking is required anyway for the more
complex cases.
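
As a sketch of the generation step only: the snippet below pulls the matcher
expression out of a scan-dwarf directive and wraps it in a script skeleton.
It is written in Python rather than TCL purely for illustration. The
`scan-dwarf` directive does not exist, `extract_matchers` and
`generate_script` are hypothetical names, and how the matcher would then be
applied to the compilation unit is deliberately left open, since that is the
unresolved question.

```python
import re

# Hypothetical directive syntax, taken from the example above.
SCAN_DWARF_RE = re.compile(
    r'dg-final\s*\{\s*scan-dwarf\s*\{\s*"(?P<expr>.*?)"\s*\}\s*\}\s*\}',
    re.DOTALL,
)

SCRIPT_TEMPLATE = '''\
import dwarfutils
from dwarfutils.data import Capture, DIE, Matcher

cu = dwarfutils.parse_dwarf()
matcher = {expr}
# How `matcher` is applied to `cu` is left open here.
'''


def extract_matchers(source_text):
    """Return the matcher expressions of all scan-dwarf directives.

    Runs of whitespace (including line breaks) are collapsed to one space.
    """
    return [' '.join(m.group('expr').split())
            for m in SCAN_DWARF_RE.finditer(source_text)]


def generate_script(expr):
    """Build the text of a Python test script around `expr`."""
    return SCRIPT_TEMPLATE.format(expr=expr)


SOURCE = '''\
/* { dg-final { scan-dwarf { "Matcher('DW_TAG_member', 'i',
                      attrs={'DW_AT_type': Capture('s0_i_type')})" } } } */
struct S0 { int i; };
'''

for expr in extract_matchers(SOURCE):
    print(generate_script(expr))
```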

> gcc/testsuite/
>
>         * lib/gcc-dwarf.exp: New helper files.
>         * python/dwarfutils/__init__.py,
>         python/dwarfutils/data.py,
>         python/dwarfutils/helpers.py,
>         python/dwarfutils/objdump.py: New Python helpers.
>         * gcc.dg/debug/dwarf2-py/dwarf2-py.exp,
>         gnat.dg/dwarf/dwarf.exp: New test drivers.
>         * gcc.dg/debug/dwarf2-py/sso.c,
>         gcc.dg/debug/dwarf2-py/sso.py,
>         gcc.dg/debug/dwarf2-py/var2.c,
>         gcc.dg/debug/dwarf2-py/var2.py,
>         gnat.dg/dwarf/debug9.adb,
>         gnat.dg/dwarf/debug9.py,
>         gnat.dg/dwarf/debug11.adb,
>         gnat.dg/dwarf/debug11.py,
>         gnat.dg/dwarf/debug12.adb,
>         gnat.dg/dwarf/debug12.ads,
>         gnat.dg/dwarf/debug12.py: New tests.
> ---
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp |  52 ++
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c         |  19 +
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py        |  52 ++
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c        |  13 +
>  gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py       |  11 +
>  gcc/testsuite/gnat.dg/dg.exp                       |   1 +
>  gcc/testsuite/gnat.dg/dwarf/debug11.adb            |  19 +
>  gcc/testsuite/gnat.dg/dwarf/debug11.py             |  51 ++
>  gcc/testsuite/gnat.dg/dwarf/debug12.adb            |  10 +
>  gcc/testsuite/gnat.dg/dwarf/debug12.ads            |   8 +
>  gcc/testsuite/gnat.dg/dwarf/debug12.py             |   9 +
>  gcc/testsuite/gnat.dg/dwarf/debug9.adb             |  45 ++
>  gcc/testsuite/gnat.dg/dwarf/debug9.py              |  22 +
>  gcc/testsuite/gnat.dg/dwarf/dwarf.exp              |  39 ++
>  gcc/testsuite/lib/gcc-dwarf.exp                    |  41 ++
>  gcc/testsuite/python/dwarfutils/__init__.py        |  70 +++
>  gcc/testsuite/python/dwarfutils/data.py            | 597 +++++++++++++++++++++
>  gcc/testsuite/python/dwarfutils/helpers.py         |  11 +
>  gcc/testsuite/python/dwarfutils/objdump.py         | 338 ++++++++++++
>  19 files changed, 1408 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug11.adb
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug11.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.adb
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.ads
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug12.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug9.adb
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/debug9.py
>  create mode 100644 gcc/testsuite/gnat.dg/dwarf/dwarf.exp
>  create mode 100644 gcc/testsuite/lib/gcc-dwarf.exp
>  create mode 100644 gcc/testsuite/python/dwarfutils/__init__.py
>  create mode 100644 gcc/testsuite/python/dwarfutils/data.py
>  create mode 100644 gcc/testsuite/python/dwarfutils/helpers.py
>  create mode 100644 gcc/testsuite/python/dwarfutils/objdump.py
>
> diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp b/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
> new file mode 100644
> index 00000000000..5c49bc81a55
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
> @@ -0,0 +1,52 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Testsuite driver for testcases that check the DWARF output with Python
> +# scripts.
> +
> +load_lib gcc-dg.exp
> +load_lib gcc-python.exp
> +load_lib gcc-dwarf.exp
> +
> +# This series of tests requires a working Python interpreter and a supported
> +# host tool to dump DWARF.
> +if { ![check-python-available] || ![detect-dwarf-dump-tool] } {
> +    return
> +}
> +
> +# If a testcase doesn't have special options, use these.
> +global DEFAULT_CFLAGS
> +if ![info exists DEFAULT_CFLAGS] then {
> +    set DEFAULT_CFLAGS " -ansi -pedantic-errors -gdwarf"
> +}
> +
> +# Initialize `dg'.
> +dg-init
> +
> +# Main loop.
> +if {[check-python-available]} {
> +    set comp_output [gcc_target_compile \
> +       "$srcdir/$subdir/../trivial.c" "trivial.S" assembly \
> +       "additional_flags=-gdwarf"]
> +    if { ! [string match "*: target system does not support the * debug format*" \
> +       $comp_output] } {
> +       remove-build-file "trivial.S"
> +       dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\] ] ] "" $DEFAULT_CFLAGS
> +    }
> +}
> +
> +# All done.
> +dg-finish
> diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
> new file mode 100644
> index 00000000000..f7429a58179
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
> @@ -0,0 +1,19 @@
> +/* { dg-do assemble } */
> +/* { dg-options "-gdwarf-3" } */
> +/* { dg-final { python-test sso.py } } */
> +
> +#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +#define REVERSE_SSO __attribute__((scalar_storage_order("big-endian")));
> +#else
> +#define REVERSE_SSO __attribute__((scalar_storage_order("little-endian")));
> +#endif
> +
> +struct S0 { int i; };
> +
> +struct S1 { int i; struct S0 s; } REVERSE_SSO;
> +
> +struct S2 { int a[4]; struct S0 s; } REVERSE_SSO;
> +
> +struct S0 s0;
> +struct S1 s1;
> +struct S2 s2;
> diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
> new file mode 100644
> index 00000000000..0c95abfe2b8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
> @@ -0,0 +1,52 @@
> +import dwarfutils
> +from dwarfutils.data import Capture, DIE, Matcher
> +from testutils import check
> +
> +
> +cu = dwarfutils.parse_dwarf()
> +s0 = cu.find(tag='DW_TAG_structure_type', name='S0')
> +s1 = cu.find(tag='DW_TAG_structure_type', name='S1')
> +s2 = cu.find(tag='DW_TAG_structure_type', name='S2')
> +
> +# Check the DIE structure of these structure types
> +m0 = s0.tree_check(Matcher(
> +    'DW_TAG_structure_type', 'S0',
> +    children=[Matcher('DW_TAG_member', 'i',
> +                      attrs={'DW_AT_type': Capture('s0_i_type')})]
> +))
> +m1 = s1.tree_check(Matcher(
> +    'DW_TAG_structure_type', 'S1',
> +    children=[
> +        Matcher('DW_TAG_member', 'i',
> +                attrs={'DW_AT_type': Capture('s1_i_type')}),
> +        Matcher('DW_TAG_member', 's', attrs={'DW_AT_type': s0}),
> +    ]
> +))
> +m2 = s2.tree_check(Matcher(
> +    'DW_TAG_structure_type', 'S2',
> +    children=[
> +        Matcher('DW_TAG_member', 'a',
> +                attrs={'DW_AT_type': Capture('s2_a_type')}),
> +        Matcher('DW_TAG_member', 's', attrs={'DW_AT_type': s0}),
> +    ]
> +))
> +
> +# Now check that their scalar members have expected types
> +s0_i_type = m0.capture('s0_i_type').value
> +s1_i_type = m1.capture('s1_i_type').value
> +s2_a_type = m2.capture('s2_a_type').value
> +
> +# S0.i must not have a DW_AT_endianity attribute.  S1.i must have one.
> +s0_i_type.tree_check(Matcher('DW_TAG_base_type',
> +                             attrs={'DW_AT_endianity': None}))
> +s1_i_type.tree_check(Matcher('DW_TAG_base_type',
> +                             attrs={'DW_AT_endianity': True}))
> +
> +# So does the integer type that S2.a contains.
> +ma = s2_a_type.tree_check(Matcher(
> +    'DW_TAG_array_type',
> +    attrs={'DW_AT_type': Capture('element_type')}
> +))
> +element_type = ma.capture('element_type').value
> +check(element_type == s1_i_type,
> +      'check element type of S2.a is type of S1.i')
> diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
> new file mode 100644
> index 00000000000..e77adc0eaf5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
> @@ -0,0 +1,13 @@
> +/* PR 23190 */
> +/* { dg-do assemble } */
> +/* { dg-options "-O2 -gdwarf" } */
> +/* { dg-final { python-test var2.py } } */
> +
> +static int foo;
> +int bar;
> +int main(void)
> +{
> +   foo += 3;
> +   bar *= 5;
> +   return 0;
> +}
> diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
> new file mode 100644
> index 00000000000..9a9b2c4a4ca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
> @@ -0,0 +1,11 @@
> +import dwarfutils
> +from dwarfutils.data import Capture, DIE, Matcher
> +from testutils import check
> +
> +
> +cu = dwarfutils.parse_dwarf()
> +foo = cu.find(tag='DW_TAG_variable', name='foo')
> +bar = cu.find(tag='DW_TAG_variable', name='bar')
> +
> +foo.check_attr('DW_AT_location', [('DW_OP_addr', '0')])
> +bar.check_attr('DW_AT_location', [('DW_OP_addr', '0')])
> diff --git a/gcc/testsuite/gnat.dg/dg.exp b/gcc/testsuite/gnat.dg/dg.exp
> index 228c71e85bb..dff86600957 100644
> --- a/gcc/testsuite/gnat.dg/dg.exp
> +++ b/gcc/testsuite/gnat.dg/dg.exp
> @@ -18,6 +18,7 @@
>
>  # Load support procs.
>  load_lib gnat-dg.exp
> +load_lib gcc-python.exp
>
>  # If a testcase doesn't have special options, use these.
>  global DEFAULT_CFLAGS
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug11.adb b/gcc/testsuite/gnat.dg/dwarf/debug11.adb
> new file mode 100644
> index 00000000000..a87470925f1
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug11.adb
> @@ -0,0 +1,19 @@
> +--  { dg-options "-cargs -O0 -g -dA -fgnat-encodings=minimal -margs" }
> +--  { dg-do assemble }
> +--  { dg-final { python-test debug11.py } }
> +
> +with Ada.Text_IO;
> +
> +procedure Debug11 is
> +   type Rec_Type (C : Character) is record
> +      case C is
> +         when 'Z' .. Character'Val (128) => I : Integer;
> +         when others                     => null;
> +      end case;
> +   end record;
> +   --  R : Rec_Type := ('Z', 2);
> +   R : Rec_Type ('Z');
> +begin
> +   R.I := 0;
> +   Ada.Text_IO.Put_Line ("" & R.C);
> +end Debug11;
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug11.py b/gcc/testsuite/gnat.dg/dwarf/debug11.py
> new file mode 100644
> index 00000000000..26c3fdfeeda
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug11.py
> @@ -0,0 +1,51 @@
> +import dwarfutils
> +from dwarfutils.data import Capture, DIE, Matcher
> +from testutils import check, print_pass
> +
> +
> +cu = dwarfutils.parse_dwarf()
> +rec_type = cu.find(tag='DW_TAG_structure_type', name='debug11__rec_type')
> +
> +check(rec_type.parent.matches(tag='DW_TAG_subprogram', name='debug11'),
> +      'check that rec_type appears in the expected context')
> +
> +# Check that rec_type has the expected DIE structure
> +m = rec_type.tree_check(Matcher(
> +    'DW_TAG_structure_type', 'debug11__rec_type',
> +    children=[
> +        Matcher('DW_TAG_member', 'c', capture='c'),
> +        Matcher(
> +            'DW_TAG_variant_part',
> +            attrs={'DW_AT_discr': Capture('discr')},
> +            children=[
> +                Matcher(
> +                    'DW_TAG_variant',
> +                    attrs={'DW_AT_discr_list': Capture('discr_list'),
> +                           'DW_AT_discr_value': None},
> +                    children=[
> +                        Matcher('DW_TAG_member', 'i'),
> +                    ]
> +                ),
> +                Matcher(
> +                    'DW_TAG_variant',
> +                    attrs={'DW_AT_discr_list': None,
> +                           'DW_AT_discr_value': None},
> +                    children=[]
> +                )
> +            ]
> +        )
> +    ]
> +))
> +
> +# Check that DW_AT_discr refers to the expected DW_TAG_member
> +c = m.capture('c')
> +discr = m.capture('discr')
> +check(c == discr.value, 'check that discriminant is {}'.format(discr.value))
> +
> +# Check that DW_AT_discr_list has the expected content: the C discriminant must
> +# be properly described as unsigned, hence the 0x5a ('Z') and 0x80 0x01 (128)
> +# values in the DW_AT_discr_list attribute. If it was described as signed, we
> +# would have instead 90 and -128.
> +discr_list = m.capture('discr_list')
> +check(discr_list.value == [0x1, 0x5a, 0x80, 0x1],
> +      'check discriminant list')
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug12.adb b/gcc/testsuite/gnat.dg/dwarf/debug12.adb
> new file mode 100644
> index 00000000000..1fa9f27aa9b
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug12.adb
> @@ -0,0 +1,10 @@
> +--  { dg-options "-cargs -gdwarf-4 -margs" }
> +--  { dg-do assemble }
> +--  { dg-final { python-test debug12.py } }
> +
> +package body Debug12 is
> +   function Get_A2 return Boolean is
> +   begin
> +      return A2;
> +   end Get_A2;
> +end Debug12;
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug12.ads b/gcc/testsuite/gnat.dg/dwarf/debug12.ads
> new file mode 100644
> index 00000000000..dbc5896cc73
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug12.ads
> @@ -0,0 +1,8 @@
> +package Debug12 is
> +   type Bit_Array is array (Positive range <>) of Boolean
> +      with Pack;
> +   A  : Bit_Array := (1 .. 10 => False);
> +   A2 : Boolean renames A (2);
> +
> +   function Get_A2 return Boolean;
> +end Debug12;
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug12.py b/gcc/testsuite/gnat.dg/dwarf/debug12.py
> new file mode 100644
> index 00000000000..41e589b2ff1
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug12.py
> @@ -0,0 +1,9 @@
> +import dwarfutils
> +from dwarfutils.data import Capture, DIE, Matcher
> +from testutils import check
> +
> +
> +cu = dwarfutils.parse_dwarf()
> +
> +a2 = cu.find(tag='DW_TAG_variable', name='debug12__a2___XR_debug12__a___XEXS2')
> +a2.check_attr('DW_AT_location', [('DW_OP_const1s', '-1')])
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug9.adb b/gcc/testsuite/gnat.dg/dwarf/debug9.adb
> new file mode 100644
> index 00000000000..9ed66b55cdf
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug9.adb
> @@ -0,0 +1,45 @@
> +--  { dg-options "-cargs -g -fgnat-encodings=minimal -dA -margs" }
> +--  { dg-do assemble }
> +--  { dg-final { python-test debug9.py } }
> +
> +procedure Debug9 is
> +   type Array_Type is array (Natural range <>) of Integer;
> +   type Record_Type (L1, L2 : Natural) is record
> +      I1 : Integer;
> +      A1 : Array_Type (1 .. L1);
> +      I2 : Integer;
> +      A2 : Array_Type (1 .. L2);
> +      I3 : Integer;
> +   end record;
> +
> +   function Get (L1, L2 : Natural) return Record_Type is
> +      Result : Record_Type (L1, L2);
> +   begin
> +      Result.I1 := 1;
> +      for I in Result.A1'Range loop
> +         Result.A1 (I) := I;
> +      end loop;
> +      Result.I2 := 2;
> +      for I in Result.A2'Range loop
> +         Result.A2 (I) := I;
> +      end loop;
> +      Result.I3 := 3;
> +      return Result;
> +   end Get;
> +
> +   R1 : Record_Type := Get (0, 0);
> +   R2 : Record_Type := Get (1, 0);
> +   R3 : Record_Type := Get (0, 1);
> +   R4 : Record_Type := Get (2, 2);
> +
> +   procedure Process (R : Record_Type) is
> +   begin
> +      null;
> +   end Process;
> +
> +begin
> +   Process (R1);
> +   Process (R2);
> +   Process (R3);
> +   Process (R4);
> +end Debug9;
> diff --git a/gcc/testsuite/gnat.dg/dwarf/debug9.py b/gcc/testsuite/gnat.dg/dwarf/debug9.py
> new file mode 100644
> index 00000000000..560f69d4ec7
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/debug9.py
> @@ -0,0 +1,22 @@
> +import dwarfutils
> +from dwarfutils.data import Capture, DIE, Matcher
> +from testutils import check, print_pass, print_fail
> +
> +
> +cu = dwarfutils.parse_dwarf()
> +cu_die = cu.root
> +
> +# Check that array and structure types are not declared as compilation
> +# unit-level types.
> +types = cu.find(
> +    predicate=lambda die: die.tag in ('DW_TAG_structure_type',
> +                                      'DW_TAG_array_type'),
> +    single=False
> +)
> +
> +global_types = [t for t in types if t.parent == cu_die]
> +check(not global_types, 'check composite types are not global')
> +if global_types:
> +    print('Global types:')
> +    for t in global_types:
> +        print('  {}'.format(t))
> diff --git a/gcc/testsuite/gnat.dg/dwarf/dwarf.exp b/gcc/testsuite/gnat.dg/dwarf/dwarf.exp
> new file mode 100644
> index 00000000000..cbf21a9829a
> --- /dev/null
> +++ b/gcc/testsuite/gnat.dg/dwarf/dwarf.exp
> @@ -0,0 +1,39 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Testsuite driver for testcases that check the DWARF output with Python
> +# scripts.
> +
> +load_lib gnat-dg.exp
> +load_lib gcc-python.exp
> +load_lib gcc-dwarf.exp
> +
> +# This series of tests requires a working Python interpreter and a supported
> +# host tool to dump DWARF.
> +if { ![check-python-available] || ![detect-dwarf-dump-tool] } {
> +    return
> +}
> +
> +# Initialize `dg'.
> +dg-init
> +
> +# Main loop.
> +if {[check-python-available]} {
> +    dg-runtest [lsort [glob $srcdir/$subdir/*.adb]] "" ""
> +}
> +
> +# All done.
> +dg-finish
> diff --git a/gcc/testsuite/lib/gcc-dwarf.exp b/gcc/testsuite/lib/gcc-dwarf.exp
> new file mode 100644
> index 00000000000..5e0e6117e16
> --- /dev/null
> +++ b/gcc/testsuite/lib/gcc-dwarf.exp
> @@ -0,0 +1,41 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Helpers to run tools to dump DWARF
> +
> +load_lib "remote.exp"
> +
> +# Look for a tool that we can use to dump DWARF. If nothing is found, return 0.
> +#
> +# If one is found, return 1, set the DWARF_DUMP_TOOL_KIND environment variable
> +# to contain the class of tool detected (e.g. objdump) and set the
> +# DWARF_DUMP_TOOL to the name of the tool program (e.g. arm-eabi-objdump).
> +
> +proc detect-dwarf-dump-tool { args } {
> +
> +    # Look for an objdump corresponding to the current target
> +    set objdump [transform objdump]
> +    set result [local_exec "which $objdump" "" "" 300]
> +    set status [lindex $result 0]
> +
> +    if { $status == 0 } {
> +       setenv DWARF_DUMP_TOOL_KIND objdump
> +       setenv DWARF_DUMP_TOOL $objdump
> +       return 1
> +    }
> +
> +    return 0
> +}
> diff --git a/gcc/testsuite/python/dwarfutils/__init__.py b/gcc/testsuite/python/dwarfutils/__init__.py
> new file mode 100644
> index 00000000000..246fbbd15be
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/__init__.py
> @@ -0,0 +1,70 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Helpers to parse and check DWARF in object files.
> +#
> +# The purpose of these is to make it easy to write "smart" tests on DWARF
> +# information: pattern matching on DIEs and their attributes, check links
> +# between DIEs, etc. Doing these checks using abstract representations of DIEs
> +# is far easier than scanning the generated assembly!
> +
> +import os
> +import sys
> +
> +import dwarfutils.objdump
> +
> +
> +# Fetch the DWARF parsing function that corresponds to the DWARF dump tool to
> +# use.
> +DWARF_DUMP_TOOL_KIND = os.environ['DWARF_DUMP_TOOL_KIND']
> +DWARF_DUMP_TOOL = os.environ['DWARF_DUMP_TOOL']
> +
> +dwarf_parsers = {
> +    'objdump': dwarfutils.objdump.parse_dwarf,
> +}
> +try:
> +    dwarf_parser = dwarf_parsers[DWARF_DUMP_TOOL_KIND]
> +except KeyError:
> +    raise RuntimeError('Unhandled DWARF dump tool: {}'.format(
> +        DWARF_DUMP_TOOL_KIND
> +    ))
> +
> +
> +def parse_dwarf(object_file=None, single_cu=True):
> +    """
> +    Fetch and decode DWARF compilation units in `object_file`.
> +
> +    If `single_cu` is True, return the only compilation unit (None if there is
> +    none, ValueError if several). Otherwise, return the units as a list.
> +
> +    :param str|None object_file: Name of the object file to process. If left to
> +        None, `sys.argv[1]` is used instead.
> +
> +    :rtype: dwarfutils.data.CompilationUnit
> +           |list[dwarfutils.data.CompilationUnit]
> +    """
> +    if object_file is None:
> +        object_file = sys.argv[1]
> +    result = dwarf_parser(object_file)
> +
> +    if single_cu:
> +        if not result:
> +            return None
> +        if len(result) > 1:
> +            raise ValueError('Multiple compilation units found')
> +        return result[0]
> +    else:
> +        return result
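To make the `single_cu` contract easier to review, here is a minimal standalone sketch of the selection logic at the end of parse_dwarf. The `select_units` name is hypothetical and plain strings stand in for CompilationUnit objects; it only mirrors the branch quoted above and is not part of the patch.

```python
def select_units(units, single_cu=True):
    """Hypothetical stand-in for the tail of dwarfutils.parse_dwarf."""
    if not single_cu:
        # Caller wants every compilation unit, possibly an empty list
        return units
    if not units:
        # No compilation unit at all: the patch returns None here
        return None
    if len(units) > 1:
        raise ValueError('Multiple compilation units found')
    return units[0]


only = select_units(['cu0'])
nothing = select_units([])
everything = select_units(['cu0', 'cu1'], single_cu=False)
```

Note that a testcase calling `parse_dwarf()` on an object file without debug info would get None back, so a following `.find(...)` call would fail with an AttributeError rather than a ValueError.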
> diff --git a/gcc/testsuite/python/dwarfutils/data.py b/gcc/testsuite/python/dwarfutils/data.py
> new file mode 100644
> index 00000000000..6b91d5bd779
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/data.py
> @@ -0,0 +1,597 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# Data structures to represent DWARF compilation units, DIEs and attributes,
> +# and helpers to perform various checks on them.
> +
> +from testutils import check
> +
> +
> +class Abbrev(object):
> +    """DWARF abbreviation entry."""
> +
> +    def __init__(self, number, tag, has_children):
> +        """
> +        :param int number: Abbreviation number, which is 1-based, as in the
> +            DWARF standard.
> +        :param str|int tag: Tag name or, if unknown, tag number.
> +        :param bool has_children: Whether DIEs will have children.
> +        """
> +        self.number = number
> +        self.tag = tag
> +        self.has_children = has_children
> +        self.attributes = []
> +
> +    def add_attribute(self, name, form):
> +        """
> +        :param str|int name: Attribute name or, if unknown, attribute number.
> +        :param str form: Form for this attribute.
> +        """
> +        self.attributes.append((name, form))
> +
> +
> +class CompilationUnit(object):
> +    """DWARF compilation unit."""
> +
> +    def __init__(self, offset, length, is_32bit, version, abbrevs,
> +                 pointer_size):
> +        """
> +        :param int offset: Offset of this compilation unit in the .debug_info
> +            section.
> +        :param int length: Value of the length field for this compilation unit.
> +        :param bool is_32bit: Whether this compilation unit is encoded in the
> +            32-bit format. If not, it must be the 64-bit one.
> +        :param int version: DWARF version used by this compilation unit.
> +        :param list[Abbrev] abbrevs: List of abbreviations for this compilation
> +            unit.
> +        :param int pointer_size: Size of pointers for this architecture.
> +        """
> +        self.offset = offset
> +        self.length = length
> +        self.is_32bit = is_32bit
> +        self.version = version
> +        self.abbrevs = abbrevs
> +        self.pointer_size = pointer_size
> +
> +        self.root = None
> +        self.offset_to_die = {}
> +
> +    def set_root(self, die):
> +        assert self.root is None, ('Trying to create the root DIE of a'
> +                                   ' compilation unit that already has one')
> +        self.root = die
> +
> +    def find(self, *args, **kwargs):
> +        return self.root.find(*args, **kwargs)
> +
> +
> +class DIE(object):
> +    """DWARF information entry."""
> +
> +    def __init__(self, cu, level, offset, abbrev_number):
> +        """
> +        :param CompilationUnit cu: Compilation unit this DIE belongs to.
> +        :param int level: Depth for this DIE.
> +        :param int offset: Offset of this DIE in the .debug_info section.
> +        :param int abbrev_number: Abbreviation number for this DIE.
> +        """
> +        self.cu = cu
> +        self.cu.offset_to_die[offset] = self
> +
> +        self.level = level
> +        self.offset = offset
> +        self.abbrev_number = abbrev_number
> +
> +        self.parent = None
> +        self.attributes = []
> +        self.children = []
> +
> +    @property
> +    def abbrev(self):
> +        """Abbreviation for this DIE.
> +
> +        :rtype: Abbrev
> +        """
> +        # The abbreviation number is 1-based, but list indexes are 0-based
> +        return self.cu.abbrevs[self.abbrev_number - 1]
> +
> +    @property
> +    def tag(self):
> +        """Tag for this DIE.
> +
> +        :rtype: str|int
> +        """
> +        return self.abbrev.tag
> +
> +    @property
> +    def has_children(self):
> +        return self.abbrev.has_children
> +
> +    def get_attr(self, name, single=True, or_error=True):
> +        """Look for an attribute in this DIE.
> +
> +        :param str|int name: Attribute name, or number if name is unknown.
> +        :param bool single: If true, this will raise a KeyError for
> +            zero/multiple matches and return an Attribute instance when found.
> +            Otherwise, return a potentially empty list of attributes.
> +        :param bool or_error: When False, if `single` is True and no attribute
> +            is found, return None instead of raising a KeyError.
> +        :rtype: Attribute|list[Attribute]
> +        """
> +        result = [a for a in self.attributes if a.name == name]
> +
> +        if single:
> +            if not result:
> +                if or_error:
> +                    raise KeyError('No {} attribute in {}'.format(name, self))
> +                else:
> +                    return None
> +            if len(result) > 1:
> +                raise KeyError('Multiple {} attributes in {}'.format(name,
> +                                                                     self))
> +            return result[0]
> +        else:
> +            return result
> +
> +    def check_attr(self, name, value):
> +        """Check for the presence/value of an attribute.
> +
> +        :param str|int name: Attribute name, or number if name is unknown.
> +        :param value: If None, check that the attribute is not present.
> +            Otherwise, check that the attribute exists and that its value
> +            matches `value`.
> +        """
> +        m = MatchResult()
> +        Matcher._match_attr(self, name, value, m)
> +        check(
> +            m.succeeded,
> +            m.mismatch_reason or 'check attribute {} of {}'.format(name, self)
> +        )
> +
> +    def get_child(self, child_index):
> +        """Get a DIE child.
> +
> +        :param int child_index: Index of the child to fetch (zero-based index).
> +        :rtype: DIE
> +        """
> +        return self.children[child_index]
> +
> +    @property
> +    def name(self):
> +        """Return the name (DW_AT_name) for this DIE, if any.
> +
> +        :rtype: str|None
> +        """
> +        name = self.get_attr('DW_AT_name', or_error=False)
> +        return name.value if name is not None else None
> +
> +    def __str__(self):
> +        tag = (self.tag if isinstance(self.tag, str) else
> +               'DIE {}'.format(self.tag))
> +        name = self.name
> +        fmt = '{tag} "{name}"' if name else '{tag}'
> +        return fmt.format(tag=tag, name=name)
> +
> +    def __repr__(self):
> +        return '<{} at {:#x}>'.format(self, self.offset)
> +
> +    def matches(self, tag=None, name=None):
> +        """Return whether this DIE matches expectations.
> +
> +        :rtype: bool
> +        """
> +        return ((tag is None or self.tag == tag) and
> +                (name is None or self.name == name))
> +
> +    def tree_matches(self, matcher):
> +        """Match this DIE against the given match object.
> +
> +        :param Matcher matcher: Match object used to check the structure of
> +            this DIE.
> +        :rtype: MatchResult
> +        """
> +        return matcher.matches(self)
> +
> +    def tree_check(self, matcher):
> +        """Like `tree_matches`, but also check that the DIE matches."""
> +        m = self.tree_matches(matcher)
> +        check(
> +            m.succeeded,
> +            m.mismatch_reason or 'check structure of {}'.format(self)
> +        )
> +        return m
> +
> +    def find(self, predicate=None, tag=None, name=None, recursive=True,
> +             single=True):
> +        """Look for a DIE that satisfies the given expectations.
> +
> +        :param None|(DIE) -> bool predicate: If provided, function that filters
> +            out DIEs when it returns False.
> +        :param str|int|None tag: If provided, filter out DIEs whose tag does
> +            not match.
> +        :param str|None name: If provided, filter out DIEs whose name (see
> +            the `name` property) does not match.
> +        :param bool recursive: If True, perform the search recursively in
> +            self's children.
> +        :param bool single: If True, look for a single DIE and raise a
> +            ValueError if none or several DIEs are found. Otherwise, return a
> +            potentially empty list of DIEs.
> +
> +        :rtype: DIE|list[DIE]
> +        """
> +        def p(die):
> +            return ((predicate is None or predicate(die)) and
> +                    die.matches(tag, name))
> +        result = self._find(p, recursive)
> +
> +        if single:
> +            if not result:
> +                raise ValueError('No matching DIE found')
> +            if len(result) > 1:
> +                raise ValueError('Multiple matching DIEs found')
> +            return result[0]
> +        else:
> +            return result
> +
> +    def _find(self, predicate, recursive):
> +        result = []
> +
> +        if predicate(self):
> +            result.append(self)
> +
> +        for c in self.children:
> +            if not recursive:
> +                if predicate(c):
> +                    result.append(c)
> +            else:
> +                result.extend(c._find(predicate, recursive))
> +
> +        return result
> +
> +    def next_attribute_form(self, name):
> +        """Return the form of the next attribute this DIE requires.
> +
> +        Used during DIE tree construction.
> +
> +        :param str name: Expected name for this attribute. The abbreviation
> +            will confirm it.
> +        :rtype: str
> +        """
> +        assert len(self.attributes) < len(self.abbrev.attributes)
> +        expected_name, form = self.abbrev.attributes[len(self.attributes)]
> +        assert name == expected_name, (
> +            'Attribute desynchronization in {}'.format(self)
> +        )
> +        return form
> +
> +    def add_attribute(self, name, form, offset, value):
> +        """Add an attribute to this DIE.
> +
> +        Used during DIE tree construction. See Attribute's constructor for the
> +        meaning of arguments.
> +        """
> +        self.attributes.append(Attribute(self, name, form, offset, value))
> +
> +    def add_child(self, child):
> +        """Add a DIE child to this DIE.
> +
> +        Used during DIE tree construction.
> +
> +        :param DIE child: DIE to append.
> +        """
> +        assert self.has_children
> +        assert child.parent is None
> +        child.parent = self
> +        self.children.append(child)
> +
> +
> +class Attribute(object):
> +    """DIE attribute."""
> +
> +    def __init__(self, die, name, form, offset, value):
> +        """
> +        :param DIE die: DIE that will own this attribute.
> +        :param str|int name: Attribute name, or attribute number if unknown.
> +        :param str form: Attribute form.
> +        :param int offset: Offset of this attribute in the .debug_info section.
> +        :param value: Decoded value for this attribute. If it's a Defer
> +            instance, decoding will happen the first time the "value" property
> +            is evaluated.
> +        """
> +        self.die = die
> +        self.name = name
> +        self.form = form
> +        self.offset = offset
> +
> +        if isinstance(value, Defer):
> +            self._value = None
> +            self._value_getter = value
> +        else:
> +            self._value = value
> +            self._value_getter = None
> +            self._refine_value()
> +
> +    @property
> +    def value(self):
> +        if self._value_getter:
> +            self._value = self._value_getter.get()
> +            self._value_getter = None
> +            self._refine_value()
> +        return self._value
> +
> +    def _refine_value(self):
> +        # If we hold a location expression, bind it to this attribute
> +        if isinstance(self._value, Exprloc):
> +            self._value.attribute = self
> +
> +    def __repr__(self):
> +        label = (self.name if isinstance(self.name, str) else
> +                 'Attribute {}'.format(self.name))
> +        return '<{} at {:#x}>'.format(label, self.offset)
> +
> +
> +class Exprloc(object):
> +    """DWARF location expression."""
> +
> +    def __init__(self, byte_list, operations):
> +        """
> +        :param list[int] byte_list: List of bytes that encode this expression.
> +        :param list[(str, ...)] operations: List of operations this expression
> +            contains. Each operation is a tuple whose first element is the
> +            opcode name (DW_OP_...) and whose other elements are operands.
> +        """
> +        self.attribute = None
> +        self.byte_list = byte_list
> +        self.operations = operations
> +
> +    @property
> +    def die(self):
> +        return self.attribute.die
> +
> +    @staticmethod
> +    def format_operation(operation):
> +        opcode = operation[0]
> +        operands = operation[1:]
> +        return '{}: {}'.format(opcode, ' '.join(operands))
> +
> +    def matches(self, operations):
> +        """Match this list of operations to `operations`.
> +
> +        :param list[(str, ...)] operations: List of operations to match.
> +        :rtype: bool
> +        """
> +        return self.operations == operations
> +
> +    def __repr__(self):
> +        return '{} ({})'.format(
> +            ' '.join(hex(b) for b in self.byte_list),
> +            '; '.join(self.format_operation(op) for op in self.operations)
> +        )
> +
> +
> +class Defer(object):
> +    """Helper to defer a computation."""
> +
> +    def __init__(self, func):
> +        """
> +        :param () -> T func: Callback to perform the computation.
> +        """
> +        self.func = func
> +
> +    def get(self):
> +        """
> +        :rtype: T
> +        """
> +        return self.func()
> +
> +
> +class Matcher(object):
> +    """Specification for DIE tree pattern matching."""
> +
> +    def __init__(self, tag=None, name=None, attrs=None, children=None,
> +                 capture=None):
> +        """
> +        :param None|str tag: If provided, name of the tag that DIEs must match.
> +        :param None|str name: If provided, name that DIEs must match (see the
> +            DIE.name property).
> +        :param attrs: If provided, dictionary that specifies attribute
> +            expectations. Keys are attribute names. Values can be:
> +
> +              * None, so that attribute must be undefined in the DIE;
> +              * a value, so that attribute must be defined and the value must
> +                match;
> +              * a Capture instance, so that the attribute value (or None, if
> +                undefined) is captured.
> +
> +        :param None|list[Matcher|Capture] children: If provided, list of
> +            matchers that children must match. Capture instances match any DIE
> +            and capture it.
> +
> +        :param str|None capture: If provided, capture the matched DIE under the
> +            given name.
> +        """
> +        self.tag = tag
> +        self.name = name
> +        self.attrs = attrs
> +        self.children = children
> +        self.capture_name = capture
> +
> +    def matches(self, die):
> +        """Pattern match the given DIE.
> +
> +        :param DIE die: DIE to match.
> +        :rtype: MatchResult
> +        """
> +        result = MatchResult()
> +        self._match_die(die, result)
> +        return result
> +
> +    def _match_die(self, die, result):
> +        """Helper for the "matches" method.
> +
> +        Return whether DIE could be matched. If not, a message to describe why
> +        is recorded in `result`.
> +
> +        :param DIE die: DIE to match.
> +        :param MatchResult result: Holder for the result of the match.
> +        :rtype: bool
> +        """
> +
> +        # If asked to, check the DIE tag
> +        if self.tag is not None and self.tag != die.tag:
> +            result.mismatch_reason = '{} is expected to be a {}'.format(
> +                die, self.tag
> +            )
> +            return False
> +
> +        # If asked to, check the DIE name
> +        if self.name is not None and self.name != die.name:
> +            result.mismatch_reason = (
> +                '{} is expected to be called "{}"'.format(die,
> +                                                          self.name)
> +            )
> +            return False
> +
> +        # Check attribute expectations
> +        if self.attrs:
> +            for n, v in self.attrs.items():
> +                if not self._match_attr(die, n, v, result):
> +                    return False
> +
> +        # Check children expectations
> +        if self.children is not None:
> +
> +            # The number of children must match
> +            if len(self.children) != len(die.children):
> +                result.mismatch_reason = (
> +                    '{} has {} children, {} expected'.format(
> +                        die, len(die.children), len(self.children)
> +                    )
> +                )
> +                return False
> +
> +            # Then each child must match the corresponding child matcher
> +            for matcher_child, die_child in zip(self.children,
> +                                                die.children):
> +                # Capture instances match anything and capture it
> +                if isinstance(matcher_child, Capture):
> +                    result.dict[matcher_child.name] = die_child
> +
> +                elif not matcher_child._match_die(die_child, result):
> +                    return False
> +
> +        # Capture the input DIE if asked to
> +        if self.capture_name:
> +            result.dict[self.capture_name] = die
> +
> +        # If no check failed, the DIE matches the pattern
> +        return True
> +
> +    @staticmethod
> +    def _match_attr(die, attr_name, attr_value, result):
> +        """Helper for the "matches" method.
> +
> +        Return whether the `attr_name` attribute in DIE matches the
> +        `attr_value` expectation. If not, a message to describe why is recorded
> +        in `result`.
> +
> +        :param DIE die: DIE that contains the attribute to match.
> +        :param str attr_name: Attribute name.
> +        :param attr_value: Attribute expectation. See attrs's description in
> +            Matcher.__init__ docstring for possible values.
> +        """
> +        attr = die.get_attr(attr_name, or_error=False)
> +
> +        if attr_value is None:
> +            # The attribute is expected not to be defined
> +            if attr is None:
> +                return True
> +
> +            result.mismatch_reason = (
> +                '{} has a {} attribute, none expected'.format(
> +                    die, attr_name
> +                )
> +            )
> +            return False
> +
> +        # Capture instances match anything and capture it
> +        if isinstance(attr_value, Capture):
> +            result.dict[attr_value.name] = attr
> +            return True
> +
> +        # If we reach this point, the attribute is supposed to be defined:
> +        # check it is.
> +        if attr is None:
> +            result.mismatch_reason = (
> +                '{} is missing a {} attribute'.format(die, attr_name)
> +            )
> +            return False
> +
> +        # Check the value of the attribute matches
> +        if isinstance(attr.value, Exprloc):
> +            is_matching = attr.value.matches(attr_value)
> +        else:
> +            is_matching = attr.value == attr_value
> +        if not is_matching:
> +            result.mismatch_reason = (
> +                '{}: {} is {}, expected to be {}'.format(
> +                    die, attr_name, attr.value, attr_value
> +                )
> +            )
> +            return False
> +
> +        # If no check failed, the attribute matches the pattern
> +        return True
> +
> +
> +class Capture(object):
> +    """Placeholder in Matcher tree patterns.
> +
> +    This is used to capture specific elements during pattern matching.
> +    """
> +    def __init__(self, name):
> +        """
> +        :param str name: Capture name.
> +        """
> +        self.name = name
> +
> +
> +class MatchResult(object):
> +    """Holder for the result of a DIE tree pattern match."""
> +
> +    def __init__(self):
> +        self.dict = {}
> +
> +        self.mismatch_reason = None
> +        """
> +        If left to None, the match succeeded. Otherwise, must be set to a
> +        string that describes why the match failed.
> +
> +        :type: None|str
> +        """
> +
> +    @property
> +    def succeeded(self):
> +        return self.mismatch_reason is None
> +
> +    def capture(self, name):
> +        """Return what has been captured by the `name` capture.
> +
> +        This is valid iff the match succeeded.
> +
> +        :param str name: Capture name.
> +        """
> +        return self.dict[name]
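For readers following the Matcher/Capture design, here is a simplified, self-contained sketch of the same tree-matching idea. `FakeDIE` and `match` are hypothetical stand-ins, patterns are plain `(tag, attrs, children)` tuples, and captured attributes are raw values rather than Attribute instances, so this illustrates the technique and is not the patch's API.

```python
class Capture(object):
    """Placeholder that matches anything and records it under `name`."""
    def __init__(self, name):
        self.name = name


class FakeDIE(object):
    """Hypothetical stand-in for dwarfutils.data.DIE."""
    def __init__(self, tag, attrs=None, children=None):
        self.tag = tag
        self.attrs = attrs or {}
        self.children = children or []


def match(pattern, die, captures):
    """Return whether `die` matches `pattern`, filling `captures`."""
    tag, attrs, children = pattern
    if tag is not None and tag != die.tag:
        return False
    for name, expected in (attrs or {}).items():
        actual = die.attrs.get(name)
        if isinstance(expected, Capture):
            captures[expected.name] = actual
        elif actual != expected:
            # Also covers expected=None: the attribute must be absent
            return False
    if children is not None:
        # The number of children must match, then each child in order
        if len(children) != len(die.children):
            return False
        for child_pattern, child in zip(children, die.children):
            if isinstance(child_pattern, Capture):
                captures[child_pattern.name] = child
            elif not match(child_pattern, child, captures):
                return False
    return True


int_type = FakeDIE('DW_TAG_base_type',
                   {'DW_AT_name': 'int', 'DW_AT_byte_size': 4})
var = FakeDIE('DW_TAG_variable', {'DW_AT_name': 'x'})
cu = FakeDIE('DW_TAG_compile_unit', children=[int_type, var])

captures = {}
ok = match(
    ('DW_TAG_compile_unit', None, [
        ('DW_TAG_base_type', {'DW_AT_byte_size': Capture('size')}, None),
        Capture('second_child'),
    ]),
    cu, captures
)
```

As in the patch, captures recorded before a later mismatch stay in the dictionary, which is why MatchResult.capture is documented as valid only when the match succeeded.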
> diff --git a/gcc/testsuite/python/dwarfutils/helpers.py b/gcc/testsuite/python/dwarfutils/helpers.py
> new file mode 100644
> index 00000000000..f5e77896ae6
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/helpers.py
> @@ -0,0 +1,11 @@
> +import sys
> +
> +
> +def as_ascii(str_or_byte):
> +    """
> +    Python 2/3 compatibility helper.
> +
> +    In Python 2, just return the input. In Python 3, decode the input as ASCII.
> +    """
> +    return (str_or_byte if sys.version_info.major < 3 else
> +            str_or_byte.decode('ascii'))
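A short sketch of how this helper is meant to be used on tool output, as far as I can tell from objdump.py further down. The byte string below is hand-written for illustration and is not captured from a real objdump run.

```python
import sys


def as_ascii(str_or_byte):
    # Same body as the helper quoted above
    return (str_or_byte if sys.version_info.major < 3 else
            str_or_byte.decode('ascii'))


# Illustrative bytes shaped like tool output (assumption, not real output)
raw_output = b'Contents of the .debug_info section:\n\n  Version:       4\n'

# Drop blank lines, decode each remaining line and strip trailing whitespace
lines = [as_ascii(line).rstrip()
         for line in raw_output.splitlines()
         if line.strip()]
```

Under Python 3, passing an already decoded `str` would raise an AttributeError, so callers have to feed it the raw `subprocess.check_output` bytes.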
> diff --git a/gcc/testsuite/python/dwarfutils/objdump.py b/gcc/testsuite/python/dwarfutils/objdump.py
> new file mode 100644
> index 00000000000..52cfc06c03b
> --- /dev/null
> +++ b/gcc/testsuite/python/dwarfutils/objdump.py
> @@ -0,0 +1,338 @@
> +# Copyright (C) 2017 Free Software Foundation, Inc.
> +
> +# This program is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3 of the License, or
> +# (at your option) any later version.
> +#
> +# This program is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +#
> +# You should have received a copy of the GNU General Public License
> +# along with GCC; see the file COPYING3.  If not see
> +# <http://www.gnu.org/licenses/>.
> +
> +# objdump-based DWARF parser
> +
> +# TODO: for now, this assumes that there is only one compilation unit per
> +# object file. Support for several units should be added later if needed.
> +
> +import re
> +import subprocess
> +
> +import dwarfutils
> +from dwarfutils.data import Abbrev, CompilationUnit, Defer, DIE, Exprloc
> +from dwarfutils.helpers import as_ascii
> +
> +
> +abbrev_tag_re = re.compile(r'\s+(?P<number>\d+)'
> +                           r'\s+(?P<tag>DW_TAG_[a-zA-Z0-9_]+)'
> +                           r'\s+\[(?P<has_children>.*)\]')
> +attr_re = re.compile(r'\s+(?P<attr>DW_AT(_[a-zA-Z0-9_]+| value: \d+))'
> +                     r'\s+(?P<form>DW_FORM(_[a-zA-Z0-9_]+| value: \d+))')
> +
> +compilation_unit_re = re.compile(r'\s+Compilation Unit @ offset'
> +                                 r' (?P<offset>0x[0-9a-f]+):')
> +compilation_unit_attr_re = re.compile(r'\s+(?P<name>[A-Z][a-zA-Z ]*):'
> +                                      r'\s+(?P<value>.*)')
> +die_re = re.compile(r'\s+<(?P<level>\d+)>'
> +                    r'<(?P<offset>[0-9a-f]+)>:'
> +                    r' Abbrev Number: (?P<abbrev_number>\d+)'
> +                    r'( \((?P<tag>DW_TAG_[a-zA-Z0-9_]+)\))?')
> +die_attr_re = re.compile(r'\s+<(?P<offset>[0-9a-f]+)>'
> +                         r'\s+(?P<attr>DW_AT_[a-zA-Z0-9_]+)'
> +                         r'\s*: (?P<value>.*)')
> +
> +indirect_string_re = re.compile(r'\(indirect string, offset: 0x[0-9a-f]+\):'
> +                                r' (?P<value>.*)')
> +language_re = re.compile(r'(?P<number>\d+)\s+\((?P<name>.*)\)')
> +block_re = re.compile(r'\d+ byte block: (?P<value>[0-9a-f ]+)')
> +loc_expr_re = re.compile(r'\d+ byte block:'
> +                         r' (?P<bytes>[0-9a-f ]+)'
> +                         r'\s+\((?P<expr>.*)\)')
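To check my reading of these patterns, here is a small standalone sketch that applies `die_re` and `die_attr_re` (copied from the hunk above) to a few sample lines. The sample lines are hand-written in the general shape of `objdump --dwarf=info` output, so treat them as an assumption rather than real tool output.

```python
import re

# Copied from the patch above
die_re = re.compile(r'\s+<(?P<level>\d+)>'
                    r'<(?P<offset>[0-9a-f]+)>:'
                    r' Abbrev Number: (?P<abbrev_number>\d+)'
                    r'( \((?P<tag>DW_TAG_[a-zA-Z0-9_]+)\))?')
die_attr_re = re.compile(r'\s+<(?P<offset>[0-9a-f]+)>'
                         r'\s+(?P<attr>DW_AT_[a-zA-Z0-9_]+)'
                         r'\s*: (?P<value>.*)')

# Hand-written sample lines (assumption)
sample = [
    ' <1><2d>: Abbrev Number: 2 (DW_TAG_base_type)',
    '    <2e>   DW_AT_byte_size   : 4',
    ' <1><34>: Abbrev Number: 0',
]

die = die_re.match(sample[0])
attr = die_attr_re.match(sample[1])
end = die_re.match(sample[2])

# Offsets are printed in hexadecimal without a 0x prefix
die_fields = (int(die.group('level')), int(die.group('offset'), 16),
              die.group('tag'))
attr_fields = (int(attr.group('offset'), 16), attr.group('attr'),
               attr.group('value'))
```

The last sample shows the end-of-children marker: the optional tag group does not participate, so `end.group('tag')` is None, which is what parse_dwarf keys on.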
> +
> +
> +def parse_dwarf(object_file):
> +    """
> +    Implementation of dwarfutils.parse_dwarf for objdump.
> +
> +    Run objdump on `object_file` and parse the list of compilation units it
> +    contains.
> +
> +    :param str object_file: Name of the object file to process.
> +    :rtype: list[CompilationUnit]
> +    """
> +    abbrevs = parse_abbrevs(object_file)
> +
> +    lines = [as_ascii(line).rstrip()
> +             for line in subprocess.check_output(
> +                 [dwarfutils.DWARF_DUMP_TOOL, '--dwarf=info', object_file]
> +             ).splitlines()
> +             if line.strip()]
> +    i = [0]
> +    def next_line():
> +        if i[0] >= len(lines):
> +            return None
> +        i[0] += 1
> +        return lines[i[0] - 1]
> +
> +    result = []
> +    die_stack = []
> +    last_die = None
> +
> +    while True:
> +        line = next_line()
> +        if line is None:
> +            break
> +
> +        # Try to match the beginning of a compilation unit
> +        m = compilation_unit_re.match(line)
> +        if m:
> +            offset = int(m.group('offset'), 16)
> +
> +            attrs = {}
> +            while i[0] < len(lines):
> +                m = compilation_unit_attr_re.match(next_line())
> +                if not m:
> +                    i[0] -= 1
> +                    break
> +                attrs[m.group('name')] = m.group('value')
> +
> +            length, is_32bit = attrs['Length'].split()
> +            length = int(length, 16)
> +            is_32bit = is_32bit == '(32-bit)'
> +
> +            version = int(attrs['Version'])
> +            abbrev_offset = int(attrs['Abbrev Offset'], 16)
> +            pointer_size = int(attrs['Pointer Size'])
> +
> +            assert abbrev_offset == 0, ('Multiple compilation units are not'
> +                                        ' handled for now')
> +            abbrevs_sublist = list(abbrevs)
> +
> +            result.append(CompilationUnit(offset, length, is_32bit, version,
> +                                          abbrevs_sublist, pointer_size))
> +            continue
> +
> +        # Try to match the beginning of a DIE
> +        m = die_re.match(line)
> +        if m:
> +            assert result, 'Invalid DIE: missing containing compilation unit'
> +            cu = result[-1]
> +
> +            level = int(m.group('level'))
> +            offset = int(m.group('offset'), 16)
> +            abbrev_number = int(m.group('abbrev_number'))
> +            tag = m.group('tag')
> +
> +            assert level == len(die_stack)
> +
> +            # The end of child list is represented as a special DIE with
> +            # abbreviation number 0.
> +            if tag is None:
> +                assert abbrev_number == 0
> +                die_stack.pop()
> +                continue
> +
> +            die = DIE(cu, level, offset, abbrev_number)
> +            last_die = die
> +            assert die.tag == tag, 'Unexpected tag for {}: got {}'.format(
> +                die, tag
> +            )
> +            if die_stack:
> +                die_stack[-1].add_child(die)
> +            else:
> +                cu.set_root(die)
> +            if die.has_children:
> +                die_stack.append(die)
> +            continue
> +
> +        # Try to match an attribute
> +        m = die_attr_re.match(line)
> +        if m:
> +            assert last_die is not None, 'Invalid attribute: missing containing DIE'
> +            die = last_die
> +
> +            offset = int(m.group('offset'), 16)
> +            name = m.group('attr')
> +            value = m.group('value')
> +
> +            form = die.next_attribute_form(name)
> +            try:
> +                value_decoder = value_decoders[form]
> +            except KeyError:
> +                pass
> +            else:
> +                try:
> +                    value = value_decoder(die, name, form, offset, value)
> +                except ValueError:
> +                    print('Error while decoding {} ({}) at {:#x}: {}'.format(
> +                        name, form, offset, value
> +                    ))
> +                    raise
> +            die.add_attribute(name, form, offset, value)
> +            continue
> +
> +        # Otherwise, we must be processing "header" text before the dump
> +        # itself: just discard it.
> +        assert not result, 'Unhandled output: ' + line
> +
> +    return result
> +
> +
> +def parse_abbrevs(object_file):
> +    """
> +    Run objdump on `object_file` and parse the list of abbreviations it
> +    contains.
> +
> +    :param str object_file: Name of the object file to process.
> +    :rtype: list[Abbrev]
> +    """
> +    result = []
> +
> +    for line in subprocess.check_output(
> +        [dwarfutils.DWARF_DUMP_TOOL, '--dwarf=abbrev', object_file]
> +    ).splitlines():
> +        line = as_ascii(line).rstrip()
> +        if not line:
> +            continue
> +
> +        # Try to match a new abbreviation
> +        m = abbrev_tag_re.match(line)
> +        if m:
> +            number = int(m.group('number'))
> +            tag = m.group('tag')
> +            has_children = m.group('has_children')
> +            assert has_children in ('has children', 'no children')
> +            has_children = has_children == 'has children'
> +
> +            result.append(Abbrev(number, tag, has_children))
> +            continue
> +
> +        # Try to match an attribute
> +        m = attr_re.match(line)
> +        if m:
> +            assert result, 'Invalid attribute: missing containing abbreviation'
> +            name = m.group('attr')
> +            form = m.group('form')
> +
> +            # When objdump finds unknown attribute numbers or unknown form
> +            # numbers, it cannot turn them into names.
> +            if name.startswith('DW_AT value'):
> +                name = int(name.split()[-1])
> +            if form.startswith('DW_FORM value'):
> +                form = int(form.split()[-1])
> +
> +            # The (0, 0) couple marks the end of the attribute list
> +            if name != 0 or form != 0:
> +                result[-1].add_attribute(name, form)
> +            continue
> +
> +        # Otherwise, we must be processing "header" text before the dump
> +        # itself: just discard it.
> +        assert not result, 'Unhandled output: ' + line
> +
> +    return result
> +
> +
> +# Decoders for attribute values
> +
> +def _decode_flag_present(die, name, form, offset, value):
> +    return True
> +
> +
> +def _decode_flag(die, name, form, offset, value):
> +    return bool(int(value))
> +
> +
> +def _decode_data(die, name, form, offset, value):
> +    if name == 'DW_AT_language':
> +        m = language_re.match(value)
> +        assert m, 'Unhandled language value: {}'.format(value)
> +        return m.group('name')
> +
> +    elif name == 'DW_AT_encoding':
> +        m = language_re.match(value)
> +        assert m, 'Unhandled encoding value: {}'.format(value)
> +        return m.group('name')
> +
> +    return int(value, 16) if value.startswith('0x') else int(value)
> +
> +
> +def _decode_ref(die, name, form, offset, value):
> +    assert value[0] == '<' and value[-1] == '>'
> +    offset = int(value[1:-1], 16)
> +    return Defer(lambda: die.cu.offset_to_die[offset])
> +
> +
> +def _decode_indirect_string(die, name, form, offset, value):
> +    m = indirect_string_re.match(value)
> +    assert m, 'Unhandled indirect string: ' + value
> +    return m.group('value')
> +
> +
> +def _decode_block(die, name, form, offset, value, no_exprloc=False):
> +    if (
> +        not no_exprloc and
> +        name in ('DW_AT_location', 'DW_AT_data_member_location')
> +    ):
> +        return _decode_exprloc(die, name, form, offset, value)
> +
> +    m = block_re.match(value)
> +    assert m, 'Unhandled block value: {}'.format(value)
> +    return [int(b, 16) for b in m.group('value').split()]
> +
> +
> +def _decode_exprloc(die, name, form, offset, value):
> +    m = loc_expr_re.match(value)
> +    if not m:
> +        # Even though they have the expected DW_FORM_exprloc form, objdump does
> +        # not decode some location expressions such as DW_AT_byte_size. In this
> +        # case, return a dummy block decoding instead.
> +        # TODO: implement raw bytes parsing into expressions instead.
> +        return _decode_block(die, name, form, offset, value, no_exprloc=True)
> +
> +    byte_list = [int(b, 16) for b in m.group('bytes').split()]
> +
> +    expr = m.group('expr')
> +    operations = []
> +    for op in expr.split('; '):
> +        chunks = op.split(': ', 1)
> +        assert len(chunks) <= 2, (
> +            'Unhandled DWARF expression operation: {}'.format(op)
> +        )
> +        opcode = chunks[0]
> +        operands = chunks[1].split() if len(chunks) == 2 else []
> +        operations.append((opcode, ) + tuple(operands))
> +
> +    return Exprloc(byte_list, operations)
> +
> +
> +value_decoders = {
> +    'DW_FORM_flag_present': _decode_flag_present,
> +    'DW_FORM_flag': _decode_flag,
> +
> +    'DW_FORM_data1': _decode_data,
> +    'DW_FORM_data2': _decode_data,
> +    'DW_FORM_data4': _decode_data,
> +    'DW_FORM_data8': _decode_data,
> +    'DW_FORM_sdata': _decode_data,
> +    'DW_FORM_udata': _decode_data,
> +
> +    'DW_FORM_ref4': _decode_ref,
> +    'DW_FORM_ref8': _decode_ref,
> +
> +    'DW_FORM_strp': _decode_indirect_string,
> +
> +    'DW_FORM_block': _decode_block,
> +    'DW_FORM_block1': _decode_block,
> +    'DW_FORM_block2': _decode_block,
> +    'DW_FORM_block4': _decode_block,
> +    'DW_FORM_block8': _decode_block,
> +    'DW_FORM_exprloc': _decode_exprloc,
> +
> +    # TODO: handle all existing forms
> +}
> --
> 2.13.0
>
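
For readers following the `_decode_exprloc` logic quoted above, here is a minimal standalone sketch of its splitting step. The helper name `split_operations` is hypothetical; it only reproduces the `'; '` / `': '` splitting shown in the patch and assumes the expression text has already been extracted from objdump's output.

```python
def split_operations(expr):
    """Split a textual DWARF expression into (opcode, operand...) tuples.

    Hypothetical helper: mirrors the loop in the quoted _decode_exprloc,
    where operations are separated by '; ' and an opcode is separated
    from its operands by ': '.
    """
    operations = []
    for op in expr.split('; '):
        chunks = op.split(': ', 1)
        opcode = chunks[0]
        # Operands stay as strings, exactly as in the patch.
        operands = chunks[1].split() if len(chunks) == 2 else []
        operations.append((opcode,) + tuple(operands))
    return operations


if __name__ == '__main__':
    print(split_operations('DW_OP_addr: 0'))
    print(split_operations('DW_OP_fbreg: -20; DW_OP_deref'))
```

This is why the testcases compare against lists such as `[('DW_OP_addr', '0')]`: operands are kept as strings rather than converted to integers.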

Pierre-Marie de Rodat July 27, 2017, 8:59 a.m. UTC | #3

On 07/26/2017 07:09 PM, David Malcolm wrote:
>> +    If `single_cu` is True, make sure there is exactly one
>> compilation unit and
> 
> "is True" -> "is true"

Fixed.

>> +        :param bool or_error: When True, if `single` is True and no
>> attribute
> 
> "True" -> "true" in two places

Fixed.

>> +        :param None|(DIE) -> bool predicate: If provided, function
>> that filters
>> +            out DIEs when it returns False.

You did not suggest it, but I replaced “False” with “false” to be
consistent. ;-)

>> +        :param bool single: If True, look for a single DIE and raise
>> a
> 
> "True" -> "true", I suppose

Fixed.

>> +        If left to None, the match succeded. Otherwise, must be set
> 
> 
> "succeded" -> "succeeded"

Fixed.

>> +        This is valid iff the match succeded.
> 
> here again.

Likewise.

>> +    In Python 2, just return the input. In Python 3, decode the
>> input as ASCII.
>> +    """
>> +    return (str_or_byte if sys.version_info.major < 3 else
>> +            str_or_byte.decode('ascii'))
> 
> Aha!  Python 2 and Python 3.
> 
> 
> Presumably this all runs with LANG=C so that there's no danger of any
> non-ASCII bytes?  (bytes.decode('ascii') will raise a UnicodeDecodeError
> if any byte >=128).

I’m not sure about the interaction with the locale. My reasoning was:
I’ve never seen non-ASCII strings in DWARF, nor in objdump’s output. I
know it’s theoretically possible: if that happens in the future (for
example, if some language allows non-ASCII identifiers and yields
non-ASCII names in DWARF), we’ll only have this function to fix.
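
To make the failure mode concrete, here is a minimal sketch, assuming Python 3. `as_ascii` is copied from the patch; `as_text` is a hypothetical lenient variant (not part of the patch) showing what a later fix to this single function could look like.

```python
import sys


def as_ascii(str_or_byte):
    """As in the patch: identity on Python 2, ASCII decoding on Python 3."""
    return (str_or_byte if sys.version_info.major < 3 else
            str_or_byte.decode('ascii'))


def as_text(str_or_byte, encoding='utf-8'):
    """Hypothetical lenient variant: never raises on unexpected bytes."""
    if sys.version_info.major < 3:
        return str_or_byte
    # Undecodable bytes become U+FFFD instead of raising.
    return str_or_byte.decode(encoding, errors='replace')


if __name__ == '__main__':
    print(as_ascii(b'DW_AT_name'))
    try:
        as_ascii(b'caf\xc3\xa9')  # UTF-8 bytes for a non-ASCII name
    except UnicodeDecodeError as exc:
        print('strict decoding failed: {}'.format(exc))
    print(as_text(b'caf\xc3\xa9'))
```

The strict version fails loudly on the first byte >= 128, which is arguably the right default for a testsuite as long as no such output is expected.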

> There's a fair amount of non-trivial parsing going on here.
> I wonder if it would be helpful to add a "unittest" suite for the
> parsing?
> (e.g. to have some precanned fragments of objdump output as strings,
> and to verify that they're parsed as expected).
> 
> Note that I'm not a reviewer for the testsuite, so this is just a
> suggestion.

That’s a good idea. Actually I think it will be very easy to write such
tests *and* to assess Python code coverage for them. I’ll do this if
this proposal is on track to be accepted.
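
As a rough idea of what such a test could look like, here is a self-contained sketch. Everything in it is an assumption made for illustration: the two regular expressions, the canned text and `parse_abbrev_text` are hypothetical stand-ins, only loosely modelled on `objdump --dwarf=abbrev` output and on the `parse_abbrevs` loop; real objdump output may differ between binutils versions.

```python
import re
import unittest

# Hypothetical patterns, not the ones used in the patch.
ABBREV_TAG_RE = re.compile(
    r'^\s*(?P<number>\d+)\s+(?P<tag>DW_TAG_\w+)'
    r'\s+\[(?P<has_children>has children|no children)\]$'
)
ATTR_RE = re.compile(
    r'^\s*(?P<attr>DW_AT(?:_\w+| value: \d+))'
    r'\s+(?P<form>DW_FORM(?:_\w+| value: \d+))$'
)

# Precanned fragment, written by hand for this sketch.
CANNED_ABBREVS = """\
   1      DW_TAG_compile_unit    [has children]
    DW_AT_producer     DW_FORM_strp
    DW_AT_language     DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
   2      DW_TAG_base_type    [no children]
    DW_AT_byte_size    DW_FORM_data1
    DW_AT value: 0     DW_FORM value: 0
"""


def parse_abbrev_text(text):
    """Parse abbreviation text into a list of plain dictionaries."""
    result = []
    for line in text.splitlines():
        line = line.rstrip()
        if not line:
            continue
        m = ABBREV_TAG_RE.match(line)
        if m:
            result.append({
                'number': int(m.group('number')),
                'tag': m.group('tag'),
                'has_children': m.group('has_children') == 'has children',
                'attributes': [],
            })
            continue
        m = ATTR_RE.match(line)
        if m:
            name, form = m.group('attr'), m.group('form')
            if name.startswith('DW_AT value'):
                name = int(name.split()[-1])
            if form.startswith('DW_FORM value'):
                form = int(form.split()[-1])
            # The (0, 0) couple marks the end of the attribute list.
            if name != 0 or form != 0:
                result[-1]['attributes'].append((name, form))
    return result


class ParseAbbrevTest(unittest.TestCase):
    def test_canned_fragment(self):
        abbrevs = parse_abbrev_text(CANNED_ABBREVS)
        self.assertEqual(
            [(a['number'], a['tag'], a['has_children']) for a in abbrevs],
            [(1, 'DW_TAG_compile_unit', True),
             (2, 'DW_TAG_base_type', False)])
        self.assertEqual(
            abbrevs[0]['attributes'],
            [('DW_AT_producer', 'DW_FORM_strp'),
             ('DW_AT_language', 'DW_FORM_data1')])
```

Saved as a module, this can be run with `python -m unittest <module>`. Splitting the text parsing from the `subprocess` call, as `parse_abbrev_text` does here, is what makes precanned fragments usable without a real object file.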

> Hope this is constructive

It totally was: thank you very much!

Pierre-Marie de Rodat July 27, 2017, 10:08 a.m. UTC | #4

On 07/27/2017 10:36 AM, Richard Biener wrote:
> Given that gdb can decode dwarf and we rely on gdb for guality and
> gdb has python scripting can we somehow walk its dwarf tree from
> within a python script?  That is, not need the dwarf decoding or
> objdump requirement?

I’m quite familiar with GDB’s Python scripting API and unfortunately, 
no, it does not provide any access to raw debugging information: 
<https://sourceware.org/gdb/onlinedocs/gdb/Python-API.html>. All we have 
is access to ~source-level entities such as variables, functions and 
types (and “objfiles” themselves, but we can’t do anything interesting 
with them), so there is no other way than testing dynamic behavior,
i.e. checking that variables are properly read/decoded, etc. which is 
what we already do in guality tests.

> On IRC I suggested to use pre-existing python DWARF decoders
> which we might be able to import into the tree.  We'd still need them
> to handle non-ELF object formats or somehow extract DWARF from
> other containers to an ELF file (objcopy to the rescue...).
> 
> That said, not needing to write a DWARF / object file decoder
> would be nice.

Yes. On IRC, I mentioned pyelftools
(https://github.com/eliben/pyelftools/), which knows about ELF and
DWARF, and into which, I think, we could plug some PE/XCOFF/… extractor
to parse embedded DWARF. In any case, I feel it would not be simpler
than what I sent. Of course I’m still open to suggestions. :-)

> I see your testcases have associated .py files.  There are a few
> existing "simple" dwarf testcases that would benefit from being
> able to embed matching into the testcase source file itself?  Thus
> have TCL autogenerate a .py file for the testing from, say
> 
> /* { dg-final { scan-dwarf { "Matcher('DW_TAG_member', 'i',
>                        attrs={'DW_AT_type': Capture('s0_i_type')})" } } } */
> 
> do you think that's feasible or doesn't it make much sense because
> it would essentially match anywhere?  Or we'd end up with a
> gazillion of scan-dwarf variants?

I think this is a good idea! If it is technically possible to have such 
multi-line statements in comments, I think this would be easy. I’ll 
prepare the engine for the next patchset version and I’ll try to find 
existing tests that could be rewritten this way. As long as the pattern
isn’t too generic, I think it would make sense: for instance if the
input source has only one structure field called “i”, then the above 
pattern will make it possible to match its type precisely.
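
The "too generic" concern can be sketched with a toy model. This is not the `Matcher` engine from the patch: `ToyMember` and `find_member` are hypothetical names, and the member list is transcribed by hand from sso.c in this series, where both S0 and S1 have a field called “i”.

```python
from collections import namedtuple

# Hypothetical stand-in for a DW_TAG_member DIE: enclosing struct + name.
ToyMember = namedtuple('ToyMember', 'struct name')

# Members of the structures declared in sso.c.
SSO_MEMBERS = [
    ToyMember('S0', 'i'),
    ToyMember('S1', 'i'),
    ToyMember('S1', 's'),
    ToyMember('S2', 'a'),
    ToyMember('S2', 's'),
]


def find_member(members, name, struct=None):
    """Return the single member matching the pattern, or raise ValueError."""
    matches = [m for m in members
               if m.name == name and (struct is None or m.struct == struct)]
    if len(matches) != 1:
        raise ValueError('Expected exactly one match for {!r}, got {}'.format(
            name, len(matches)))
    return matches[0]


if __name__ == '__main__':
    print(find_member(SSO_MEMBERS, 'a'))               # unique name
    print(find_member(SSO_MEMBERS, 'i', struct='S1'))  # disambiguated
    try:
        find_member(SSO_MEMBERS, 'i')                  # matches S0.i and S1.i
    except ValueError as exc:
        print(exc)
```

So a pattern embedded in a source file would work well for a name like “a” here, while a bare pattern on “i” would need some context (such as the enclosing structure) to be precise.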

> I think a separate .py for checking is required anyway for the more
> complex cases.
I think so as well, for instance for the tests I sent so far.

diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp b/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
new file mode 100644
index 00000000000..5c49bc81a55
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/dwarf2-py.exp
@@ -0,0 +1,52 @@ 
+# Copyright (C) 2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# 
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Testsuite driver for testcases that check the DWARF output with Python
+# scripts.
+
+load_lib gcc-dg.exp
+load_lib gcc-python.exp
+load_lib gcc-dwarf.exp
+
+# This series of tests requires a working Python interpreter and a supported
+# host tool to dump DWARF.
+if { ![check-python-available] || ![detect-dwarf-dump-tool] } {
+    return
+}
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors -gdwarf"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+if {[check-python-available]} {
+    set comp_output [gcc_target_compile \
+	"$srcdir/$subdir/../trivial.c" "trivial.S" assembly \
+	"additional_flags=-gdwarf"]
+    if { ! [string match "*: target system does not support the * debug format*" \
+	$comp_output] } {
+	remove-build-file "trivial.S"
+	dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\] ] ] "" $DEFAULT_CFLAGS
+    }
+}
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
new file mode 100644
index 00000000000..f7429a58179
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.c
@@ -0,0 +1,19 @@ 
+/* { dg-do assemble } */
+/* { dg-options "-gdwarf-3" } */
+/* { dg-final { python-test sso.py } } */
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define REVERSE_SSO __attribute__((scalar_storage_order("big-endian")))
+#else
+#define REVERSE_SSO __attribute__((scalar_storage_order("little-endian")))
+#endif
+
+struct S0 { int i; };
+
+struct S1 { int i; struct S0 s; } REVERSE_SSO;
+
+struct S2 { int a[4]; struct S0 s; } REVERSE_SSO;
+
+struct S0 s0;
+struct S1 s1;
+struct S2 s2;
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
new file mode 100644
index 00000000000..0c95abfe2b8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/sso.py
@@ -0,0 +1,52 @@ 
+import dwarfutils
+from dwarfutils.data import Capture, DIE, Matcher
+from testutils import check
+
+
+cu = dwarfutils.parse_dwarf()
+s0 = cu.find(tag='DW_TAG_structure_type', name='S0')
+s1 = cu.find(tag='DW_TAG_structure_type', name='S1')
+s2 = cu.find(tag='DW_TAG_structure_type', name='S2')
+
+# Check the DIE structure of these structure types
+m0 = s0.tree_check(Matcher(
+    'DW_TAG_structure_type', 'S0',
+    children=[Matcher('DW_TAG_member', 'i',
+                      attrs={'DW_AT_type': Capture('s0_i_type')})]
+))
+m1 = s1.tree_check(Matcher(
+    'DW_TAG_structure_type', 'S1',
+    children=[
+        Matcher('DW_TAG_member', 'i',
+                attrs={'DW_AT_type': Capture('s1_i_type')}),
+        Matcher('DW_TAG_member', 's', attrs={'DW_AT_type': s0}),
+    ]
+))
+m2 = s2.tree_check(Matcher(
+    'DW_TAG_structure_type', 'S2',
+    children=[
+        Matcher('DW_TAG_member', 'a',
+                attrs={'DW_AT_type': Capture('s2_a_type')}),
+        Matcher('DW_TAG_member', 's', attrs={'DW_AT_type': s0}),
+    ]
+))
+
+# Now check that their scalar members have expected types
+s0_i_type = m0.capture('s0_i_type').value
+s1_i_type = m1.capture('s1_i_type').value
+s2_a_type = m2.capture('s2_a_type').value
+
+# S0.i must not have a DW_AT_endianity attribute.  S1.i must have one.
+s0_i_type.tree_check(Matcher('DW_TAG_base_type',
+                             attrs={'DW_AT_endianity': None}))
+s1_i_type.tree_check(Matcher('DW_TAG_base_type',
+                             attrs={'DW_AT_endianity': True}))
+
+# So does the integer type that S2.a contains.
+ma = s2_a_type.tree_check(Matcher(
+    'DW_TAG_array_type',
+    attrs={'DW_AT_type': Capture('element_type')}
+))
+element_type = ma.capture('element_type').value
+check(element_type == s1_i_type,
+      'check element type of S2.a is type of S1.i')
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
new file mode 100644
index 00000000000..e77adc0eaf5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.c
@@ -0,0 +1,13 @@ 
+/* PR 23190 */
+/* { dg-do assemble } */
+/* { dg-options "-O2 -gdwarf" } */
+/* { dg-final { python-test var2.py } } */
+
+static int foo;
+int bar;
+int main(void)
+{
+   foo += 3;
+   bar *= 5;
+   return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
new file mode 100644
index 00000000000..9a9b2c4a4ca
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/dwarf2-py/var2.py
@@ -0,0 +1,11 @@ 
+import dwarfutils
+from dwarfutils.data import Capture, DIE, Matcher
+from testutils import check
+
+
+cu = dwarfutils.parse_dwarf()
+foo = cu.find(tag='DW_TAG_variable', name='foo')
+bar = cu.find(tag='DW_TAG_variable', name='bar')
+
+foo.check_attr('DW_AT_location', [('DW_OP_addr', '0')])
+bar.check_attr('DW_AT_location', [('DW_OP_addr', '0')])
diff --git a/gcc/testsuite/gnat.dg/dg.exp b/gcc/testsuite/gnat.dg/dg.exp
index 228c71e85bb..dff86600957 100644
--- a/gcc/testsuite/gnat.dg/dg.exp
+++ b/gcc/testsuite/gnat.dg/dg.exp
@@ -18,6 +18,7 @@ 
 
 # Load support procs.
 load_lib gnat-dg.exp
+load_lib gcc-python.exp
 
 # If a testcase doesn't have special options, use these.
 global DEFAULT_CFLAGS
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug11.adb b/gcc/testsuite/gnat.dg/dwarf/debug11.adb
new file mode 100644
index 00000000000..a87470925f1
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug11.adb
@@ -0,0 +1,19 @@ 
+--  { dg-options "-cargs -O0 -g -dA -fgnat-encodings=minimal -margs" }
+--  { dg-do assemble }
+--  { dg-final { python-test debug11.py } }
+
+with Ada.Text_IO;
+
+procedure Debug11 is
+   type Rec_Type (C : Character) is record
+      case C is
+         when 'Z' .. Character'Val (128) => I : Integer;
+         when others                     => null;
+      end case;
+   end record;
+   --  R : Rec_Type := ('Z', 2);
+   R : Rec_Type ('Z');
+begin
+   R.I := 0;
+   Ada.Text_IO.Put_Line ("" & R.C);
+end Debug11;
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug11.py b/gcc/testsuite/gnat.dg/dwarf/debug11.py
new file mode 100644
index 00000000000..26c3fdfeeda
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug11.py
@@ -0,0 +1,51 @@ 
+import dwarfutils
+from dwarfutils.data import Capture, DIE, Matcher
+from testutils import check, print_pass
+
+
+cu = dwarfutils.parse_dwarf()
+rec_type = cu.find(tag='DW_TAG_structure_type', name='debug11__rec_type')
+
+check(rec_type.parent.matches(tag='DW_TAG_subprogram', name='debug11'),
+      'check that rec_type appears in the expected context')
+
+# Check that rec_type has the expected DIE structure
+m = rec_type.tree_check(Matcher(
+    'DW_TAG_structure_type', 'debug11__rec_type',
+    children=[
+        Matcher('DW_TAG_member', 'c', capture='c'),
+        Matcher(
+            'DW_TAG_variant_part',
+            attrs={'DW_AT_discr': Capture('discr')},
+            children=[
+                Matcher(
+                    'DW_TAG_variant',
+                    attrs={'DW_AT_discr_list': Capture('discr_list'),
+                           'DW_AT_discr_value': None},
+                    children=[
+                        Matcher('DW_TAG_member', 'i'),
+                    ]
+                ),
+                Matcher(
+                    'DW_TAG_variant',
+                    attrs={'DW_AT_discr_list': None,
+                           'DW_AT_discr_value': None},
+                    children=[]
+                )
+            ]
+        )
+    ]
+))
+
+# Check that DW_AT_discr refers to the expected DW_TAG_member
+c = m.capture('c')
+discr = m.capture('discr')
+check(c == discr.value, 'check that discriminant is {}'.format(discr.value))
+
+# Check that DW_AT_discr_list has the expected content: the C discriminant must
+# be properly described as unsigned, hence the 0x5a ('Z') and 0x80 0x01 (128)
+# values in the DW_AT_discr_list attribute. If it was described as signed, we
+# would have instead 90 and -128.
+discr_list = m.capture('discr_list')
+check(discr_list.value == [0x1, 0x5a, 0x80, 0x1],
+      'check discriminant list')
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug12.adb b/gcc/testsuite/gnat.dg/dwarf/debug12.adb
new file mode 100644
index 00000000000..1fa9f27aa9b
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug12.adb
@@ -0,0 +1,10 @@ 
+--  { dg-options "-cargs -gdwarf-4 -margs" }
+--  { dg-do assemble }
+--  { dg-final { python-test debug12.py } }
+
+package body Debug12 is
+   function Get_A2 return Boolean is
+   begin
+      return A2;
+   end Get_A2;
+end Debug12;
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug12.ads b/gcc/testsuite/gnat.dg/dwarf/debug12.ads
new file mode 100644
index 00000000000..dbc5896cc73
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug12.ads
@@ -0,0 +1,8 @@ 
+package Debug12 is
+   type Bit_Array is array (Positive range <>) of Boolean
+      with Pack;
+   A  : Bit_Array := (1 .. 10 => False);
+   A2 : Boolean renames A (2);
+
+   function Get_A2 return Boolean;
+end Debug12;
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug12.py b/gcc/testsuite/gnat.dg/dwarf/debug12.py
new file mode 100644
index 00000000000..41e589b2ff1
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug12.py
@@ -0,0 +1,9 @@ 
+import dwarfutils
+from dwarfutils.data import Capture, DIE, Matcher
+from testutils import check
+
+
+cu = dwarfutils.parse_dwarf()
+
+a2 = cu.find(tag='DW_TAG_variable', name='debug12__a2___XR_debug12__a___XEXS2')
+a2.check_attr('DW_AT_location', [('DW_OP_const1s', '-1')])
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug9.adb b/gcc/testsuite/gnat.dg/dwarf/debug9.adb
new file mode 100644
index 00000000000..9ed66b55cdf
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug9.adb
@@ -0,0 +1,45 @@ 
+--  { dg-options "-cargs -g -fgnat-encodings=minimal -dA -margs" }
+--  { dg-do assemble }
+--  { dg-final { python-test debug9.py } }
+
+procedure Debug9 is
+   type Array_Type is array (Natural range <>) of Integer;
+   type Record_Type (L1, L2 : Natural) is record
+      I1 : Integer;
+      A1 : Array_Type (1 .. L1);
+      I2 : Integer;
+      A2 : Array_Type (1 .. L2);
+      I3 : Integer;
+   end record;
+
+   function Get (L1, L2 : Natural) return Record_Type is
+      Result : Record_Type (L1, L2);
+   begin
+      Result.I1 := 1;
+      for I in Result.A1'Range loop
+         Result.A1 (I) := I;
+      end loop;
+      Result.I2 := 2;
+      for I in Result.A2'Range loop
+         Result.A2 (I) := I;
+      end loop;
+      Result.I3 := 3;
+      return Result;
+   end Get;
+
+   R1 : Record_Type := Get (0, 0);
+   R2 : Record_Type := Get (1, 0);
+   R3 : Record_Type := Get (0, 1);
+   R4 : Record_Type := Get (2, 2);
+
+   procedure Process (R : Record_Type) is
+   begin
+      null;
+   end Process;
+
+begin
+   Process (R1);
+   Process (R2);
+   Process (R3);
+   Process (R4);
+end Debug9;
diff --git a/gcc/testsuite/gnat.dg/dwarf/debug9.py b/gcc/testsuite/gnat.dg/dwarf/debug9.py
new file mode 100644
index 00000000000..560f69d4ec7
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/debug9.py
@@ -0,0 +1,22 @@ 
+import dwarfutils
+from dwarfutils.data import Capture, DIE, Matcher
+from testutils import check, print_pass, print_fail
+
+
+cu = dwarfutils.parse_dwarf()
+cu_die = cu.root
+
+# Check that array and structure types are not declared as compilation
+# unit-level types.
+types = cu.find(
+    predicate=lambda die: die.tag in ('DW_TAG_structure_type',
+                                      'DW_TAG_array_type'),
+    single=False
+)
+
+global_types = [t for t in types if t.parent == cu_die]
+check(not global_types, 'check composite types are not global')
+if global_types:
+    print('Global types:')
+    for t in global_types:
+        print('  {}'.format(t))
diff --git a/gcc/testsuite/gnat.dg/dwarf/dwarf.exp b/gcc/testsuite/gnat.dg/dwarf/dwarf.exp
new file mode 100644
index 00000000000..cbf21a9829a
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/dwarf/dwarf.exp
@@ -0,0 +1,39 @@ 
+# Copyright (C) 2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# 
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Testsuite driver for testcases that check the DWARF output with Python
+# scripts.
+
+load_lib gnat-dg.exp
+load_lib gcc-python.exp
+load_lib gcc-dwarf.exp
+
+# This series of tests requires a working Python interpreter and a supported
+# host tool to dump DWARF.
+if { ![check-python-available] || ![detect-dwarf-dump-tool] } {
+    return
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+if {[check-python-available]} {
+    dg-runtest [lsort [glob $srcdir/$subdir/*.adb]] "" ""
+}
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/lib/gcc-dwarf.exp b/gcc/testsuite/lib/gcc-dwarf.exp
new file mode 100644
index 00000000000..5e0e6117e16
--- /dev/null
+++ b/gcc/testsuite/lib/gcc-dwarf.exp
@@ -0,0 +1,41 @@ 
+# Copyright (C) 2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+# 
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+# 
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Helpers to run tools to dump DWARF
+
+load_lib "remote.exp"
+
+# Look for a tool that we can use to dump DWARF. If nothing is found, return 0.
+#
+# If one is found, return 1, set the DWARF_DUMP_TOOL_KIND environment variable
+# to contain the class of tool detected (e.g. objdump) and set the
+# DWARF_DUMP_TOOL to the name of the tool program (e.g. arm-eabi-objdump).
+
+proc detect-dwarf-dump-tool { args } {
+
+    # Look for an objdump corresponding to the current target
+    set objdump [transform objdump]
+    set result [local_exec "which $objdump" "" "" 300]
+    set status [lindex $result 0]
+
+    if { $status == 0 } {
+	setenv DWARF_DUMP_TOOL_KIND objdump
+	setenv DWARF_DUMP_TOOL $objdump
+	return 1
+    }
+
+    return 0
+}
diff --git a/gcc/testsuite/python/dwarfutils/__init__.py b/gcc/testsuite/python/dwarfutils/__init__.py
new file mode 100644
index 00000000000..246fbbd15be
--- /dev/null
+++ b/gcc/testsuite/python/dwarfutils/__init__.py
@@ -0,0 +1,70 @@ 
+# Copyright (C) 2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Helpers to parse and check DWARF in object files.
+#
+# The purpose of these is to make it easy to write "smart" tests on DWARF
+# information: pattern matching on DIEs and their attributes, check links
+# between DIEs, etc. Doing these checks using abstract representations of DIEs
+# is far easier than scanning the generated assembly!
+
+import os
+import sys
+
+import dwarfutils.objdump
+
+
+# Fetch the DWARF parsing function that corresponds to the DWARF dump tool to
+# use.
+DWARF_DUMP_TOOL_KIND = os.environ['DWARF_DUMP_TOOL_KIND']
+DWARF_DUMP_TOOL = os.environ['DWARF_DUMP_TOOL']
+
+dwarf_parsers = {
+    'objdump': dwarfutils.objdump.parse_dwarf,
+}
+try:
+    dwarf_parser = dwarf_parsers[DWARF_DUMP_TOOL_KIND]
+except KeyError:
+    raise RuntimeError('Unhandled DWARF dump tool: {}'.format(
+        DWARF_DUMP_TOOL_KIND
+    ))
+
+
+def parse_dwarf(object_file=None, single_cu=True):
+    """
+    Fetch and decode DWARF compilation units in `object_file`.
+
+    If `single_cu` is True, make sure there is exactly one compilation unit and
+    return it. Otherwise, return compilation units as a list.
+
+    :param str|None object_file: Name of the object file to process. If left to
+        None, `sys.argv[1]` is used instead.
+
+    :rtype: dwarfutils.data.CompilationUnit
+           |list[dwarfutils.data.CompilationUnit]
+    """
+    if object_file is None:
+        object_file = sys.argv[1]
+    result = dwarf_parser(object_file)
+
+    if single_cu:
+        if not result:
+            return None
+        if len(result) > 1:
+            raise ValueError('Multiple compilation units found')
+        return result[0]
+    else:
+        return result
diff --git a/gcc/testsuite/python/dwarfutils/data.py b/gcc/testsuite/python/dwarfutils/data.py
new file mode 100644
index 00000000000..6b91d5bd779
--- /dev/null
+++ b/gcc/testsuite/python/dwarfutils/data.py
@@ -0,0 +1,597 @@ 
+# Copyright (C) 2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Data structures to represent DWARF compilation units, DIEs and attributes,
+# and helpers to perform various checks on them.
+
+from testutils import check
+
+
+class Abbrev(object):
+    """DWARF abbreviation entry."""
+
+    def __init__(self, number, tag, has_children):
+        """
+        :param int number: Abbreviation number, which is 1-based, as in the
+            DWARF standard.
+        :param str|int tag: Tag name or, if unknown, tag number.
+        :param bool has_children: Whether DIEs will have children.
+        """
+        self.number = number
+        self.tag = tag
+        self.has_children = has_children
+        self.attributes = []
+
+    def add_attribute(self, name, form):
+        """
+        :param str|int name: Attribute name or, if unknown, attribute number.
+        :param str form: Form for this attribute.
+        """
+        self.attributes.append((name, form))
+
+
+class CompilationUnit(object):
+    """DWARF compilation unit."""
+
+    def __init__(self, offset, length, is_32bit, version, abbrevs,
+                 pointer_size):
+        """
+        :param int offset: Offset of this compilation unit in the .debug_info
+            section.
+        :param int length: Value of the length field for this compilation unit.
+        :param bool is_32bit: Whether this compilation unit is encoded in the
+            32-bit format. If not, it must be the 64-bit one.
+        :param int version: DWARF version used by this compilation unit.
+        :param list[Abbrev] abbrevs: List of abbreviations for this compilation
+            unit.
+        :param int pointer_size: Size of pointers for this architecture.
+        """
+        self.offset = offset
+        self.length = length
+        self.is_32bit = is_32bit
+        self.version = version
+        self.abbrevs = abbrevs
+        self.pointer_size = pointer_size
+
+        self.root = None
+        self.offset_to_die = {}
+
+    def set_root(self, die):
+        assert self.root is None, ('Trying to create the root DIE of a'
+                                   ' compilation unit that already has one')
+        self.root = die
+
+    def find(self, *args, **kwargs):
+        return self.root.find(*args, **kwargs)
+
+
+class DIE(object):
+    """DWARF information entry."""
+
+    def __init__(self, cu, level, offset, abbrev_number):
+        """
+        :param CompilationUnit cu: Compilation unit this DIE belongs to.
+        :param int level: Depth for this DIE.
+        :param int offset: Offset of this DIE in the .debug_info section.
+        :param int abbrev_number: Abbreviation number for this DIE.
+        """
+        self.cu = cu
+        self.cu.offset_to_die[offset] = self
+
+        self.level = level
+        self.offset = offset
+        self.abbrev_number = abbrev_number
+
+        self.parent = None
+        self.attributes = []
+        self.children = []
+
+    @property
+    def abbrev(self):
+        """Abbreviation for this DIE.
+
+        :rtype: Abbrev
+        """
+        # The abbreviation number is 1-based, but list indexes are 0-based
+        return self.cu.abbrevs[self.abbrev_number - 1]
+
+    @property
+    def tag(self):
+        """Tag for this DIE.
+
+        :rtype: str|int
+        """
+        return self.abbrev.tag
+
+    @property
+    def has_children(self):
+        return self.abbrev.has_children
+
+    def get_attr(self, name, single=True, or_error=True):
+        """Look for an attribute in this DIE.
+
+        :param str|int name: Attribute name, or number if name is unknown.
+        :param bool single: If true, this will raise a KeyError for
+            zero/multiple matches and return an Attribute instance when found.
+            Otherwise, return a potentially empty list of attributes.
+        :param bool or_error: When False, if `single` is True and no attribute
+            is found, return None instead of raising a KeyError.
+        :rtype: Attribute|list[Attribute]
+        """
+        result = [a for a in self.attributes if a.name == name]
+
+        if single:
+            if not result:
+                if or_error:
+                    raise KeyError('No {} attribute in {}'.format(name, self))
+                else:
+                    return None
+            if len(result) > 1:
+                raise KeyError('Multiple {} attributes in {}'.format(name,
+                                                                     self))
+            return result[0]
+        else:
+            return result
+
+    def check_attr(self, name, value):
+        """Check for the presence/value of an attribute.
+
+        :param str|int name: Attribute name, or number if name is unknown.
+        :param value: If None, check that the attribute is not present.
+            Otherwise, check that the attribute exists and that its value
+            matches `value`.
+        """
+        m = MatchResult()
+        Matcher._match_attr(self, name, value, m)
+        check(
+            m.succeeded,
+            m.mismatch_reason or 'check attribute {} of {}'.format(name, self)
+        )
+
+    def get_child(self, child_index):
+        """Get a DIE child.
+
+        :param int child_index: Index of the child to fetch (zero-based index).
+        :rtype: DIE
+        """
+        return self.children[child_index]
+
+    @property
+    def name(self):
+        """Return the name (DW_AT_name) for this DIE, if any.
+
+        :rtype: str|None
+        """
+        name = self.get_attr('DW_AT_name', or_error=False)
+        return name.value if name is not None else None
+
+    def __str__(self):
+        tag = (self.tag if isinstance(self.tag, str) else
+               'DIE {}'.format(self.tag))
+        name = self.name
+        fmt = '{tag} "{name}"' if name else '{tag}'
+        return fmt.format(tag=tag, name=name)
+
+    def __repr__(self):
+        return '<{} at {:#x}>'.format(self, self.offset)
+
+    def matches(self, tag=None, name=None):
+        """Return whether this DIE matches expectations.
+
+        :rtype: bool
+        """
+        return ((tag is None or self.tag == tag) and
+                (name is None or self.name == name))
+
+    def tree_matches(self, matcher):
+        """Match this DIE against the given match object.
+
+        :param Matcher matcher: Match object used to check the structure of
+            this DIE.
+        :rtype: MatchResult
+        """
+        return matcher.matches(self)
+
+    def tree_check(self, matcher):
+        """Like `tree_matches`, but also check that the DIE matches."""
+        m = self.tree_matches(matcher)
+        check(
+            m.succeeded,
+            m.mismatch_reason or 'check structure of {}'.format(self)
+        )
+        return m
+
+    def find(self, predicate=None, tag=None, name=None, recursive=True,
+             single=True):
+        """Look for a DIE that satisfies the given expectations.
+
+        :param None|(DIE) -> bool predicate: If provided, function that filters
+            out DIEs when it returns False.
+        :param str|int|None tag: If provided, filter out DIEs whose tag does
+            not match.
+        :param str|None name: If provided, filter out DIEs whose name (see
+            the `name` property) does not match.
+        :param bool recursive: If True, perform the search recursively in
+            self's children.
+        :param bool single: If True, look for a single DIE and raise a
+            ValueError if none or several DIEs are found. Otherwise, return a
+            potentially empty list of DIEs.
+
+        :rtype: DIE|list[DIE]
+        """
+        def p(die):
+            return ((predicate is None or predicate(die)) and
+                    die.matches(tag, name))
+        result = self._find(p, recursive)
+
+        if single:
+            if not result:
+                raise ValueError('No matching DIE found')
+            if len(result) > 1:
+                raise ValueError('Multiple matching DIEs found')
+            return result[0]
+        else:
+            return result
+
+    def _find(self, predicate, recursive):
+        result = []
+
+        if predicate(self):
+            result.append(self)
+
+        for c in self.children:
+            if not recursive:
+                if predicate(c):
+                    result.append(c)
+            else:
+                result.extend(c._find(predicate, recursive))
+
+        return result
+
+    def next_attribute_form(self, name):
+        """Return the form of the next attribute this DIE requires.
+
+        Used during DIE tree construction.
+
+        :param str name: Expected name for this attribute. The abbreviation
+            will confirm it.
+        :rtype: str
+        """
+        assert len(self.attributes) < len(self.abbrev.attributes)
+        expected_name, form = self.abbrev.attributes[len(self.attributes)]
+        assert name == expected_name, (
+            'Attribute desynchronization in {}'.format(self)
+        )
+        return form
+
+    def add_attribute(self, name, form, offset, value):
+        """Add an attribute to this DIE.
+
+        Used during DIE tree construction. See Attribute's constructor for the
+        meaning of arguments.
+        """
+        self.attributes.append(Attribute(self, name, form, offset, value))
+
+    def add_child(self, child):
+        """Add a DIE child to this DIE.
+
+        Used during DIE tree construction.
+
+        :param DIE child: DIE to append.
+        """
+        assert self.has_children
+        assert child.parent is None
+        child.parent = self
+        self.children.append(child)
+
+
+class Attribute(object):
+    """DIE attribute."""
+
+    def __init__(self, die, name, form, offset, value):
+        """
+        :param DIE die: DIE that will own this attribute.
+        :param str|int name: Attribute name, or attribute number if unknown.
+        :param str form: Attribute form.
+        :param int offset: Offset of this attribute in the .debug_info section.
+        :param value: Decoded value for this attribute. If it's a Defer
+            instance, decoding will happen the first time the "value" property
+            is evaluated.
+        """
+        self.die = die
+        self.name = name
+        self.form = form
+        self.offset = offset
+
+        if isinstance(value, Defer):
+            self._value = None
+            self._value_getter = value
+        else:
+            self._value = value
+            self._value_getter = None
+            self._refine_value()
+
+    @property
+    def value(self):
+        if self._value_getter:
+            self._value = self._value_getter.get()
+            self._value_getter = None
+            self._refine_value()
+        return self._value
+
+    def _refine_value(self):
+        # If we hold a location expression, bind it to this attribute
+        if isinstance(self._value, Exprloc):
+            self._value.attribute = self
+
+    def __repr__(self):
+        label = (self.name if isinstance(self.name, str) else
+                 'Attribute {}'.format(self.name))
+        return '<{} at {:#x}>'.format(label, self.offset)
+
+
+class Exprloc(object):
+    """DWARF location expression."""
+
+    def __init__(self, byte_list, operations):
+        """
+        :param list[int] byte_list: List of bytes that encode this expression.
+        :param list[(str, ...)] operations: List of operations this expression
+            contains. Each operation is a tuple whose first element is the
+            opcode name (DW_OP_...) and whose other elements are operands.
+        """
+        self.attribute = None
+        self.byte_list = byte_list
+        self.operations = operations
+
+    @property
+    def die(self):
+        return self.attribute.die
+
+    @staticmethod
+    def format_operation(operation):
+        opcode = operation[0]
+        operands = operation[1:]
+        return '{}: {}'.format(opcode, ' '.join(operands))
+
+    def matches(self, operations):
+        """Match this list of operations to `operations`.
+
+        :param list[(str, ...)] operations: List of operations to match.
+        :rtype: bool
+        """
+        return self.operations == operations
+
+    def __repr__(self):
+        return '{} ({})'.format(
+            ' '.join(hex(b) for b in self.byte_list),
+            '; '.join(self.format_operation(op) for op in self.operations)
+        )
+
+
+class Defer(object):
+    """Helper to defer a computation."""
+
+    def __init__(self, func):
+        """
+        :param () -> T func: Callback to perform the computation.
+        """
+        self.func = func
+
+    def get(self):
+        """
+        :rtype: T
+        """
+        return self.func()
+
+
+class Matcher(object):
+    """Specification for DIE tree pattern matching."""
+
+    def __init__(self, tag=None, name=None, attrs=None, children=None,
+                 capture=None):
+        """
+        :param None|str tag: If provided, name of the tag that DIEs must match.
+        :param None|str name: If provided, name that DIEs must match (see the
+            DIE.name property).
+        :param attrs: If provided, dictionary that specifies attribute
+            expectations. Keys are attribute names. Values can be:
+
+              * None, so that attribute must be undefined in the DIE;
+              * a value, so that attribute must be defined and the value must
+                match;
+              * a Capture instance, so that the attribute value (or None, if
+                undefined) is captured.
+
+        :param None | list[Matcher|Capture] children: If provided, list of
+            matchers that children must match. A Capture instance matches any
+            DIE and captures it.
+
+        :param str|None capture: If provided, capture the matched DIE under
+            the given name.
+        """
+        self.tag = tag
+        self.name = name
+        self.attrs = attrs
+        self.children = children
+        self.capture_name = capture
+
+    def matches(self, die):
+        """Pattern match the given DIE.
+
+        :param DIE die: DIE to match.
+        :rtype: MatchResult
+        """
+        result = MatchResult()
+        self._match_die(die, result)
+        return result
+
+    def _match_die(self, die, result):
+        """Helper for the "matches" method.
+
+        Return whether DIE could be matched. If not, a message to describe why
+        is recorded in `result`.
+
+        :param DIE die: DIE to match.
+        :param MatchResult result: Holder for the result of the match.
+        :rtype: bool
+        """
+
+        # If asked to, check the DIE tag
+        if self.tag is not None and self.tag != die.tag:
+            result.mismatch_reason = '{} is expected to be a {}'.format(
+                die, self.tag
+            )
+            return False
+
+        # If asked to, check the DIE name
+        if self.name is not None and self.name != die.name:
+            result.mismatch_reason = (
+                '{} is expected to be called "{}"'.format(die,
+                                                          self.name)
+            )
+            return False
+
+        # Check attribute expectations
+        if self.attrs:
+            for n, v in self.attrs.items():
+                if not self._match_attr(die, n, v, result):
+                    return False
+
+        # Check children expectations
+        if self.children is not None:
+
+            # The number of children must match
+            if len(self.children) != len(die.children):
+                result.mismatch_reason = (
+                    '{} has {} children, {} expected'.format(
+                        die, len(die.children), len(self.children)
+                    )
+                )
+                return False
+
+            # Then each child must match the corresponding child matcher
+            for matcher_child, die_child in zip(self.children,
+                                                die.children):
+                # A Capture instance matches anything and captures it
+                if isinstance(matcher_child, Capture):
+                    result.dict[matcher_child.name] = die_child
+
+                elif not matcher_child._match_die(die_child, result):
+                    return False
+
+        # Capture the input DIE if asked to
+        if self.capture_name:
+            result.dict[self.capture_name] = die
+
+        # If no check failed, the DIE matches the pattern
+        return True
+
+    @staticmethod
+    def _match_attr(die, attr_name, attr_value, result):
+        """Helper for the "matches" method.
+
+        Return whether the `attr_name` attribute in DIE matches the
+        `attr_value` expectation. If not, a message to describe why is recorded
+        in `result`.
+
+        :param DIE die: DIE that contains the attribute to match.
+        :param str attr_name: Attribute name.
+        :param attr_value: Attribute expectation. See attrs's description in
+            Matcher.__init__ docstring for possible values.
+        """
+        attr = die.get_attr(attr_name, or_error=False)
+
+        if attr_value is None:
+            # The attribute is expected not to be defined
+            if attr is None:
+                return True
+
+            result.mismatch_reason = (
+                '{} has a {} attribute, none expected'.format(
+                    die, attr_name
+                )
+            )
+            return False
+
+        # A Capture instance matches anything and captures it
+        if isinstance(attr_value, Capture):
+            result.dict[attr_value.name] = attr
+            return True
+
+        # If we reach this point, the attribute is supposed to be defined:
+        # check it is.
+        if attr is None:
+            result.mismatch_reason = (
+                '{} is missing a {} attribute'.format(die, attr_name)
+            )
+            return False
+
+        # Check the value of the attribute matches
+        if isinstance(attr.value, Exprloc):
+            is_matching = attr.value.matches(attr_value)
+        else:
+            is_matching = attr.value == attr_value
+        if not is_matching:
+            result.mismatch_reason = (
+                '{}: {} is {}, expected to be {}'.format(
+                    die, attr_name, attr.value, attr_value
+                )
+            )
+            return False
+
+        # If no check failed, the attribute matches the pattern
+        return True
+
+
+class Capture(object):
+    """Placeholder in Matcher tree patterns.
+
+    This is used to capture specific elements during pattern matching.
+    """
+    def __init__(self, name):
+        """
+        :param str name: Capture name.
+        """
+        self.name = name
+
+
+class MatchResult(object):
+    """Holder for the result of a DIE tree pattern match."""
+
+    def __init__(self):
+        self.dict = {}
+
+        self.mismatch_reason = None
+        """
+        If left to None, the match succeeded. Otherwise, must be set to a string
+        that describes why the match failed.
+
+        :type: None|str
+        """
+
+    @property
+    def succeeded(self):
+        return self.mismatch_reason is None
+
+    def capture(self, name):
+        """Return what has been captured by the `name` capture.
+
+        This is valid iff the match succeeded.
+
+        :param str name: Capture name.
+        """
+        return self.dict[name]
diff --git a/gcc/testsuite/python/dwarfutils/helpers.py b/gcc/testsuite/python/dwarfutils/helpers.py
new file mode 100644
index 00000000000..f5e77896ae6
--- /dev/null
+++ b/gcc/testsuite/python/dwarfutils/helpers.py
@@ -0,0 +1,11 @@ 
+import sys
+
+
+def as_ascii(str_or_byte):
+    """
+    Python 2/3 compatibility helper.
+
+    In Python 2, just return the input. In Python 3, decode the input as ASCII.
+    """
+    return (str_or_byte if sys.version_info.major < 3 else
+            str_or_byte.decode('ascii'))
diff --git a/gcc/testsuite/python/dwarfutils/objdump.py b/gcc/testsuite/python/dwarfutils/objdump.py
new file mode 100644
index 00000000000..52cfc06c03b
--- /dev/null
+++ b/gcc/testsuite/python/dwarfutils/objdump.py
@@ -0,0 +1,338 @@ 
+# Copyright (C) 2017 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# objdump-based DWARF parser
+
+# TODO: for now, this assumes that there is only one compilation unit per
+# object file. Support for several units should be added later if needed.
+
+import re
+import subprocess
+
+import dwarfutils
+from dwarfutils.data import Abbrev, CompilationUnit, Defer, DIE, Exprloc
+from dwarfutils.helpers import as_ascii
+
+
+abbrev_tag_re = re.compile(r'\s+(?P<number>\d+)'
+                           r'\s+(?P<tag>DW_TAG_[a-zA-Z0-9_]+)'
+                           r'\s+\[(?P<has_children>.*)\]')
+attr_re = re.compile(r'\s+(?P<attr>DW_AT(_[a-zA-Z0-9_]+| value: \d+))'
+                     r'\s+(?P<form>DW_FORM(_[a-zA-Z0-9_]+| value: \d+))')
+
+compilation_unit_re = re.compile(r'\s+Compilation Unit @ offset'
+                                 r' (?P<offset>0x[0-9a-f]+):')
+compilation_unit_attr_re = re.compile(r'\s+(?P<name>[A-Z][a-zA-Z ]*):'
+                                      r'\s+(?P<value>.*)')
+die_re = re.compile(r'\s+<(?P<level>\d+)>'
+                    r'<(?P<offset>[0-9a-f]+)>:'
+                    r' Abbrev Number: (?P<abbrev_number>\d+)'
+                    r'( \((?P<tag>DW_TAG_[a-zA-Z0-9_]+)\))?')
+die_attr_re = re.compile(r'\s+<(?P<offset>[0-9a-f]+)>'
+                         r'\s+(?P<attr>DW_AT_[a-zA-Z0-9_]+)'
+                         r'\s*: (?P<value>.*)')
+
+indirect_string_re = re.compile(r'\(indirect string, offset: 0x[0-9a-f]+\):'
+                                r' (?P<value>.*)')
+language_re = re.compile(r'(?P<number>\d+)\s+\((?P<name>.*)\)')
+block_re = re.compile(r'\d+ byte block: (?P<value>[0-9a-f ]+)')
+loc_expr_re = re.compile(r'\d+ byte block:'
+                         r' (?P<bytes>[0-9a-f ]+)'
+                         r'\s+\((?P<expr>.*)\)')
+
+
+def parse_dwarf(object_file):
+    """
+    Implementation of dwarfutils.parse_dwarf for objdump.
+
+    Run objdump on `object_file` and parse the list of compilation units it
+    contains.
+
+    :param str object_file: Name of the object file to process.
+    :rtype: list[CompilationUnit]
+    """
+    abbrevs = parse_abbrevs(object_file)
+
+    lines = [as_ascii(line).rstrip()
+             for line in subprocess.check_output(
+                 [dwarfutils.DWARF_DUMP_TOOL, '--dwarf=info', object_file]
+             ).splitlines()
+             if line.strip()]
+    i = [0]
+    def next_line():
+        if i[0] >= len(lines):
+            return None
+        i[0] += 1
+        return lines[i[0] - 1]
+
+    result = []
+    die_stack = []
+    last_die = None
+
+    while True:
+        line = next_line()
+        if line is None:
+            break
+
+        # Try to match the beginning of a compilation unit
+        m = compilation_unit_re.match(line)
+        if m:
+            offset = int(m.group('offset'), 16)
+
+            attrs = {}
+            while True:
+                m = compilation_unit_attr_re.match(next_line())
+                if not m:
+                    i[0] -= 1
+                    break
+                attrs[m.group('name')] = m.group('value')
+
+            length, is_32bit = attrs['Length'].split()
+            length = int(length, 16)
+            is_32bit = is_32bit == '(32-bit)'
+
+            version = int(attrs['Version'])
+            abbrev_offset = int(attrs['Abbrev Offset'], 16)
+            pointer_size = int(attrs['Pointer Size'])
+
+            assert abbrev_offset == 0, ('Multiple compilation units are not'
+                                        ' handled for now')
+            abbrevs_sublist = list(abbrevs)
+
+            result.append(CompilationUnit(offset, length, is_32bit, version,
+                                          abbrevs_sublist, pointer_size))
+            continue
+
+        # Try to match the beginning of a DIE
+        m = die_re.match(line)
+        if m:
+            assert result, 'Invalid DIE: missing containing compilation unit'
+            cu = result[-1]
+
+            level = int(m.group('level'))
+            offset = int(m.group('offset'), 16)
+            abbrev_number = int(m.group('abbrev_number'))
+            tag = m.group('tag')
+
+            assert level == len(die_stack)
+
+            # The end of child list is represented as a special DIE with
+            # abbreviation number 0.
+            if tag is None:
+                assert abbrev_number == 0
+                die_stack.pop()
+                continue
+
+            die = DIE(cu, level, offset, abbrev_number)
+            last_die = die
+            assert die.tag == tag, 'Unexpected tag for {}: got {}'.format(
+                die, tag
+            )
+            if die_stack:
+                die_stack[-1].add_child(die)
+            else:
+                cu.set_root(die)
+            if die.has_children:
+                die_stack.append(die)
+            continue
+
+        # Try to match an attribute
+        m = die_attr_re.match(line)
+        if m:
+            assert last_die, 'Invalid attribute: missing containing DIE'
+            die = last_die
+
+            offset = int(m.group('offset'), 16)
+            name = m.group('attr')
+            value = m.group('value')
+
+            form = die.next_attribute_form(name)
+            try:
+                value_decoder = value_decoders[form]
+            except KeyError:
+                pass
+            else:
+                try:
+                    value = value_decoder(die, name, form, offset, value)
+                except ValueError:
+                    print('Error while decoding {} ({}) at {:#x}: {}'.format(
+                        name, form, offset, value
+                    ))
+                    raise
+            die.add_attribute(name, form, offset, value)
+            continue
+
+        # Otherwise, we must be processing "header" text before the dump
+        # itself: just discard it.
+        assert not result, 'Unhandled output: ' + line
+
+    return result
+
+
+def parse_abbrevs(object_file):
+    """
+    Run objdump on `object_file` and parse the list of abbreviations it
+    contains.
+
+    :param str object_file: Name of the object file to process.
+    :rtype: list[Abbrev]
+    """
+    result = []
+
+    for line in subprocess.check_output(
+        [dwarfutils.DWARF_DUMP_TOOL, '--dwarf=abbrev', object_file]
+    ).splitlines():
+        line = as_ascii(line).rstrip()
+        if not line:
+            continue
+
+        # Try to match a new abbreviation
+        m = abbrev_tag_re.match(line)
+        if m:
+            number = int(m.group('number'))
+            tag = m.group('tag')
+            has_children = m.group('has_children')
+            assert has_children in ('has children', 'no children')
+            has_children = has_children == 'has children'
+
+            result.append(Abbrev(number, tag, has_children))
+            continue
+
+        # Try to match an attribute
+        m = attr_re.match(line)
+        if m:
+            assert result, 'Invalid attribute: missing containing abbreviation'
+            name = m.group('attr')
+            form = m.group('form')
+
+            # When objdump finds unknown attribute numbers or unknown form
+            # numbers, it cannot turn them into names.
+            if name.startswith('DW_AT value'):
+                name = int(name.split()[-1])
+            if form.startswith('DW_FORM value'):
+                form = int(form.split()[-1])
+
+            # The (0, 0) couple marks the end of the attribute list
+            if name != 0 or form != 0:
+                result[-1].add_attribute(name, form)
+            continue
+
+        # Otherwise, we must be processing "header" text before the dump
+        # itself: just discard it.
+        assert not result, 'Unhandled output: ' + line
+
+    return result
+
+
+# Decoders for attribute values
+
+def _decode_flag_present(die, name, form, offset, value):
+    return True
+
+
+def _decode_flag(die, name, form, offset, value):
+    return bool(int(value))
+
+
+def _decode_data(die, name, form, offset, value):
+    if name == 'DW_AT_language':
+        m = language_re.match(value)
+        assert m, 'Unhandled language value: {}'.format(value)
+        return m.group('name')
+
+    elif name == 'DW_AT_encoding':
+        m = language_re.match(value)
+        assert m, 'Unhandled encoding value: {}'.format(value)
+        return m.group('name')
+
+    return int(value, 16) if value.startswith('0x') else int(value)
+
+
+def _decode_ref(die, name, form, offset, value):
+    assert value[0] == '<' and value[-1] == '>'
+    offset = int(value[1:-1], 16)
+    return Defer(lambda: die.cu.offset_to_die[offset])
+
+
+def _decode_indirect_string(die, name, form, offset, value):
+    m = indirect_string_re.match(value)
+    assert m, 'Unhandled indirect string: ' + value
+    return m.group('value')
+
+
+def _decode_block(die, name, form, offset, value, no_exprloc=False):
+    if (
+        not no_exprloc and
+        name in ('DW_AT_location', 'DW_AT_data_member_location')
+    ):
+        return _decode_exprloc(die, name, form, offset, value)
+
+    m = block_re.match(value)
+    assert m, 'Unhandled block value: {}'.format(value)
+    return [int(b, 16) for b in m.group('value').split()]
+
+
+def _decode_exprloc(die, name, form, offset, value):
+    m = loc_expr_re.match(value)
+    if not m:
+        # Even though they have the expected DW_FORM_exprloc form, objdump does
+        # not decode some location expressions such as DW_AT_byte_size. In this
+        # case, return a dummy block decoding instead.
+        # TODO: implement raw bytes parsing into expressions instead.
+        return _decode_block(die, name, form, offset, value, no_exprloc=True)
+
+    byte_list = [int(b, 16) for b in m.group('bytes').split()]
+
+    expr = m.group('expr')
+    operations = []
+    for op in expr.split('; '):
+        chunks = op.split(': ', 1)
+        assert len(chunks) <= 2, (
+            'Unhandled DWARF expression operation: {}'.format(op)
+        )
+        opcode = chunks[0]
+        operands = chunks[1].split() if len(chunks) == 2 else []
+        operations.append((opcode, ) + tuple(operands))
+
+    return Exprloc(byte_list, operations)
+
+
+value_decoders = {
+    'DW_FORM_flag_present': _decode_flag_present,
+    'DW_FORM_flag': _decode_flag,
+
+    'DW_FORM_data1': _decode_data,
+    'DW_FORM_data2': _decode_data,
+    'DW_FORM_data4': _decode_data,
+    'DW_FORM_data8': _decode_data,
+    'DW_FORM_sdata': _decode_data,
+    'DW_FORM_udata': _decode_data,
+
+    'DW_FORM_ref4': _decode_ref,
+    'DW_FORM_ref8': _decode_ref,
+
+    'DW_FORM_strp': _decode_indirect_string,
+
+    'DW_FORM_block': _decode_block,
+    'DW_FORM_block1': _decode_block,
+    'DW_FORM_block2': _decode_block,
+    'DW_FORM_block4': _decode_block,
+    'DW_FORM_block8': _decode_block,
+    'DW_FORM_block8': _decode_block,
+    'DW_FORM_exprloc': _decode_exprloc,
+
+    # TODO: handle all existing forms
+}
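
For anyone who wants to try the line classification in parse_dwarf without building an object file, here is a minimal standalone sketch. It copies die_re and die_attr_re from objdump.py above and runs them on a few hand-written lines. The sample lines are an assumption about what `objdump --dwarf=info` prints, not captured output, and the `classify` helper is hypothetical (it is not part of the patch).

```python
import re

# Copied verbatim from objdump.py in this patch.
die_re = re.compile(r'\s+<(?P<level>\d+)>'
                    r'<(?P<offset>[0-9a-f]+)>:'
                    r' Abbrev Number: (?P<abbrev_number>\d+)'
                    r'( \((?P<tag>DW_TAG_[a-zA-Z0-9_]+)\))?')
die_attr_re = re.compile(r'\s+<(?P<offset>[0-9a-f]+)>'
                         r'\s+(?P<attr>DW_AT_[a-zA-Z0-9_]+)'
                         r'\s*: (?P<value>.*)')

# Hand-written sample lines, assumed to resemble objdump output.
sample = [
    'Contents of the .debug_info section:',
    ' <1><2d>: Abbrev Number: 2 (DW_TAG_base_type)',
    '    <2e>   DW_AT_byte_size    : 4',
    ' <1><34>: Abbrev Number: 0',
]


def classify(line):
    """Hypothetical helper: try the DIE pattern first, then the attribute
    pattern, in the same order as parse_dwarf."""
    m = die_re.match(line)
    if m:
        # The tag group is None for the end-of-children marker.
        return ('die', int(m.group('level')), int(m.group('offset'), 16),
                int(m.group('abbrev_number')), m.group('tag'))
    m = die_attr_re.match(line)
    if m:
        return ('attr', int(m.group('offset'), 16), m.group('attr'),
                m.group('value'))
    return ('other', line)


parsed = [classify(line) for line in sample]
for entry in parsed:
    print(entry)
```

The last sample line has no tag in parentheses, which is how parse_dwarf recognizes the abbreviation-0 entry that closes a list of children.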

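
The text-to-operations step in _decode_exprloc can also be exercised on its own. The sketch below reuses loc_expr_re from the patch and the same splitting logic, but returns plain tuples instead of an Exprloc so that it has no dependency on dwarfutils. The function name `decode_exprloc_text` is hypothetical and the input strings are hand-written approximations of objdump's block syntax.

```python
import re

# Copied verbatim from objdump.py in this patch.
loc_expr_re = re.compile(r'\d+ byte block:'
                         r' (?P<bytes>[0-9a-f ]+)'
                         r'\s+\((?P<expr>.*)\)')


def decode_exprloc_text(value):
    """Hypothetical standalone variant of _decode_exprloc.

    Return (byte_list, operations), or None when `value` does not carry a
    decoded expression in parentheses.
    """
    m = loc_expr_re.match(value)
    if not m:
        return None

    byte_list = [int(b, 16) for b in m.group('bytes').split()]

    operations = []
    for op in m.group('expr').split('; '):
        # "DW_OP_fbreg: -20" -> opcode and operands; "DW_OP_deref" -> opcode
        chunks = op.split(': ', 1)
        opcode = chunks[0]
        operands = chunks[1].split() if len(chunks) == 2 else []
        operations.append((opcode, ) + tuple(operands))

    return byte_list, operations


single = decode_exprloc_text('2 byte block: 91 6c \t(DW_OP_fbreg: -20)')
double = decode_exprloc_text(
    '3 byte block: 91 6c 6 \t(DW_OP_fbreg: -20; DW_OP_deref)')
undecoded = decode_exprloc_text('4')
```

Operands stay as strings, which matches what Exprloc.matches compares against, so expected operations in a testcase would be written as ('DW_OP_fbreg', '-20') rather than with an integer.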