[v2,03/18] modules: add qemu-modinfo utility

Message ID 20210610055755.538119-4-kraxel@redhat.com
State New
Series modules: add metadata database

Commit Message

Gerd Hoffmann June 10, 2021, 5:57 a.m. UTC
Scan .modinfo sections of qemu modules,
write module metadata to modinfo.json.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 qemu-modinfo.c | 270 +++++++++++++++++++++++++++++++++++++++++++++++++
 meson.build    |  11 ++
 2 files changed, 281 insertions(+)
 create mode 100644 qemu-modinfo.c

Comments

Gerd Hoffmann June 10, 2021, 1:04 p.m. UTC | #1
Hi Paolo,

> +if config_host.has_key('CONFIG_MODULES')
> +   qemu_modinfo = executable('qemu-modinfo', files('qemu-modinfo.c') + genh,
> +                             dependencies: [glib, qemuutil], install: have_tools)
> +   custom_target('modinfo.json',
> +                 input: [ softmmu_mods, block_mods ],
> +                 output: 'modinfo.json',
> +                 install: true,
> +                 install_dir: qemu_moddir,
> +                 command: [ qemu_modinfo, '.' ])
> +endif

I have trouble with this one.  I tried declaring the modules as "input"
to make sure meson will only run qemu-modinfo when it is done building
the modules.  But now and then I get build errors because qemu-modinfo
runs in parallel with a module build and throws a read error because of
that.

Any clue what is going on here?

thanks,
  Gerd
Daniel P. Berrangé June 10, 2021, 1:13 p.m. UTC | #2
On Thu, Jun 10, 2021 at 03:04:24PM +0200, Gerd Hoffmann wrote:
>   Hi Paolo,
> 
> > +if config_host.has_key('CONFIG_MODULES')
> > +   qemu_modinfo = executable('qemu-modinfo', files('qemu-modinfo.c') + genh,
> > +                             dependencies: [glib, qemuutil], install: have_tools)
> > +   custom_target('modinfo.json',
> > +                 input: [ softmmu_mods, block_mods ],
> > +                 output: 'modinfo.json',
> > +                 install: true,
> > +                 install_dir: qemu_moddir,
> > +                 command: [ qemu_modinfo, '.' ])
> > +endif
> 
> I have trouble with this one.  I tried declaring the modules as "input"
> to make sure meson will only run qemu-modinfo when it is done building
> the modules.  But now and then I get build errors because qemu-modinfo
> runs in parallel with a module build and throws a read error because of
> that.

softmmu_mods and block_mods are both lists already, so this sets a
nested list and I wonder if it confuses meson  ? eg do you need

 input: softmmu_mods + block_mods

Alternatively there is the option to do:

  depends: softmmu_mods + block_mods

though the meson docs claim that's not required if they're
already listed against 'input:'


Regards,
Daniel
Paolo Bonzini June 14, 2021, 8:34 a.m. UTC | #3
On 10/06/21 15:13, Daniel P. Berrangé wrote:
> On Thu, Jun 10, 2021 at 03:04:24PM +0200, Gerd Hoffmann wrote:
>>    Hi Paolo,
>>
>>> +if config_host.has_key('CONFIG_MODULES')
>>> +   qemu_modinfo = executable('qemu-modinfo', files('qemu-modinfo.c') + genh,
>>> +                             dependencies: [glib, qemuutil], install: have_tools)
>>> +   custom_target('modinfo.json',
>>> +                 input: [ softmmu_mods, block_mods ],
>>> +                 output: 'modinfo.json',
>>> +                 install: true,
>>> +                 install_dir: qemu_moddir,
>>> +                 command: [ qemu_modinfo, '.' ])
>>> +endif
>>
>> I have trouble with this one.  I tried declaring the modules as "input"
>> to make sure meson will only run qemu-modinfo when it is done building
>> the modules.  But now and then I get build errors because qemu-modinfo
>> runs in parallel with a module build and throws a read error because of
>> that.
> 
> softmmu_mods and block_mods are both lists already, so this sets a
> nested list and I wonder if it confuses meson  ? eg do you need
> 
>   input: softmmu_mods + block_mods

No, that should be fine.  Perhaps it's because the inputs are not part 
of the command?  You can check what the build.ninja rule for 
modinfo.json looks like.

Also:

- it would be better to support both directories and file names, so that 
stale modules are not included in modinfo.json

- modinfo.json needs to be disabled on non-ELF platforms (x86, Windows). 
  One alternative is to use libbfd instead of including an ELF parser.

- If modinfo.json has to be installed, you need to build modinfo for the 
build machine in order to support cross compiling.  That however would 
require a cross libbfd, which is a pain.  Do you really need to install 
it?  Can the functionality instead be included in the main QEMU binary 
with a query-modules command or something like that?

Paolo

> Alternatively there is the option to do:
>
>    depends: softmmu_mods + block_mods
> 
> though the meson docs claim that's not required if they're
> already listed against 'input:'
Paolo Bonzini June 14, 2021, 2:36 p.m. UTC | #4
On 14/06/21 10:34, Paolo Bonzini wrote:
> - modinfo.json needs to be disabled on non-ELF platforms (x86, Windows). 
>   One alternative is to use libbfd instead of including an ELF parser.
> 
> - If modinfo.json has to be installed, you need to build modinfo for the 
> build machine in order to support cross compiling.  That however would 
> require a cross libbfd, which is a pain.  Do you really need to install 
> it?  Can the functionality instead be included in the main QEMU binary 
> with a query-modules command or something like that?

Another possibility to eschew .o parsing is to add something like this 
to the sources

#ifdef QEMU_MODINFO
#define MODULE_METADATA(key, value) \
    =<>= MODINFO key value
#else
#define MODULE_METADATA(key, value)
#endif

MODULE_METADATA("opts", "spice")

A Python script could parse compile_commands.json, add -E -DQEMU_MODINFO 
to the command-line option, and look in the output for the metadata.
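
As a rough illustration of the last step only, here is a minimal Python
sketch that scans preprocessor output for the marker produced by the
MODULE_METADATA() macro above.  The function name, the regular
expression and the sample text are assumptions made for this example;
running the compiler with -E -DQEMU_MODINFO is left out.

```python
import re

# Matches the expansion of MODULE_METADATA("key", "value") when the source is
# preprocessed with -DQEMU_MODINFO (marker format taken from the macro above).
MARKER = re.compile(r'=<>=\s+MODINFO\s+"([^"]*)"\s+"([^"]*)"')

def extract_metadata(preprocessed):
    """Return the (key, value) pairs found in preprocessor output."""
    return [match.groups() for match in MARKER.finditer(preprocessed)]

# Hypothetical preprocessor output, for illustration only.
sample = '''# 1 "demo.c"
int unrelated;
=<>= MODINFO "opts" "spice"
'''

print(extract_metadata(sample))
```

Because the marker is not valid C, it can only appear in -E output, so a
plain text search is probably enough and no C parser is needed.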

Paolo
Gerd Hoffmann June 14, 2021, 3:01 p.m. UTC | #5
Hi,

> > softmmu_mods and block_mods are both lists already, so this sets a
> > nested list and I wonder if it confuses meson  ? eg do you need
> > 
> >   input: softmmu_mods + block_mods
> 
> No, that should be fine.  Perhaps it's because the inputs are not part of
> the command?  You can check what the build.ninja rule for modinfo.json looks
> like.
> 
> Also:
> 
> - it would be better to support both directories and file names, so that
> stale modules are not included in modinfo.json

When turning qemu-modinfo into a pure build-utility (see below) there is
no reason to not explicitly list all module files on the command line.

> - modinfo.json needs to be disabled on non-ELF platforms (x86, Windows).

On windows modules are not supported.
Do we have any other non-ELF platforms?

> One alternative is to use libbfd instead of including an ELF parser.
> 
> - If modinfo.json has to be installed, you need to build modinfo for the
> build machine in order to support cross compiling.  That however would
> require a cross libbfd, which is a pain.  Do you really need to install it?

Do you mean installing modinfo.json or installing qemu-modinfo?  For the
latter I can see that not installing it removes some cross-compiling
headaches.  And, yes, we can turn this into a pure build utility
generating a static database (be it json or -- as suggested by Daniel --
a C source file with the data structures).

> Can the functionality instead be included in the main QEMU binary with a
> query-modules command or something like that?

Well, the meta-data database is meant for qemu itself, not external
users.  I was just using json because the infrastructure to serialize +
parse json exists.  Not sure a "query-modinfo" command makes sense.
Would be trivial to implement though if libvirt finds this useful
(assuming we stick to json).

I was toying with a completely different idea:  Have a "qemu
-generate-modinfo".  That would basically try to load each module, while
doing so record the type_register() (+ other register) calls the module
is doing, when done write out the database with the registrations done
by each module.

Problem with that approach is that it doesn't work for module
dependencies ...

Comments on the idea?  Suggestions for the module dependency problem?
Could maybe libbfd be used to find module (symbol) dependencies
automatically without writing a full dynamic linker?

take care,
  Gerd
Daniel P. Berrangé June 14, 2021, 3:08 p.m. UTC | #6
On Mon, Jun 14, 2021 at 05:01:59PM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > > softmmu_mods and block_mods are both lists already, so this sets a
> > > nested list and I wonder if it confuses meson  ? eg do you need
> > > 
> > >   input: softmmu_mods + block_mods
> > 
> > No, that should be fine.  Perhaps it's because the inputs are not part of
> > the command?  You can check what the build.ninja rule for modinfo.json looks
> > like.
> > 
> > Also:
> > 
> > - it would be better to support both directories and file names, so that
> > stale modules are not included in modinfo.json
> 
> When turning qemu-modinfo into a pure build-utility (see below) there is
> no reason to not explicitly list all module files on the command line.
> 
> > - modinfo.json needs to be disabled on non-ELF platforms (x86, Windows).
> 
> On windows modules are not supported.

Does anyone recall why modules aren't supported on Windows?

> Do we have any other non-ELF platforms?

macOS uses dylib IIUC ?

> Problem with that approach is that it doesn't work for module
> dependencies ...
> 
> Comments on the idea?  Suggestions for the module dependency problem?
> Could maybe libbfd be used to find module (symbol) dependencies
> automatically without writing a full dynamic linker?

Is there any value in exploring use of libclang ?  It gives us a real
C parser that we can use to extract information from the C source. In
libvirt we have experimental patches (not yet merged) using libclang to
auto-generate XML parser helpers from struct annotations. It is quite
nice compared to any other hacks for extracting information from C
source files without using a proper parser.  libclang can be accessed
from Python3 via its bindings and IIUC should be usable on all our
build platforms


Regards,
Daniel
Gerd Hoffmann June 15, 2021, 4:49 a.m. UTC | #7
Hi,

> Another possibility to eschew .o parsing is to add something like this to
> the sources
> 
> #ifdef QEMU_MODINFO
> #define MODULE_METADATA(key, value) \
>    =<>= MODINFO key value
> #else
> #define MODULE_METADATA(key, value)
> #endif
> 
> MODULE_METADATA("opts", "spice")
> 
> A Python script could parse compile_commands.json, add -E -DQEMU_MODINFO to
> the command-line option, and look in the output for the metadata.

Hmm, worth trying, although I guess it would be easier to code this up
straight in meson.build and pull the information you need out of the
source set, especially as you'll know then which source files are
compiled into which module.

take care,
  Gerd
Gerd Hoffmann June 15, 2021, 4:54 a.m. UTC | #8
> > Problem with that approach is that it doesn't work for module
> > dependencies ...
> > 
> > Comments on the idea?  Suggestions for the module dependency problem?
> > Could maybe libbfd be used to find module (symbol) dependencies
> > automatically without writing a full dynamic linker?
> 
> Is there any value in exploring use of libclang ?  It gives us a real
> C parser that we can use to extract information from the C source. In
> libvirt we have experimental patches (not yet merged) using libclang to
> auto-generate XML parser helpers from struct annotations. It is quite
> nice compared to any other hacks for extracting information from C
> source files without using a proper parser.  libclang can be accessed
> from Python3 via its bindings and IIUC should be usable on all our
> build platforms

Could you do something along the lines of ...

  (1) find constructors
  (2) find type_register() calls in the constructor and the
      TypeInfo structs passed to those calls.
  (3) inspect the TypeInfo structs to figure the QOM type names.

... with libclang?

take care,
  Gerd
Gerd Hoffmann June 15, 2021, 7:56 a.m. UTC | #9
On Tue, Jun 15, 2021 at 06:49:15AM +0200, Gerd Hoffmann wrote:
>   Hi,
> 
> > Another possibility to eschew .o parsing is to add something like this to
> > the sources
> > 
> > #ifdef QEMU_MODINFO
> > #define MODULE_METADATA(key, value) \
> >    =<>= MODINFO key value
> > #else
> > #define MODULE_METADATA(key, value)
> > #endif
> > 
> > MODULE_METADATA("opts", "spice")
> > 
> > A Python script could parse compile_commands.json, add -E -DQEMU_MODINFO to
> > the command-line option, and look in the output for the metadata.
> 
> Hmm, worth trying, although I guess it would be easier to code this up
> straight in meson.build and pull the information you need out of the
> source set, especially as you'll know then which source files are
> compiled into which module.

Hmm, looks like I actually need both.  Seems there is no easy way to get
the cflags out of a source_set to construct a cpp command line.  Pulling
this out of compile_commands.json with jq works though.

With the patch below I get nice ${module}.modinfo files with the
metadata, now I only need a (probably python) script to collect
them and create a modinfo.c which we can link into qemu.

take care,
  Gerd

From 3edc033935d2dd4ec607ac6395548a327151ad06 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Tue, 15 Jun 2021 09:23:52 +0200
Subject: [PATCH] try -DQEMU_MODINFO

---
 include/qemu/module.h | 22 ++++++----------------
 meson.build           |  7 +++++++
 scripts/modinfo.sh    |  8 ++++++++
 3 files changed, 21 insertions(+), 16 deletions(-)
 create mode 100644 scripts/modinfo.sh

diff --git a/include/qemu/module.h b/include/qemu/module.h
index 7825f6d8c847..5acfa423dc4f 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -74,22 +74,12 @@ void module_load_qom_one(const char *type);
 void module_load_qom_all(void);
 void module_allow_arch(const char *arch);
 
-/*
- * macros to store module metadata in a .modinfo section.
- * qemu-modinfo utility will collect the metadata.
- *
- * Use "objdump -t -s -j .modinfo ${module}.so" to inspect.
- */
-
-#define ___PASTE(a, b) a##b
-#define __PASTE(a, b) ___PASTE(a, b)
-
-#define modinfo(kind, value)                             \
-    static const char __PASTE(kind, __LINE__)[]          \
-        __attribute__((__used__))                        \
-        __attribute__((section(".modinfo")))             \
-        __attribute__((aligned(1)))                      \
-        = stringify(kind) "=" value
+#ifdef QEMU_MODINFO
+# define modinfo(kind, value) \
+    MODINFO_START kind value MODINFO_END
+#else
+# define modinfo(kind, value)
+#endif
 
 #define module_obj(name) modinfo(obj, name)
 #define module_dep(name) modinfo(dep, name)
diff --git a/meson.build b/meson.build
index 46ebc07dbb67..d8661755adf9 100644
--- a/meson.build
+++ b/meson.build
@@ -2050,12 +2050,19 @@ target_modules += { 'accel' : { 'qtest': qtest_module_ss,
 
 block_mods = []
 softmmu_mods = []
+modinfo = find_program('scripts/modinfo.sh')
 foreach d, list : modules
   foreach m, module_ss : list
     if enable_modules and targetos != 'windows'
       module_ss = module_ss.apply(config_all, strict: false)
       sl = static_library(d + '-' + m, [genh, module_ss.sources()],
                           dependencies: [modulecommon, module_ss.dependencies()], pic: true)
+      custom_target(d + '-' + m + '.modinfo',
+                    output: d + '-' + m + '.modinfo',
+                    input: module_ss.sources(),
+                    build_by_default: true, # to be removed when added to a target
+                    capture: true,
+                    command: [modinfo, '@INPUT@'])
       if d == 'block'
         block_mods += sl
       else
diff --git a/scripts/modinfo.sh b/scripts/modinfo.sh
new file mode 100644
index 000000000000..8f4495d4523d
--- /dev/null
+++ b/scripts/modinfo.sh
@@ -0,0 +1,8 @@
+#!/bin/sh
+for input in "$@"; do
+    command=$(jq  -r ".[] | select(.file == \"$input\") | .command " compile_commands.json)
+    command="${command%% -M*}"
+    command="$command -DQEMU_MODINFO -E $input"
+    $command
+done | grep MODINFO
+exit 0
Daniel P. Berrangé June 15, 2021, 9:27 a.m. UTC | #10
On Tue, Jun 15, 2021 at 06:54:41AM +0200, Gerd Hoffmann wrote:
> > > Problem with that approach is that it doesn't work for module
> > > dependencies ...
> > > 
> > > Comments on the idea?  Suggestions for the module dependency problem?
> > > Could maybe libbfd be used to find module (symbol) dependencies
> > > automatically without writing a full dynamic linker?
> > 
> > Is there any value in exploring use of libclang ?  It gives us a real
> > C parser that we can use to extract information from the C source. In
> > libvirt we have experimental patches (not yet merged) using libclang to
> > auto-generate XML parser helpers from struct annotations. It is quite
> > nice compared to any other hacks for extracting information from C
> > source files without using a proper parser.  libclang can be accessed
> > from Python3 via its bindings and IIUC should be usable on all our
> > build platforms
> 
> Could you do something along the lines of ...
> 
>   (1) find constructors
>   (2) find type_register() calls in the constructor and the
>       TypeInfo structs passed to those calls.
>   (3) inspect the TypeInfo structs to figure the QOM type names.
> 
> ... with libclang?

In theory that should all be doable. I'm not very familiar myself with
libclang, but IIUC you basically get given the abstract syntax tree
and have to traverse it to find the info you want. This is kind of
low level but the info should all be there if you know how to find
it.

As an answer to (1) and part of (2), the following code, which I hacked
up quickly, finds all constructors that contain "type_register" calls.
It would still need to find the arg to the calls and match that up to
the static structs too.


from clang.cindex import Index, CursorKind

def is_constructor(cursor):
    for bit in cursor.get_children():
        if bit.kind == CursorKind.UNEXPOSED_ATTR:
            for tok in bit.get_tokens():
                if tok.spelling == "constructor":
                    return True
    return False

def find_constructors(cursor):
    for cursor in cursor.get_children():
        if cursor.kind == CursorKind.FUNCTION_DECL:
            if is_constructor(cursor):
                yield cursor

def has_type_register(cursor):
    # Walk the function passed in, not the global "constructor" loop variable.
    for child in cursor.get_children():
        if child.kind == CursorKind.COMPOUND_STMT:
            for c in child.get_children():
                if c.kind == CursorKind.CALL_EXPR:
                    if c.displayname == "type_register":
                        return True
    return False
                
index = Index.create()
tu = index.parse("demo.c")
for constructor in find_constructors(tu.cursor):
    has_reg = has_type_register(constructor)
    if has_reg:
        print("Constructor with type_register: " + constructor.displayname)


I tested with a short example

#include <stdio.h>

struct Foo {
  int bar;
};

static void type_register(struct Foo *foo) {
  printf("%d\n", foo->bar);
}
  
__attribute__((constructor)) static void startit(void) 
{
  static struct Foo foo = { 42 };
  type_register(&foo);
}

int main(int argc, char **argv) {
  printf("Running main\n");
}


$ python demo.py
Constructor with type_register: startit()


Regards,
Daniel
Gerd Hoffmann June 15, 2021, 1:07 p.m. UTC | #11
> > > A Python script could parse compile_commands.json, add -E -DQEMU_MODINFO to
> > > the command-line option, and look in the output for the metadata.
> > 
> > Hmm, worth trying, although I guess it would be easier to code this up
> > straight in meson.build and pull the information you need out of the
> > source set, especially as you'll know then which source files are
> > compiled into which module.
> 
> Hmm, looks like I actually need both.  Seems there is no easy way to get
> the cflags out of a source_set to construct a cpp command line.  Pulling
> this out of compile_commands.json with jq works though.

Well, easy until I look at target-specific modules where the
"source file" -> "command" mapping isn't unique any more.  Which makes
this route less attractive ...
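
To make the ambiguity concrete, here is a small Python sketch that
reports which source files have more than one entry in a
compile_commands.json style list.  The entries and file names are made
up for illustration; only the "file" and "command" keys are taken from
the thread.

```python
from collections import defaultdict

def ambiguous_sources(compile_commands):
    """Map each source file with more than one compile command to its commands."""
    by_file = defaultdict(list)
    for entry in compile_commands:
        by_file[entry["file"]].append(entry["command"])
    return {src: cmds for src, cmds in by_file.items() if len(cmds) > 1}

# Hypothetical entries, shaped like compile_commands.json records.
entries = [
    {"file": "accel/demo.c",
     "command": 'cc -DCONFIG_TARGET="x86_64-softmmu-config-target.h" -c accel/demo.c'},
    {"file": "accel/demo.c",
     "command": 'cc -DCONFIG_TARGET="i386-softmmu-config-target.h" -c accel/demo.c'},
    {"file": "block/demo.c", "command": "cc -c block/demo.c"},
]

for src, cmds in ambiguous_sources(entries).items():
    print(src, len(cmds))
```

A source file that is built once per target shows up with several
commands, so selecting by "file" alone picks an arbitrary one.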

Any idea on getting the cflags in meson.build?  Or maybe I can somehow
ask meson to run the cpp pass only for a given source set?

thanks,
  Gerd
Paolo Bonzini June 15, 2021, 1:35 p.m. UTC | #12
On 15/06/21 15:07, Gerd Hoffmann wrote:
>> Hmm, looks like I actually need both.  Seems there is no easy way to get
>> the cflags out of a source_set to construct a cpp command line.  Pulling
>> this out of compile_commands.json with jq works though.
> Well, easy until I look at target-specific modules where the
> "source file" -> "command" mapping isn't unique any more.  Which makes
> this route less attractive ...

I was almost giving up... but it looks like the result of 
extract_all_objects(recursive: true) can be passed to custom_target(). 
Then you can match it after compile_commands.json's "output" key.

Paolo

> Any idea on getting the cflags in meson.build?  Or maybe I can somehow
> ask meson to run the cpp pass only for a given source set?
Gerd Hoffmann June 16, 2021, 9:16 a.m. UTC | #13
On Tue, Jun 15, 2021 at 03:35:44PM +0200, Paolo Bonzini wrote:
> On 15/06/21 15:07, Gerd Hoffmann wrote:
> > > Hmm, looks like I actually need both.  Seems there is no easy way to get
> > > the cflags out of a source_set to construct a cpp command line.  Pulling
> > > this out of compile_commands.json with jq works though.
> > Well, easy until I look at target-specific modules where the
> > "source file" -> "command" mapping isn't unique any more.  Which makes
> > this route less attractive ...
> 
> I was almost giving up... but it looks like the result of
> extract_all_objects(recursive: true) can be passed to custom_target(). Then
> you can match it after compile_commands.json's "output" key.

Seems the custom_target commands do not land in compile_commands.json.

But I have figured meanwhile that looking for the target name in the
command line works reliably.  That will match
-DCONFIG_TARGET="${target}-config-target.h".

Current WIP patch below, seems to work nicely.  Whole patch series needs
an overhaul now ...

From 70c96336e38f1a7f114aee2c7ef023546cc560e5 Mon Sep 17 00:00:00 2001
From: Gerd Hoffmann <kraxel@redhat.com>
Date: Tue, 15 Jun 2021 09:23:52 +0200
Subject: [PATCH] try -DQEMU_MODINFO

---
 scripts/modinfo-collect.py  | 66 +++++++++++++++++++++++++++++++++++++
 scripts/modinfo-generate.py | 61 ++++++++++++++++++++++++++++++++++
 include/qemu/module.h       | 33 ++++++++++---------
 meson.build                 | 28 +++++++++++++++-
 4 files changed, 171 insertions(+), 17 deletions(-)
 create mode 100755 scripts/modinfo-collect.py
 create mode 100755 scripts/modinfo-generate.py

diff --git a/scripts/modinfo-collect.py b/scripts/modinfo-collect.py
new file mode 100755
index 000000000000..b804099cfd1e
--- /dev/null
+++ b/scripts/modinfo-collect.py
@@ -0,0 +1,66 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+
+import os
+import sys
+import json
+import shlex
+import subprocess
+
+def find_command(src, target, compile_commands):
+    for command in compile_commands:
+        if command['file'] != src:
+            continue
+        if target != '' and command['command'].find(target) == -1:
+            continue
+        return command['command']
+    return 'false'
+
+def process_command(src, command):
+    skip = False
+    arg = False
+    out = []
+    for item in shlex.split(command):
+        if arg:
+            out.append(item)
+            arg = False
+            continue
+        if skip:
+            skip = False
+            continue
+        if item == '-MF' or item == '-MQ' or item == '-o':
+            skip = True
+            continue
+        if item == '-c':
+            skip = True
+            continue
+        out.append(item)
+    out.append('-DQEMU_MODINFO')
+    out.append('-E')
+    out.append(src)
+    return out
+
+def main(args):
+    target = ''
+    if args[0] == '--target':
+        args.pop(0)
+        target = args.pop(0)
+        print("MODINFO_DEBUG target %s" % target)
+        arch = target[:-8] # cut '-softmmu'
+        print("MODINFO_START arch \"%s\" MODINFO_END" % arch)
+    with open('compile_commands.json') as f:
+        compile_commands = json.load(f)
+    for src in args:
+        print("MODINFO_DEBUG src %s" % src)
+        command = find_command(src, target, compile_commands)
+        cmdline = process_command(src, command)
+        print("MODINFO_DEBUG cmd", cmdline)
+        result = subprocess.run(cmdline, capture_output = True, text = True)
+        if result.returncode != 0:
+            sys.exit(result.returncode)
+        for line in result.stdout.split('\n'):
+            if line.find('MODINFO') != -1:
+                print(line)
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
diff --git a/scripts/modinfo-generate.py b/scripts/modinfo-generate.py
new file mode 100755
index 000000000000..b37d2e8edab9
--- /dev/null
+++ b/scripts/modinfo-generate.py
@@ -0,0 +1,61 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
+
+import os
+import sys
+
+def print_array(name, values):
+    if len(values) == 0:
+        return
+    print("    .%s = ((const char*[]){ %s, NULL })," % (name, ", ".join(values)))
+
+def generate(name, lines):
+    arch = ""
+    objs = []
+    deps = []
+    opts = []
+    for line in lines:
+        if line.startswith("MODINFO_START"):
+            items = line.split()
+            if items[1] == 'obj':
+                objs.append(items[2])
+            elif items[1] == 'dep':
+                deps.append(items[2])
+            elif items[1] == 'opts':
+                opts.append(items[2])
+            elif items[1] == 'arch':
+                arch = items[2];
+            else:
+                print("unknown:", items[1])
+                exit(1)
+
+    print("    .name = \"%s\"," % name)
+    if arch != "":
+        print("    .arch = %s," % arch)
+    print_array("objs", objs)
+    print_array("deps", deps)
+    print_array("opts", opts)
+    print("},{");
+
+def print_pre():
+    print("/* generated by scripts/modinfo-generate.py */")
+    print("#include \"qemu/osdep.h\"")
+    print("#include \"qemu/module.h\"")
+    print("const QemuModinfo qemu_modinfo[] = {{")
+
+def print_post():
+    print("    /* end of list */")
+    print("}};")
+
+def main(args):
+    print_pre()
+    for modinfo in args:
+        with open(modinfo) as f:
+            lines = f.readlines()
+        print("    /* %s */" % modinfo)
+        (basename, ext) = os.path.splitext(modinfo)
+        generate(basename, lines)
+    print_post()
+
+if __name__ == "__main__":
+    main(sys.argv[1:])
diff --git a/include/qemu/module.h b/include/qemu/module.h
index 7825f6d8c847..23e92fff8484 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -74,26 +74,27 @@ void module_load_qom_one(const char *type);
 void module_load_qom_all(void);
 void module_allow_arch(const char *arch);
 
-/*
- * macros to store module metadata in a .modinfo section.
- * qemu-modinfo utility will collect the metadata.
- *
- * Use "objdump -t -s -j .modinfo ${module}.so" to inspect.
- */
-
-#define ___PASTE(a, b) a##b
-#define __PASTE(a, b) ___PASTE(a, b)
-
-#define modinfo(kind, value)                             \
-    static const char __PASTE(kind, __LINE__)[]          \
-        __attribute__((__used__))                        \
-        __attribute__((section(".modinfo")))             \
-        __attribute__((aligned(1)))                      \
-        = stringify(kind) "=" value
+/* scripts/modinfo-collect.py collects module info (using -DQEMU_MODINFO) */
+#ifdef QEMU_MODINFO
+# define modinfo(kind, value) \
+    MODINFO_START kind value MODINFO_END
+#else
+# define modinfo(kind, value)
+#endif
 
 #define module_obj(name) modinfo(obj, name)
 #define module_dep(name) modinfo(dep, name)
 #define module_arch(name) modinfo(arch, name)
 #define module_opts(name) modinfo(opts, name)
 
+typedef struct QemuModinfo QemuModinfo;
+struct QemuModinfo {
+    const char *name;
+    const char *arch;
+    const char **objs;
+    const char **deps;
+    const char **opts;
+};
+extern const QemuModinfo qemu_modinfo[];
+
 #endif
diff --git a/meson.build b/meson.build
index 46ebc07dbb67..f5c7ba979981 100644
--- a/meson.build
+++ b/meson.build
@@ -2048,6 +2048,10 @@ target_modules += { 'accel' : { 'qtest': qtest_module_ss,
 # Library dependencies #
 ########################
 
+modinfo_collect = find_program('scripts/modinfo-collect.py')
+modinfo_generate = find_program('scripts/modinfo-generate.py')
+modinfo_files = []
+
 block_mods = []
 softmmu_mods = []
 foreach d, list : modules
@@ -2056,6 +2060,11 @@ foreach d, list : modules
       module_ss = module_ss.apply(config_all, strict: false)
       sl = static_library(d + '-' + m, [genh, module_ss.sources()],
                           dependencies: [modulecommon, module_ss.dependencies()], pic: true)
+      modinfo_files += custom_target(d + '-' + m + '.modinfo',
+                                     output: d + '-' + m + '.modinfo',
+                                     input: module_ss.sources(),
+                                     capture: true,
+                                     command: [modinfo_collect, '@INPUT@'])
       if d == 'block'
         block_mods += sl
       else
@@ -2084,12 +2093,18 @@ foreach d, list : target_modules
                     '-DCONFIG_DEVICES="@0@-config-devices.h"'.format(target)]
           target_module_ss = module_ss.apply(config_target, strict: false)
           if target_module_ss.sources() != []
-            sl = static_library(d + '-' + m + '-' + config_target['TARGET_NAME'],
+            module_name = d + '-' + m + '-' + config_target['TARGET_NAME']
+            sl = static_library(module_name,
                                 [genh, target_module_ss.sources()],
                                 dependencies: [modulecommon, target_module_ss.dependencies()],
                                 include_directories: target_inc,
                                 c_args: c_args,
                                 pic: true)
+            modinfo_files += custom_target(module_name + '.modinfo',
+                                           output: module_name + '.modinfo',
+                                           input: target_module_ss.sources(),
+                                           capture: true,
+                                           command: [modinfo_collect, '--target', target, '@INPUT@'])
             softmmu_mods += sl
           endif
         endif
@@ -2100,6 +2115,17 @@ foreach d, list : target_modules
   endforeach
 endforeach
 
+if enable_modules
+  modinfo_src = custom_target('modinfo.c',
+                              output: 'modinfo.c',
+                              input: modinfo_files,
+                              command: [modinfo_generate, '@INPUT@'],
+                              capture: true)
+  modinfo_lib = static_library('modinfo', modinfo_src)
+  modinfo_dep = declare_dependency(link_whole: modinfo_lib)
+  softmmu_ss.add(modinfo_dep)
+endif
+
 nm = find_program('nm')
 undefsym = find_program('scripts/undefsym.py')
 block_syms = custom_target('block.syms', output: 'block.syms',
Paolo Bonzini June 16, 2021, 10:53 a.m. UTC | #14
On 16/06/21 11:16, Gerd Hoffmann wrote:
>> I was almost giving up... but it looks like the result of
>> extract_all_objects(recursive: true) can be passed to custom_target(). Then
>> you can match it after compile_commands.json's "output" key.
>
> Seems the custom_target commands do not land in compile_commands.json.

No, they don't.

The idea was expressed a bit too concisely. :)  I was thinking of using 
extract_all_objects on the module static library, passing the result to 
modinfo-collect, and looking up the names in compile_commands.json.

Paolo
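
The lookup step described above can be sketched in C. This is a hedged illustration only: the struct and function names are hypothetical, and it assumes compile_commands.json has already been decoded into an array, with each entry carrying the "output", "file" and "command" keys.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, already-decoded view of one compile_commands.json entry. */
struct compile_entry {
    const char *output;   /* "output" key: object file produced */
    const char *file;     /* "file" key: source file compiled */
    const char *command;  /* "command" key: full compiler command line */
};

/*
 * Return the entry whose "output" equals the given object name
 * (as it would come from extract_all_objects), or NULL if none matches.
 */
static const struct compile_entry *
find_by_output(const struct compile_entry *entries, size_t n,
               const char *object)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(entries[i].output, object) == 0) {
            return &entries[i];
        }
    }
    return NULL;
}
```

Once an entry is found, its "file" and "command" values say which source to scan and with which flags.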

> But I have figured meanwhile that looking for the target name in the
> command line works reliably.  That will match
> -DCONFIG_TARGET="${target}-config-target.h".
> 
> Current WIP patch below, seems to work nicely.  Whole patch series needs
> an overhaul now ...
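
The target-name match mentioned in the quoted text could look roughly like the following C sketch. The function name is hypothetical, and the sketch assumes the command string has already been JSON-decoded, so the quotes around the header name appear unescaped. It shows the matching idea only and is not the actual modinfo-collect script.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true if the compiler command line was issued for the given
 * target, judged by the -DCONFIG_TARGET="<target>-config-target.h" flag.
 */
static bool command_is_for_target(const char *command, const char *target)
{
    char needle[256];
    int n = snprintf(needle, sizeof(needle),
                     "-DCONFIG_TARGET=\"%s-config-target.h\"", target);

    if (n < 0 || (size_t)n >= sizeof(needle)) {
        return false;   /* target name too long for this sketch */
    }
    return strstr(command, needle) != NULL;
}
```

Matching the whole flag rather than the bare target name avoids false hits where the target name merely appears in a path.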
Patch
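
For orientation: the .modinfo section that qemu-modinfo.c reads is a sequence of NUL-terminated "key=value" strings (obj=, dep=, arch=, opts=). The following is a minimal, hedged C sketch of walking such a buffer. The helper name and the sample data in the usage note are hypothetical. Unlike modinfo() in the patch, which exits on an unknown tag, this sketch only counts records for one key.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count the "key=value" records in a buffer of NUL-terminated strings
 * whose key equals the given name.  A final record without a
 * terminating NUL is treated as ending at the end of the buffer.
 */
static size_t count_tags(const char *info, size_t size, const char *key)
{
    size_t keylen = strlen(key);
    size_t pos = 0, count = 0;

    while (pos < size) {
        const char *rec = info + pos;
        const char *end = memchr(rec, '\0', size - pos);
        size_t len = end ? (size_t)(end - rec) : size - pos;

        /* len > keylen guarantees rec[keylen] is inside the record */
        if (len > keylen &&
            memcmp(rec, key, keylen) == 0 &&
            rec[keylen] == '=') {
            count++;
        }
        pos += len + 1;   /* skip the record and its terminator */
    }
    return count;
}
```

With a buffer such as "obj=aaa\0dep=bbb\0obj=ccc" (trailing NUL included in the size), count_tags(buf, size, "obj") counts the two obj= records.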

diff --git a/qemu-modinfo.c b/qemu-modinfo.c
new file mode 100644
index 000000000000..611dbdb00683
--- /dev/null
+++ b/qemu-modinfo.c
@@ -0,0 +1,270 @@ 
+/*
+ * QEMU module parser
+ *
+ * read modules, find modinfo section, parse & store metadata.
+ *
+ * Copyright Red Hat, Inc. 2021
+ *
+ * Authors:
+ *     Gerd Hoffmann <kraxel@redhat.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+#include "elf.h"
+#include <stdint.h>
+#include <dirent.h>
+
+#include "qapi/qapi-types-modules.h"
+#include "qapi/qapi-visit-modules.h"
+#include "qapi/qobject-output-visitor.h"
+#include "qapi/qmp/qjson.h"
+#include "qapi/qmp/qstring.h"
+
+#if INTPTR_MAX == INT32_MAX
+# define Elf_Ehdr Elf32_Ehdr
+# define Elf_Shdr Elf32_Shdr
+# define ELFCLASS ELFCLASS32
+#elif INTPTR_MAX == INT64_MAX
+# define Elf_Ehdr Elf64_Ehdr
+# define Elf_Shdr Elf64_Shdr
+# define ELFCLASS ELFCLASS64
+#else
+# error Huh?  Neither 32-bit nor 64-bit host.
+#endif
+
+static const char *moddir = CONFIG_QEMU_MODDIR;
+static const char *dsosuf = CONFIG_HOST_DSOSUF;
+
+static ModuleInfo *modinfo(const char *module, char *info, size_t size)
+{
+    ModuleInfo *modinfo;
+    strList *sl;
+    size_t pos = 0, len;
+
+    modinfo = g_new0(ModuleInfo, 1);
+    modinfo->name = g_strdup(module);
+
+    if (info) {
+        do {
+            if (strncmp(info + pos, "obj=", 4) == 0) {
+                sl = g_new0(strList, 1);
+                sl->value = g_strdup(info + pos + 4);
+                sl->next = modinfo->objs;
+                modinfo->objs = sl;
+                modinfo->has_objs = true;
+            } else if (strncmp(info + pos, "dep=", 4) == 0) {
+                sl = g_new0(strList, 1);
+                sl->value = g_strdup(info + pos + 4);
+                sl->next = modinfo->deps;
+                modinfo->deps = sl;
+                modinfo->has_deps = true;
+            } else if (strncmp(info + pos, "arch=", 5) == 0) {
+                modinfo->arch = g_strdup(info + pos + 5);
+                modinfo->has_arch = true;
+            } else if (strncmp(info + pos, "opts=", 5) == 0) {
+                modinfo->opts = g_strdup(info + pos + 5);
+                modinfo->has_opts = true;
+            } else {
+                fprintf(stderr, "unknown tag: %s\n", info + pos);
+                exit(1);
+            }
+            len = strlen(info + pos) + 1;
+            pos += len;
+        } while (pos < size);
+    }
+
+    return modinfo;
+}
+
+static void elf_read_section_hdr(FILE *fp, Elf_Ehdr *ehdr,
+                                 int section, Elf_Shdr *shdr)
+{
+    size_t pos, len;
+    int ret;
+
+    pos = ehdr->e_shoff + section * ehdr->e_shentsize;
+    len = MIN(ehdr->e_shentsize, sizeof(*shdr));
+
+    ret = fseek(fp, pos, SEEK_SET);
+    if (ret != 0) {
+        fprintf(stderr, "seek error\n");
+        exit(1);
+    }
+
+    memset(shdr, 0, sizeof(*shdr));
+    ret = fread(shdr, len, 1, fp);
+    if (ret != 1) {
+        fprintf(stderr, "read error\n");
+        exit(1);
+    }
+}
+
+static void *elf_read_section(FILE *fp, Elf_Ehdr *ehdr,
+                              int section, size_t *size)
+{
+    Elf_Shdr shdr;
+    void *data;
+    int ret;
+
+    elf_read_section_hdr(fp, ehdr, section, &shdr);
+    if (shdr.sh_offset && shdr.sh_size) {
+        ret = fseek(fp, shdr.sh_offset, SEEK_SET);
+        if (ret != 0) {
+            fprintf(stderr, "seek error\n");
+            exit(1);
+        }
+
+        data = g_malloc(shdr.sh_size);
+        ret = fread(data, shdr.sh_size, 1, fp);
+        if (ret != 1) {
+            fprintf(stderr, "read error\n");
+            exit(1);
+        }
+        *size = shdr.sh_size;
+    } else {
+        data = NULL;
+        *size = 0;
+    }
+    return data;
+}
+
+static ModuleInfo *elf_parse_module(const char *module,
+                                    const char *filename)
+{
+    Elf_Ehdr ehdr;
+    Elf_Shdr shdr;
+    FILE *fp;
+    int ret, i;
+    char *str;
+    size_t str_size;
+    char *info;
+    size_t info_size = 0;
+
+    fp = fopen(filename, "r");
+    if (NULL == fp) {
+        fprintf(stderr, "open %s: %s\n", filename, strerror(errno));
+        exit(1);
+    }
+
+    ret = fread(&ehdr, sizeof(ehdr), 1, fp);
+    if (ret != 1) {
+        fprintf(stderr, "read error (%s)\n", filename);
+        exit(1);
+    }
+
+    if (ehdr.e_ident[EI_MAG0] != ELFMAG0 ||
+        ehdr.e_ident[EI_MAG1] != ELFMAG1 ||
+        ehdr.e_ident[EI_MAG2] != ELFMAG2 ||
+        ehdr.e_ident[EI_MAG3] != ELFMAG3) {
+        fprintf(stderr, "not an elf file (%s)\n", filename);
+        exit(1);
+    }
+    if (ehdr.e_ident[EI_CLASS] != ELFCLASS) {
+        fprintf(stderr, "elf class mismatch (%s)\n", filename);
+        exit(1);
+    }
+    if (ehdr.e_shoff == 0) {
+        fprintf(stderr, "no section header (%s)\n", filename);
+        exit(1);
+    }
+
+    /* read string table */
+    if (ehdr.e_shstrndx == 0) {
+        fprintf(stderr, "no section strings (%s)\n", filename);
+        exit(1);
+    }
+    str = elf_read_section(fp, &ehdr, ehdr.e_shstrndx, &str_size);
+    if (NULL == str) {
+        fprintf(stderr, "no section strings (%s)\n", filename);
+        exit(1);
+    }
+
+    /* find and read modinfo section */
+    info = NULL;
+    for (i = 0; i < ehdr.e_shnum; i++) {
+        elf_read_section_hdr(fp, &ehdr, i, &shdr);
+        if (!shdr.sh_name) {
+            continue;
+        }
+        if (strcmp(str + shdr.sh_name, ".modinfo") == 0) {
+            info = elf_read_section(fp, &ehdr, i, &info_size);
+        }
+    }
+    fclose(fp);
+
+    return modinfo(module, info, info_size);
+}
+
+int main(int argc, char **argv)
+{
+    DIR *dir;
+    FILE *fp;
+    ModuleInfo *modinfo;
+    ModuleInfoList *modlist;
+    Modules *modules;
+    Visitor *v;
+    QObject *obj;
+    Error *errp = NULL;
+    struct dirent *ent;
+    char *ext, *file, *name;
+    GString *gjson;
+    QString *qjson;
+    const char *json;
+
+    if (argc > 1) {
+        moddir = argv[1];
+    }
+
+    dir = opendir(moddir);
+    if (dir == NULL) {
+        fprintf(stderr, "opendir(%s): %s\n", moddir, strerror(errno));
+        exit(1);
+    }
+
+    modules = g_new0(Modules, 1);
+    while (NULL != (ent = readdir(dir))) {
+        ext = strrchr(ent->d_name, '.');
+        if (!ext) {
+            continue;
+        }
+        if (strcmp(ext, dsosuf) != 0) {
+            continue;
+        }
+
+        name = g_strndup(ent->d_name, ext - ent->d_name);
+        file = g_strdup_printf("%s/%s", moddir, ent->d_name);
+        modinfo = elf_parse_module(name, file);
+        g_free(file);
+        g_free(name);
+
+        modlist = g_new0(ModuleInfoList, 1);
+        modlist->value = modinfo;
+        modlist->next = modules->list;
+        modules->list = modlist;
+    }
+    closedir(dir);
+
+    v = qobject_output_visitor_new(&obj);
+    visit_type_Modules(v, NULL, &modules, &errp);
+    visit_complete(v, &obj);
+    visit_free(v);
+
+    gjson = qobject_to_json(obj);
+    qjson = qstring_from_gstring(gjson);
+    json = qstring_get_str(qjson);
+
+    file = g_strdup_printf("%s/modinfo.json", moddir);
+    fp = fopen(file, "w");
+    if (fp == NULL) {
+        fprintf(stderr, "open(%s): %s\n", file, strerror(errno));
+        exit(1);
+    }
+    fprintf(fp, "%s", json);
+    fclose(fp);
+
+    printf("%s written\n", file);
+    g_free(file);
+    return 0;
+}
diff --git a/meson.build b/meson.build
index d2a9ce91f556..9823c5889140 100644
--- a/meson.build
+++ b/meson.build
@@ -2380,6 +2380,17 @@  if xkbcommon.found()
                            dependencies: [qemuutil, xkbcommon], install: have_tools)
 endif
 
+if config_host.has_key('CONFIG_MODULES')
+   qemu_modinfo = executable('qemu-modinfo', files('qemu-modinfo.c') + genh,
+                             dependencies: [glib, qemuutil], install: have_tools)
+   custom_target('modinfo.json',
+                 input: [ softmmu_mods, block_mods ],
+                 output: 'modinfo.json',
+                 install: true,
+                 install_dir: qemu_moddir,
+                 command: [ qemu_modinfo, '.' ])
+endif
+
 if have_tools
   qemu_img = executable('qemu-img', [files('qemu-img.c'), hxdep],
              dependencies: [authz, block, crypto, io, qom, qemuutil], install: true)