
[v4,2/2] meson: Deserialize the man pages and html builds

Message ID 20230503203947.3417-3-farosas@suse.de
State New
Series docs: Speedup docs build

Commit Message

Fabiano Rosas May 3, 2023, 8:39 p.m. UTC
For the documentation builds (man pages & manual), we let Sphinx
decide when to rebuild and use a depfile to know when to trigger the
make target.

We currently use a trick of having the man pages custom_target take as
input the html pages custom_target object, which causes both targets
to be executed if one of the dependencies has changed. However, having
this at the custom_target level means that the two builds are
effectively serialized.

We can eliminate the dependency between the targets by adding a second
depfile for the man pages build, allowing them to be parallelized by
ninja while keeping sphinx in charge of deciding when to rebuild.

Since they can now run in parallel, separate the Sphinx cache
directory of the two builds. We need this not only for data
consistency but also because Sphinx writes builder-dependent
environment information to the cache directory (see notes under
smartquotes_excludes in sphinx docs [1]).

Note that after this patch the commands `make man` and `make html`
only build the specified target. To keep the old behavior of building
both targets, use `make man html` or `make sphinxdocs`.

1- https://www.sphinx-doc.org/en/master/usage/configuration.html

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 docs/meson.build | 36 ++++++++++++++++++++----------------
 1 file changed, 20 insertions(+), 16 deletions(-)
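
Editorial sketch, not part of the patch: the commit message relies on a
depfile so that ninja knows which sources should retrigger the target. The
snippet below is a minimal illustration of a Makefile-style depfile rule
("stamp: source ..."), assuming that format. The function name
write_depfile and the example paths are hypothetical and are not taken from
QEMU's Sphinx extension.

```python
from pathlib import Path
from typing import Iterable


def write_depfile(depfile: Path, stamp: str, sources: Iterable[str]) -> str:
    """Write one Makefile-style rule 'stamp: src1 src2 ...' and return it."""
    # Escape spaces so that each path is parsed as a single token.
    deps = " ".join(s.replace(" ", "\\ ") for s in sorted(sources))
    rule = f"{stamp}: {deps}\n"
    depfile.write_text(rule, encoding="utf-8")
    return rule
```

With one such file per builder (for example manual.dep and man.dep), each
stamp can be re-evaluated on its own, which is what lets the two targets
stop depending on each other.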

Comments

Peter Maydell May 4, 2023, 8:51 a.m. UTC | #1
On Wed, 3 May 2023 at 21:39, Fabiano Rosas <farosas@suse.de> wrote:
>
> For the documentation builds (man pages & manual), we let Sphinx
> decide when to rebuild and use a depfile to know when to trigger the
> make target.
>
> We currently use a trick of having the man pages custom_target take as
> input the html pages custom_target object, which causes both targets
> to be executed if one of the dependencies has changed. However, having
> this at the custom_target level means that the two builds are
> effectively serialized.
>
> We can eliminate the dependency between the targets by adding a second
> depfile for the man pages build, allowing them to be parallelized by
> ninja while keeping sphinx in charge of deciding when to rebuild.
>
> Since they can now run in parallel, separate the Sphinx cache
> directory of the two builds. We need this not only for data
> consistency but also because Sphinx writes builder-dependent
> environment information to the cache directory (see notes under
> smartquotes_excludes in sphinx docs [1]).

The sphinx-build manpage disagrees about that last part.
https://www.sphinx-doc.org/en/master/man/sphinx-build.html
says about -d:
"with this option you can select a different cache directory
 (the doctrees can be shared between all builders)"

If we don't share the cache directory, presumably Sphinx
now ends up parsing all the input files twice, once per
builder, rather than being able to share them?

thanks
-- PMM
Fabiano Rosas May 4, 2023, 12:06 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On Wed, 3 May 2023 at 21:39, Fabiano Rosas <farosas@suse.de> wrote:
>>
>> For the documentation builds (man pages & manual), we let Sphinx
>> decide when to rebuild and use a depfile to know when to trigger the
>> make target.
>>
>> We currently use a trick of having the man pages custom_target take as
>> input the html pages custom_target object, which causes both targets
>> to be executed if one of the dependencies has changed. However, having
>> this at the custom_target level means that the two builds are
>> effectively serialized.
>>
>> We can eliminate the dependency between the targets by adding a second
>> depfile for the man pages build, allowing them to be parallelized by
>> ninja while keeping sphinx in charge of deciding when to rebuild.
>>
>> Since they can now run in parallel, separate the Sphinx cache
>> directory of the two builds. We need this not only for data
>> consistency but also because Sphinx writes builder-dependent
>> environment information to the cache directory (see notes under
>> smartquotes_excludes in sphinx docs [1]).
>
> The sphinx-build manpage disagrees about that last part.
> https://www.sphinx-doc.org/en/master/man/sphinx-build.html
> says about -d:
> "with this option you can select a different cache directory
>  (the doctrees can be shared between all builders)"
>

The issue I had is that sphinx by default uses smart quotes for html
builders, but not for man builders. But whichever builder runs first
gets to set the smartquotes option and that sticks for the next
builder. That causes our man pages to come up with fancy curly quotes
instead of ' which is probably not an issue, but I didn't want to
produce different output from what we already have today.

I ended up conflating the cache directory (-d) with the environment
(-E), so it is possible that we can reuse the cache but not the
environment (where I assume the smartquotes option is stored). Well, I
better go read the sphinx code and figure that out.

> If we don't share the cache directory, presumably Sphinx
> now ends up parsing all the input files twice, once per
> builder, rather than being able to share them?
>

Yes, but having it run in parallel from the ninja level is still
faster. Of course, if we could reuse the cache, this could potentially
be even faster. I'll try to determine if it is really safe to do so.
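
Editorial sketch, not part of the thread: the trade-off discussed above
(separate doctree caches, but both builders running at once) can be
illustrated roughly as follows. It only assumes the sphinx-build options
-b (builder) and -d (doctree directory) already used in the patch. The
helper names and the paths are hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def sphinx_command(builder, source_dir, output_dir, doctree_dir):
    """Return a sphinx-build argv that uses its own doctree cache (-d)."""
    return ["sphinx-build", "-b", builder, "-d", doctree_dir,
            source_dir, output_dir]


def run_in_parallel(commands):
    """Run each argv concurrently and return exit codes in input order."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(lambda argv: subprocess.run(argv).returncode,
                             commands))


# Hypothetical paths mirroring the manual.p / man.p split in the patch.
html_cmd = sphinx_command("html", "docs", "build/docs/manual",
                          "build/docs/manual.p")
man_cmd = sphinx_command("man", "docs", "build/docs", "build/docs/man.p")
```

Calling run_in_parallel([html_cmd, man_cmd]) would start both builders
together. Each one parses the sources into its own cache, so the parsing
work is duplicated but neither process writes into the other's directory.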
Peter Maydell May 4, 2023, 1:14 p.m. UTC | #3
On Thu, 4 May 2023 at 13:06, Fabiano Rosas <farosas@suse.de> wrote:
>
> Peter Maydell <peter.maydell@linaro.org> writes:
>
> > On Wed, 3 May 2023 at 21:39, Fabiano Rosas <farosas@suse.de> wrote:
> >> Since they can now run in parallel, separate the Sphinx cache
> >> directory of the two builds. We need this not only for data
> >> consistency but also because Sphinx writes builder-dependent
> >> environment information to the cache directory (see notes under
> >> smartquotes_excludes in sphinx docs [1]).
> >
> > The sphinx-build manpage disagrees about that last part.
> > https://www.sphinx-doc.org/en/master/man/sphinx-build.html
> > says about -d:
> > "with this option you can select a different cache directory
> >  (the doctrees can be shared between all builders)"
> >
>
> The issue I had is that sphinx by default uses smart quotes for html
> builders, but not for man builders. But whichever builder runs first
> gets to set the smartquotes option and that sticks for the next
> builder. That causes our man pages to come up with fancy curly quotes
> instead of ' which is probably not an issue, but I didn't want to
> produce different output from what we already have today.
>
> I ended up conflating the cache directory (-d) with the environment
> (-E), so it is possible that we can reuse the cache but not the
> environment (where I assume the smartquotes option is stored). Well, I
> better go read the sphinx code and figure that out.
>
> > If we don't share the cache directory, presumably Sphinx
> > now ends up parsing all the input files twice, once per
> > builder, rather than being able to share them?
> >
>
> Yes, but having it run in parallel from the ninja level is still
> faster. Of course, if we could reuse the cache, this could potentially
> be even faster. I'll try to determine if it is really safe to do so.

Yeah, I wouldn't be surprised if we need the caches separate
for concurrency reasons, so this may just be a "commit message
might need tweaking" nit.

-- PMM
Paolo Bonzini May 9, 2023, 12:07 p.m. UTC | #4
On 5/3/23 22:39, Fabiano Rosas wrote:
> For the documentation builds (man pages & manual), we let Sphinx
> decide when to rebuild and use a depfile to know when to trigger the
> make target.
> 
> We currently use a trick of having the man pages custom_target take as
> input the html pages custom_target object, which causes both targets
> to be executed if one of the dependencies has changed. However, having
> this at the custom_target level means that the two builds are
> effectively serialized.
> 
> We can eliminate the dependency between the targets by adding a second
> depfile for the man pages build, allowing them to be parallelized by
> ninja while keeping sphinx in charge of deciding when to rebuild.
> 
> Since they can now run in parallel, separate the Sphinx cache
> directory of the two builds. We need this not only for data
> consistency but also because Sphinx writes builder-dependent
> environment information to the cache directory (see notes under
> smartquotes_excludes in sphinx docs [1]).
> 
> Note that after this patch the commands `make man` and `make html`
> only build the specified target. To keep the old behavior of building
> both targets, use `make man html` or `make sphinxdocs`.
> 
> 1- https://www.sphinx-doc.org/en/master/usage/configuration.html

Unfortunately this breaks CentOS 8, which has an older version of ninja:

ninja: error: build.ninja:16369: multiple outputs aren't (yet?) supported by depslog; bring this up on the mailing list if it affects you

This was fixed in ninja 1.10.0.

Paolo
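
Editorial sketch, not part of the thread: a build script could guard
against this failure by checking the ninja version before relying on a
depfile with multiple outputs. The 1.10 threshold is taken from the
message above. The function name is hypothetical, and the version string
is assumed to be whatever "ninja --version" prints.

```python
import re


def ninja_has_multi_output_depfile(version: str) -> bool:
    """True if the version is at least 1.10, per the report above."""
    match = re.match(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized ninja version: {version!r}")
    # Compare as integers: as plain strings, "1.8.2" sorts after "1.10.0".
    return (int(match.group(1)), int(match.group(2))) >= (1, 10)
```

Comparing numeric tuples rather than raw strings is the point of the
sketch, since a lexicographic comparison would wrongly accept 1.8.x.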
Fabiano Rosas May 22, 2023, 6:17 p.m. UTC | #5
Paolo Bonzini <pbonzini@redhat.com> writes:

> On 5/3/23 22:39, Fabiano Rosas wrote:
>> For the documentation builds (man pages & manual), we let Sphinx
>> decide when to rebuild and use a depfile to know when to trigger the
>> make target.
>> 
>> We currently use a trick of having the man pages custom_target take as
>> input the html pages custom_target object, which causes both targets
>> to be executed if one of the dependencies has changed. However, having
>> this at the custom_target level means that the two builds are
>> effectively serialized.
>> 
>> We can eliminate the dependency between the targets by adding a second
>> depfile for the man pages build, allowing them to be parallelized by
>> ninja while keeping sphinx in charge of deciding when to rebuild.
>> 
>> Since they can now run in parallel, separate the Sphinx cache
>> directory of the two builds. We need this not only for data
>> consistency but also because Sphinx writes builder-dependent
>> environment information to the cache directory (see notes under
>> smartquotes_excludes in sphinx docs [1]).
>> 
>> Note that after this patch the commands `make man` and `make html`
>> only build the specified target. To keep the old behavior of building
>> both targets, use `make man html` or `make sphinxdocs`.
>> 
>> 1- https://www.sphinx-doc.org/en/master/usage/configuration.html

Sorry it took me a while to get back to this, I've been caught in
downstream work.

>
> Unfortunately this breaks CentOS 8, which has an older version of ninja:
>
> ninja: error: build.ninja:16369: multiple outputs aren't (yet?) 
> supported by depslog; bring this up on the mailing list if it affects you
>
> This was fixed in ninja 1.10.0.
>

It looks like it would be easier to just wait until all our supported
build platforms reach this version.

Is this CentOS 8 or CentOS Stream 8? I believe CentOS Stream 8 would
drop from our support matrix at the end of this year. And CentOS 8
should have already been dropped, no? Stream 9 was released in 2021,
unless we do not count Stream as a new version over plain CentOS.

For the dates and versions, I'm looking at:
https://en.wikipedia.org/wiki/CentOS
https://repology.org/project/ninja/versions

Patch

diff --git a/docs/meson.build b/docs/meson.build
index 6d0986579e..858e737431 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -42,7 +42,9 @@  if sphinx_build.found()
 endif
 
 if build_docs
-  SPHINX_ARGS += ['-Dversion=' + meson.project_version(), '-Drelease=' + get_option('pkgversion')]
+  SPHINX_ARGS += ['-Dversion=' + meson.project_version(),
+                  '-Drelease=' + get_option('pkgversion'),
+                  '-Ddepfile=@DEPFILE@', '-Ddepfile_stamp=@OUTPUT0@']
 
   man_pages = {
         'qemu-ga.8': (have_ga ? 'man8' : ''),
@@ -61,41 +63,43 @@  if build_docs
   }
 
   sphinxdocs = []
-  sphinxmans = []
 
   private_dir = meson.current_build_dir() / 'manual.p'
   output_dir = meson.current_build_dir() / 'manual'
   input_dir = meson.current_source_dir()
 
-  this_manual = custom_target('QEMU manual',
+  manual = custom_target('QEMU manual',
                 build_by_default: build_docs,
-                output: 'docs.stamp',
+                output: 'manual.stamp',
                 input: files('conf.py'),
-                depfile: 'docs.d',
-                command: [SPHINX_ARGS, '-Ddepfile=@DEPFILE@',
-                          '-Ddepfile_stamp=@OUTPUT0@',
-                          '-b', 'html', '-d', private_dir,
+                depfile: 'manual.dep',
+                command: [SPHINX_ARGS, '-b', 'html', '-d', private_dir,
                           input_dir, output_dir])
-  sphinxdocs += this_manual
+  sphinxdocs += manual
   install_subdir(output_dir, install_dir: qemu_docdir, strip_directory: true)
 
-  these_man_pages = []
-  install_dirs = []
+  man_private_dir = meson.current_build_dir() / 'man.p'
+  # man.stamp is not installed
+  these_man_pages = ['man.stamp']
+  install_dirs = [false]
+
   foreach page, section : man_pages
     these_man_pages += page
     install_dirs += section == '' ? false : get_option('mandir') / section
   endforeach
 
-  sphinxmans += custom_target('QEMU man pages',
+
+  man_pages = custom_target('QEMU man pages',
                               build_by_default: build_docs,
                               output: these_man_pages,
-                              input: this_manual,
+                              depfile: 'man.dep',
                               install: build_docs,
                               install_dir: install_dirs,
-                              command: [SPHINX_ARGS, '-b', 'man', '-d', private_dir,
+                              command: [SPHINX_ARGS, '-b', 'man', '-d', man_private_dir,
                                         input_dir, meson.current_build_dir()])
+  sphinxdocs += man_pages
 
   alias_target('sphinxdocs', sphinxdocs)
-  alias_target('html', sphinxdocs)
-  alias_target('man', sphinxmans)
+  alias_target('html', manual)
+  alias_target('man', man_pages)
 endif