Message ID | 20200206173040.17337-20-peter.maydell@linaro.org |
---|---|
State | New |
Headers | show |
Series | Convert QAPI doc comments to generate rST instead of texinfo | expand |
Peter Maydell <peter.maydell@linaro.org> writes: > Our current QAPI doc-comment markup allows section headers > (introduced with a leading '=' or '==') anywhere in any documentation > comment. This works for texinfo because the texi generator simply > prints a texinfo heading directive at that point in the output > stream. For rST generation, since we're assembling a tree of > docutils nodes, this is awkward because a new section implies > starting a new section node at the top level of the tree and > generating text into there. > > New section headings in the middle of the documentation of a command > or event would be pretty nonsensical, and in fact we only ever output > new headings using 'freeform' doc comment blocks whose only content > is the single line of the heading, with two exceptions, which are in > the introductory freeform-doc-block at the top of > qapi/qapi-schema.json. > > Split that doc-comment up so that the heading lines are in their own > doc-comment. This will allow us to tighten the specification to > insist that heading lines are always standalone, rather than > requiring the rST document generator to look at every line in a doc > comment block and handle headings in odd places. > > This change makes no difference to the generated texi. > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > qapi/qapi-schema.json | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json > index 9751b11f8f1..f7ba60a5d0b 100644 > --- a/qapi/qapi-schema.json > +++ b/qapi/qapi-schema.json > @@ -1,7 +1,9 @@ > # -*- Mode: Python -*- > ## > # = Introduction > -# > +## > + > +## > # This document describes all commands currently supported by QMP. > # > # Most of the time their usage is exactly the same as in the user Monitor, this > @@ -25,9 +27,13 @@ > # > # Please, refer to the QMP specification (docs/interop/qmp-spec.txt) for > # detailed information on the Server command and response formats. > -# > +## > + > +## > # = Stability Considerations > -# > +## > + > +## > # The current QMP command set (described in this file) may be useful for a > # number of use cases, however it's limited and several commands have bad > # defined semantics, specially with regard to command completion. I figure this is a minimally invasive patch to avoid complications in your rST generator. I'm afraid it sweeps the actual problem under the rug, namely flaws in our parsing and representation of doc comments. The doc comment parser doesn't recognize headings. Instead, that's done somewhere in the bowels of the Texinfo generator. Works as long as the input is "sane", happily generates invalid Texinfo otherwise, see tests/qapi-schema/doc-bad-section.json. The proper fix is to make the parser recognize headers in the places where headers make sense, and reject them elsewhere. But maybe we don't have to. Do you plan to support full rST in doc comments? If yes, why have our own syntax for headings? Why not leave it to rST? If no, do you plan to support a subset of rST? If yes, define it, please. How will it be enforced?
On Fri, 7 Feb 2020 at 15:35, Markus Armbruster <armbru@redhat.com> wrote: > > Peter Maydell <peter.maydell@linaro.org> writes: > > > Our current QAPI doc-comment markup allows section headers > > (introduced with a leading '=' or '==') anywhere in any documentation > > comment. This works for texinfo because the texi generator simply > > prints a texinfo heading directive at that point in the output > > stream. For rST generation, since we're assembling a tree of > > docutils nodes, this is awkward because a new section implies > > starting a new section node at the top level of the tree and > > generating text into there. > > > > New section headings in the middle of the documentation of a command > > or event would be pretty nonsensical, and in fact we only ever output > > new headings using 'freeform' doc comment blocks whose only content > > is the single line of the heading, with two exceptions, which are in > > the introductory freeform-doc-block at the top of > > qapi/qapi-schema.json. > > > > Split that doc-comment up so that the heading lines are in their own > > doc-comment. This will allow us to tighten the specification to > > insist that heading lines are always standalone, rather than > > requiring the rST document generator to look at every line in a doc > > comment block and handle headings in odd places. > I figure this is a minimally invasive patch to avoid complications in > your rST generator. I'm afraid it sweeps the actual problem under the > rug, namely flaws in our parsing and representation of doc comments. > > The doc comment parser doesn't recognize headings. Instead, that's done > somewhere in the bowels of the Texinfo generator. Works as long as the > input is "sane", happily generates invalid Texinfo otherwise, see > tests/qapi-schema/doc-bad-section.json. > > The proper fix is to make the parser recognize headers in the places > where headers make sense, and reject them elsewhere. > > But maybe we don't have to. Do you plan to support full rST in doc > comments? If yes, why have our own syntax for headings? Why not leave > it to rST? If no, do you plan to support a subset of rST? If yes, > define it, please. How will it be enforced? Doc comments do support full rST. However, (as the commit message here notes), if you're generating a tree of docutils nodes and one of them has a section heading in it then you'll get a result that looks like this: [root] - [ some section created by the script for a QAPI command ] - [ some section ] - [text nodes, etc going into this section] - [a section resulting from rST parsing the header inside the docstring] - [ next section created by the script for a QAPI command ] (ie you'll have defined a subsection within whatever document paragraph/section the current command is documenting, not a new top-level subsection which subsequent commands will become children of) What you actually want is that the new header results in a differently structured tree: [root] - [ some section created by the script for a QAPI command ] - [ some section ] - [text nodes, etc going into this section] - [ a new top level section whose header is whatever this header is ] - [ next section created by the script is a child of that section ] - [ etc ] There's no way to get that without actually noticing and handling headings specially as being something entirely different from a lump of documentation text. "A heading is a single-line special-case of a freeform comment" happens to be the way we mark up headings now in 99% of cases, so that's what I implemented. (The Sphinx extension will complain if there's trailing junk lines after a heading line at the beginning of a freeform comment block. If you use '== something' in a line in the middle of a doc comment, we'll just interpret that as rST source, which is to say a couple of literal equals signs at the start of a line.) thanks -- PMM
Peter Maydell <peter.maydell@linaro.org> writes: > On Fri, 7 Feb 2020 at 15:35, Markus Armbruster <armbru@redhat.com> wrote: >> >> Peter Maydell <peter.maydell@linaro.org> writes: >> >> > Our current QAPI doc-comment markup allows section headers >> > (introduced with a leading '=' or '==') anywhere in any documentation >> > comment. This works for texinfo because the texi generator simply >> > prints a texinfo heading directive at that point in the output >> > stream. For rST generation, since we're assembling a tree of >> > docutils nodes, this is awkward because a new section implies >> > starting a new section node at the top level of the tree and >> > generating text into there. >> > >> > New section headings in the middle of the documentation of a command >> > or event would be pretty nonsensical, and in fact we only ever output >> > new headings using 'freeform' doc comment blocks whose only content >> > is the single line of the heading, with two exceptions, which are in >> > the introductory freeform-doc-block at the top of >> > qapi/qapi-schema.json. >> > >> > Split that doc-comment up so that the heading lines are in their own >> > doc-comment. This will allow us to tighten the specification to >> > insist that heading lines are always standalone, rather than >> > requiring the rST document generator to look at every line in a doc >> > comment block and handle headings in odd places. > >> I figure this is a minimally invasive patch to avoid complications in >> your rST generator. I'm afraid it sweeps the actual problem under the >> rug, namely flaws in our parsing and representation of doc comments. >> >> The doc comment parser doesn't recognize headings. Instead, that's done >> somewhere in the bowels of the Texinfo generator. Works as long as the >> input is "sane", happily generates invalid Texinfo otherwise, see >> tests/qapi-schema/doc-bad-section.json. >> >> The proper fix is to make the parser recognize headers in the places >> where headers make sense, and reject them elsewhere. >> >> But maybe we don't have to. Do you plan to support full rST in doc >> comments? If yes, why have our own syntax for headings? Why not leave >> it to rST? If no, do you plan to support a subset of rST? If yes, >> define it, please. How will it be enforced? > > Doc comments do support full rST. However, (as the commit message > here notes), if you're generating a tree of docutils nodes and > one of them has a section heading in it then you'll get a result > that looks like this: > > [root] > - [ some section created by the script for a QAPI command ] > - [ some section ] > - [text nodes, etc going into this section] > - [a section resulting from rST parsing the header inside the docstring] > - [ next section created by the script for a QAPI command ] > > (ie you'll have defined a subsection within whatever document > paragraph/section the current command is documenting, not > a new top-level subsection which subsequent commands will > become children of) > > What you actually want is that the new header results in > a differently structured tree: > [root] > - [ some section created by the script for a QAPI command ] > - [ some section ] > - [text nodes, etc going into this section] > - [ a new top level section whose header is whatever this header is ] > - [ next section created by the script is a child of that section ] > - [ etc ] > > There's no way to get that without actually noticing and handling > headings specially as being something entirely different from > a lump of documentation text. "A heading is a single-line special-case > of a freeform comment" happens to be the way we mark up headings > now in 99% of cases, so that's what I implemented. (The Sphinx > extension will complain if there's trailing junk lines after > a heading line at the beginning of a freeform comment block. > If you use '== something' in a line in the middle of a doc > comment, we'll just interpret that as rST source, which is to > say a couple of literal equals signs at the start of a line.) A couple of remarks. Silently passing a "# == something" line to rST for literal (mis-)interpretation is not nice. It's the kind of indifference that led to the messes you cleaned up in PATCH 04 and 08. If the '=' markup is only valid in certain places, it should be rejected where it isn't. By refusing to translate "# == something" to rST (silently or loudly, doesn't matter), the first tree structure becomes impossible. Except when I do the translating myself: I can put an *rST* section wherever I want. I'm still having difficulties understanding what exactly we gain by translating '=' markup to rST. By the way, your implementation rejects ## # = Introduction # xxx ## but silently accepts ## # xxx # = Introduction ## doc-good.json has more instances of this issue. Before your series, we actually check we generate the Texinfo we expect for it. I can't find where you cover this now. It has saved me from my screwups more than once, so I don't want to lose that. Now let's put my doubts and your possible bugs / omissions aside and assume we want '=' markup, and we want to keep the resulting sections out of "sections created by the script for a QAPI command". A schema's documentation is a sequence of comment blocks. Each comment block is either a definition comment block or a free form comment block. Before your series, we recognize '=' markup everywhere, but that's basically wrong (see "flaws in our parsing and representation of doc comments" above). It should be accepted only in free-form comment blocks. That way, the free-form comment blocks build a section structure, and the definition comment blocks slot their stuff into this structure. Form a language design perspective, I can't see the need for restricting '=' further to occur only by themselves. Is it an issue of implementation perhaps?
On Sat, 8 Feb 2020 at 14:10, Markus Armbruster <armbru@redhat.com> wrote: > A couple of remarks. > > Silently passing a "# == something" line to rST for literal > (mis-)interpretation is not nice. It's the kind of indifference that > led to the messes you cleaned up in PATCH 04 and 08. If the '=' markup > is only valid in certain places, it should be rejected where it isn't. > > By refusing to translate "# == something" to rST (silently or loudly, > doesn't matter), the first tree structure becomes impossible. Except > when I do the translating myself: I can put an *rST* section wherever I > want. > > I'm still having difficulties understanding what exactly we gain by > translating '=' markup to rST. We don't translate '=' markup to rST. We use it as the way the input document tells us about the structure of the tree of docutils sections it wants us to create. No rST markup is ever involved in that process. If we wanted to do something with '# == something' in actual chunks of doc comment, we would have to specifically scan the doc comment for that. Currently we simply pass doc comments to the rST parser to be parsed. Then you have to document the syntax of a doc comment as "rST, except that as a special case a line starting == doesn't do what it does in normal rST text but is a syntax error". What I would like is: * doc comments are simply rST, and we interpret them as rST by passing them directly to the rST parser. (We do special case the "@var means ``var``" markup, but I would rather keep to a minimum the number of things we special case in that way.) * headings and subheadings affect the entire document structure; they are not rST source and are never interpreted as rST source or sent to the rST parser. They are a thing all of their own, not a bit of markup within a larger doc comment. * not to gratuitously change the way we in practice mark up headings in our document, because that makes the texi->rST transition harder for no clear gain. If we absolutely must mark them up in some other way than we do today, I can implement that, but it feels like unnecessary work. > By the way, your implementation rejects > > ## > # = Introduction > # xxx > ## > > but silently accepts > > ## > # xxx > # = Introduction > ## Yes. I could treat all freeform comments that aren't the special one-line heading/subheading as being normal freeform comments, which would avoid this inconsistency. That's probably better. > doc-good.json has more instances of this issue. Before your series, we > actually check we generate the Texinfo we expect for it. I can't find > where you cover this now. It has saved me from my screwups more than > once, so I don't want to lose that. Yes, we no longer have a test case for "do we generate what we expect to". That's harder to do with a Sphinx extension because we don't ever output rST source anywhere that we could compare to an expected version. (We still have the existing tests that the doc comments spit out by parser.py match expected output.) I'm dubious about running Sphinx and comparing the generated HTML because that seems like it would be vulnerable to test failures if Sphinx internals change the fine detail of how it outputs HTML. > Now let's put my doubts and your possible bugs / omissions aside and > assume we want '=' markup, and we want to keep the resulting sections > out of "sections created by the script for a QAPI command". > > A schema's documentation is a sequence of comment blocks. > > Each comment block is either a definition comment block or a free form > comment block. > > Before your series, we recognize '=' markup everywhere, but that's > basically wrong (see "flaws in our parsing and representation of doc > comments" above). It should be accepted only in free-form comment > blocks. > > That way, the free-form comment blocks build a section structure, and > the definition comment blocks slot their stuff into this structure. > > Form a language design perspective, I can't see the need for restricting > '=' further to occur only by themselves. > > Is it an issue of implementation perhaps? Do you mean: could we make the implementation take a freeform comment block like: ## # = Foo # # Some text # # = Bar # # More text ## and parse through that text to split it up into - "start new level-1 section with heading 'Foo'" - "a freeform comment block '\nSome text\n\n'" (for current section) - "start new level-1 section with heading 'Bar'" - "a freeform comment block '\nMore text\n\n'" (for current section) so that the input rST doesn't need to mark off the headings as being in their own freeform comment block ? Yes, we could do that, and it wouldn't be too hard. My issues with that are: (1) you now don't have a way to literally write '= Foo' to be interpreted the way that rST would interpret it in a comment block that's otherwise just "any rST markup is fine". (2) it falsely suggests that headings are OK in doc comment blocks and are just another kind of markup within a doc comment, when they aren't, and we're really just providing syntactic sugar here (3) it obscures the actual structure of the document, which is [root node] [section] [title 'Foo'] [Text 'Some text'] [section] [title 'Bar'] [Text 'More text'] where "Some text" and "More text" are in entirely different sections, even though they're in the same doc comment block. (If 'Foo' is a subsection and 'Bar' a section, the text can end up even further separated within the document tree.) (4) it breaks a general approach within the handling of doc comments, which is that we build the document based on the various bits of information about the symbol, etc, but a chunk of text from a document comment is handled simply by handing it off to the rST parser, not by doing some script-specific pre-parsing and mangling of it and then handing it to the rST parser. This in turn means that we need to document the syntax of comment blocks as not just "it's rST with the @var-means-``var`` extra" but "it's rST, with @var-means-``var``, and = Foo is invalid most places but within a freeform block means this other thing". I think that minimising the number of extras we add on top of "the syntax for a block of text in a doc comment is rST" makes things less confusing. thanks -- PMM
diff --git a/qapi/qapi-schema.json b/qapi/qapi-schema.json index 9751b11f8f1..f7ba60a5d0b 100644 --- a/qapi/qapi-schema.json +++ b/qapi/qapi-schema.json @@ -1,7 +1,9 @@ # -*- Mode: Python -*- ## # = Introduction -# +## + +## # This document describes all commands currently supported by QMP. # # Most of the time their usage is exactly the same as in the user Monitor, this @@ -25,9 +27,13 @@ # # Please, refer to the QMP specification (docs/interop/qmp-spec.txt) for # detailed information on the Server command and response formats. -# +## + +## # = Stability Considerations -# +## + +## # The current QMP command set (described in this file) may be useful for a # number of use cases, however it's limited and several commands have bad # defined semantics, specially with regard to command completion.
Our current QAPI doc-comment markup allows section headers (introduced with a leading '=' or '==') anywhere in any documentation comment. This works for texinfo because the texi generator simply prints a texinfo heading directive at that point in the output stream. For rST generation, since we're assembling a tree of docutils nodes, this is awkward because a new section implies starting a new section node at the top level of the tree and generating text into there. New section headings in the middle of the documentation of a command or event would be pretty nonsensical, and in fact we only ever output new headings using 'freeform' doc comment blocks whose only content is the single line of the heading, with two exceptions, which are in the introductory freeform-doc-block at the top of qapi/qapi-schema.json. Split that doc-comment up so that the heading lines are in their own doc-comment. This will allow us to tighten the specification to insist that heading lines are always standalone, rather than requiring the rST document generator to look at every line in a doc comment block and handle headings in odd places. This change makes no difference to the generated texi. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- qapi/qapi-schema.json | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)