[0/1] checkpatch: checker for comment block

Message ID	20181213175552.14857-1-wainersm@redhat.com
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> From: Wainer dos Santos Moschetta <wainersm@redhat.com> To: qemu-devel@nongnu.org Date: Thu, 13 Dec 2018 12:55:51 -0500 Message-Id: <20181213175552.14857-1-wainersm@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Subject: [Qemu-devel] [PATCH 0/1] checkpatch: checker for comment block Precedence: list Cc: pbonzini@redhat.com, thuth@redhat.com, ehabkost@redhat.com, stefanha@redhat.com, Wainer dos Santos Moschetta <wainersm@redhat.com> Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>
Series	checkpatch: checker for comment block \| expand [0/1] checkpatch: checker for comment block [1/1] checkpatch: check for malformed comment block.

Wainer dos Santos Moschetta Dec. 13, 2018, 5:55 p.m. UTC

Eduardo Habkost pointed out a malformed block of comments on my
patch [1] that I had ran checkpatch.pl and no warn/error was
reported. Then I realized the script does not catch such as
case (or it had a bug).

It turns out that checkpatch.pl does not parse comment blocks (If I understood
its code correctly...). So I implemented a checker that warns about:
1. block doesn't begin on its own line.
Example:
  /* blah blah
   * and blah blah
   */

2. line in the block doesn't start with asterisk.
Example:
  /*
    foo bar
    bar foo
   */
Note: only the first occurence (i.e 'foo bar') is reported.

3. block doesn't end on its own line.
Example:
  /*
   * blah blah
   * and blah */

References:
[1] https://www.mail-archive.com/qemu-devel@nongnu.org/msg580091.html

ps: last time I wrote Perl code was about 13 years ago. That remember me
of good times. :)

Wainer dos Santos Moschetta (1):
  checkpatch: check for malformed comment block.

 scripts/checkpatch.pl | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

Peter Maydell Dec. 13, 2018, 6:01 p.m. UTC | #1

On Thu, 13 Dec 2018 at 17:57, Wainer dos Santos Moschetta
<wainersm@redhat.com> wrote:
>
> Eduardo Habkost pointed out a malformed block of comments on my
> patch [1] that I had ran checkpatch.pl and no warn/error was
> reported. Then I realized the script does not catch such as
> case (or it had a bug).
>
> It turns out that checkpatch.pl does not parse comment blocks (If I understood
> its code correctly...). So I implemented a checker that warns about:
> 1. block doesn't begin on its own line.
> Example:
>   /* blah blah
>    * and blah blah
>    */

I sent a patch to do this a little while back:
 https://patchwork.kernel.org/patch/10561557/

It didn't get applied because Paolo disagreed with having
our tools enforcing what our style guide says.

Personally I think we should just commit my patch, and then
we can stop having people manually pointing out where
submitters' patches don't match CODING_STYLE.

thanks
-- PMM

Paolo Bonzini Dec. 13, 2018, 6:07 p.m. UTC | #2

On 13/12/18 19:01, Peter Maydell wrote:
> On Thu, 13 Dec 2018 at 17:57, Wainer dos Santos Moschetta
> <wainersm@redhat.com> wrote:
>>
>> Eduardo Habkost pointed out a malformed block of comments on my
>> patch [1] that I had ran checkpatch.pl and no warn/error was
>> reported. Then I realized the script does not catch such as
>> case (or it had a bug).
>>
>> It turns out that checkpatch.pl does not parse comment blocks (If I understood
>> its code correctly...). So I implemented a checker that warns about:
>> 1. block doesn't begin on its own line.
>> Example:
>>   /* blah blah
>>    * and blah blah
>>    */
> 
> I sent a patch to do this a little while back:
>  https://patchwork.kernel.org/patch/10561557/
> 
> It didn't get applied because Paolo disagreed with having
> our tools enforcing what our style guide says.

I didn't disagree with that---I disagreed with having a single style in
the style guide, because unlike most other blatant violations of the
coding style (eg. braces), this one is pervasive in maintained code and
I don't want code that I maintain to mix two comment styles.

So I proposed two alternatives:

- someone fixes all the comment blocks which are "starred" but don't
have a lone "/*" at the beginning, and then we can commit that patch;

- we allow "/* foo" on the first line, except for doc comments and for
the first line of the file (author/license block), and fix the style
guide accordingly.

Paolo

Peter Maydell Dec. 13, 2018, 6:21 p.m. UTC | #3

On Thu, 13 Dec 2018 at 18:07, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 13/12/18 19:01, Peter Maydell wrote:
> > I sent a patch to do this a little while back:
> >  https://patchwork.kernel.org/patch/10561557/
> >
> > It didn't get applied because Paolo disagreed with having
> > our tools enforcing what our style guide says.
>
> I didn't disagree with that---I disagreed with having a single style in
> the style guide, because unlike most other blatant violations of the
> coding style (eg. braces), this one is pervasive in maintained code and
> I don't want code that I maintain to mix two comment styles.
>
> So I proposed two alternatives:
>
> - someone fixes all the comment blocks which are "starred" but don't
> have a lone "/*" at the beginning, and then we can commit that patch;
>
> - we allow "/* foo" on the first line, except for doc comments and for
> the first line of the file (author/license block), and fix the style
> guide accordingly.

We came to a consensus on the comment style when we discussed
the patch which updated CODING_STYLE. I'm not personally
a fan of the result (I used to use "/* foo"), but what we have in the
doc is what we achieved consensus for. I'm not going to reopen
the "what should block comments look like" style debate.

We have always had older code which isn't following the new style,
and our general approach is that we don't do mass-style-fix patches
across the whole codebase. If you as a maintainer of a particular
sub-area want to do a style update you're free to do that.

thanks
-- PMM

Paolo Bonzini Dec. 13, 2018, 10:23 p.m. UTC | #4

On 13/12/18 19:21, Peter Maydell wrote:
> On Thu, 13 Dec 2018 at 18:07, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 13/12/18 19:01, Peter Maydell wrote:
>>> I sent a patch to do this a little while back:
>>>  https://patchwork.kernel.org/patch/10561557/
>>>
>>> It didn't get applied because Paolo disagreed with having
>>> our tools enforcing what our style guide says.
>>
>> I didn't disagree with that---I disagreed with having a single style in
>> the style guide, because unlike most other blatant violations of the
>> coding style (eg. braces), this one is pervasive in maintained code and
>> I don't want code that I maintain to mix two comment styles.
>>
>> So I proposed two alternatives:
>>
>> - someone fixes all the comment blocks which are "starred" but don't
>> have a lone "/*" at the beginning, and then we can commit that patch;
>>
>> - we allow "/* foo" on the first line, except for doc comments and for
>> the first line of the file (author/license block), and fix the style
>> guide accordingly.
> 
> We came to a consensus on the comment style when we discussed
> the patch which updated CODING_STYLE. I'm not personally
> a fan of the result (I used to use "/* foo"), but what we have in the
> doc is what we achieved consensus for. I'm not going to reopen
> the "what should block comments look like" style debate.

Sure, I don't want to do that either.  I accept the result of the
discussion; I don't accept introducing a new warning that will cause
over 700 files to become inconsistent sooner or later.  Therefore, the
only way to enforce the result of the discussion is to change the
existing comments, for example by having a script that maintainers can
use to change the existing comments in their files.  Having each of us
come up with their own script or doing it by hand is probably not a good
use of everyone's time.

Alternatively, fixing the style guide can also mean "explain why /* foo
is allowed by checkpatch even though it does not match the coding
style", without rehashing the discussion.

(BTW it may actually be a good idea to fix _some_ instances of bad
coding style, in particular the space-tab sequences and the files where
there are maybe 2 or 3 tabs that ended up there by mistake.  That's a
different topic).

Paolo

Markus Armbruster Dec. 14, 2018, 6:29 a.m. UTC | #5

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 13/12/18 19:21, Peter Maydell wrote:
>> On Thu, 13 Dec 2018 at 18:07, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 13/12/18 19:01, Peter Maydell wrote:
>>>> I sent a patch to do this a little while back:
>>>>  https://patchwork.kernel.org/patch/10561557/
>>>>
>>>> It didn't get applied because Paolo disagreed with having
>>>> our tools enforcing what our style guide says.
>>>
>>> I didn't disagree with that---I disagreed with having a single style in
>>> the style guide, because unlike most other blatant violations of the
>>> coding style (eg. braces), this one is pervasive in maintained code and
>>> I don't want code that I maintain to mix two comment styles.
>>>
>>> So I proposed two alternatives:
>>>
>>> - someone fixes all the comment blocks which are "starred" but don't
>>> have a lone "/*" at the beginning, and then we can commit that patch;
>>>
>>> - we allow "/* foo" on the first line, except for doc comments and for
>>> the first line of the file (author/license block), and fix the style
>>> guide accordingly.
>> 
>> We came to a consensus on the comment style when we discussed
>> the patch which updated CODING_STYLE. I'm not personally
>> a fan of the result (I used to use "/* foo"), but what we have in the
>> doc is what we achieved consensus for. I'm not going to reopen
>> the "what should block comments look like" style debate.
>
> Sure, I don't want to do that either.  I accept the result of the
> discussion; I don't accept introducing a new warning that will cause
> over 700 files to become inconsistent sooner or later.

By design, checkpatch.pl only checks *patches*.  Existing code doesn't
trigger warnings until it gets touched.  And then it should arguably be
made to conform to CODING_STYLE.  So, what's the problem again?  :)

>                                                         Therefore, the
> only way to enforce the result of the discussion is to change the
> existing comments,

I support cleaning up comment style wholesale[*].

>                    for example by having a script that maintainers can
> use to change the existing comments in their files.  Having each of us
> come up with their own script or doing it by hand is probably not a good
> use of everyone's time.

Sharing tools is good. 

> Alternatively, fixing the style guide can also mean "explain why /* foo
> is allowed by checkpatch even though it does not match the coding
> style", without rehashing the discussion.
>
> (BTW it may actually be a good idea to fix _some_ instances of bad
> coding style, in particular the space-tab sequences and the files where
> there are maybe 2 or 3 tabs that ended up there by mistake.  That's a
> different topic).

You've since posted patches for that.  Thanks.

>>>> Personally I think we should just commit my patch, and then
>>>> we can stop having people manually pointing out where
>>>> submitters' patches don't match CODING_STYLE.

Concur.  It has my R-by, modulo a commit message tweak.


[*] Same for other style violations.  Yes, it's churn, and yes, it'll
mess up git-blame some, but I'm convinced the presence of numerous bad
examples costs us more.  CODING_STYLE was committed almost a decade ago.
If we had cleaned up back then, the churn and the blame would be long
forgotten, and we would've spared ourselves plenty of review cycles and
quite a few style discussions.  It's late, but never too late.

Paolo Bonzini Dec. 14, 2018, 9:32 a.m. UTC | #6

On 14/12/18 07:29, Markus Armbruster wrote:
> Paolo Bonzini <pbonzini@redhat.com> writes:
> 
>> On 13/12/18 19:21, Peter Maydell wrote:
>>> On Thu, 13 Dec 2018 at 18:07, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 13/12/18 19:01, Peter Maydell wrote:
>>>>> I sent a patch to do this a little while back:
>>>>>  https://patchwork.kernel.org/patch/10561557/
>>>>>
>>>>> It didn't get applied because Paolo disagreed with having
>>>>> our tools enforcing what our style guide says.
>>>>
>>>> I didn't disagree with that---I disagreed with having a single style in
>>>> the style guide, because unlike most other blatant violations of the
>>>> coding style (eg. braces), this one is pervasive in maintained code and
>>>> I don't want code that I maintain to mix two comment styles.
>>>>
>>>> So I proposed two alternatives:
>>>>
>>>> - someone fixes all the comment blocks which are "starred" but don't
>>>> have a lone "/*" at the beginning, and then we can commit that patch;
>>>>
>>>> - we allow "/* foo" on the first line, except for doc comments and for
>>>> the first line of the file (author/license block), and fix the style
>>>> guide accordingly.
>>>
>>> We came to a consensus on the comment style when we discussed
>>> the patch which updated CODING_STYLE. I'm not personally
>>> a fan of the result (I used to use "/* foo"), but what we have in the
>>> doc is what we achieved consensus for. I'm not going to reopen
>>> the "what should block comments look like" style debate.
>>
>> Sure, I don't want to do that either.  I accept the result of the
>> discussion; I don't accept introducing a new warning that will cause
>> over 700 files to become inconsistent sooner or later.
> 
> By design, checkpatch.pl only checks *patches*.  Existing code doesn't
> trigger warnings until it gets touched.  And then it should arguably be
> made to conform to CODING_STYLE.  So, what's the problem again?  :)

Once you add multiline comments to a file that has 3-line comments, they
have to be 4-line in order to appease checkpatch, and this create
inconsistencies.

In other words, it's not about checkpatch complaining on existing code.
 However, by running checkpatch on existing maintained code, you get a
measure of which files will grow an inconsistent style unless cleaned up
wholesale.

Anyway, in order to conclude the discussion and walk the walk, here is
the script.  It also converts GNU style to the anointed one.  I now
support applying Peter's patch, after the script is reviewed I'll send
it as a formal patch.

#! /bin/sh
#
# Fix multiline comments to match CODING_STYLE
#
# Copyright (C) 2018 Red Hat, Inc.
#
# Author: Paolo Bonzini
#
# Usage: scripts/fix-multiline-comments.sh [-i] FILE...
#
# -i edits the file in place (requires gawk 4.1.0).
#
# Set the AWK environment variable to choose the awk interpreter to use
# (default 'awk')

if test "$1" = -i; then
  # gawk extension
  inplace="-i inplace"
  shift
fi
${AWK-awk} $inplace 'BEGIN {
    getline first
    indent = -1
    while ((getline second) > 0) {
        # apply a star to the indent on lines after the first
        if (indent != -1) {
            if (first == "") {
                first = sp " *"
            } else if (substr(first, 1, indent + 2) == sp "  ") {
                first = sp " *" substr(first, indent + 3)
            }
        }

        is_lead = (first ~ /^[ \t]*\/\*/)
        is_trail = (first ~ /\*\//)
        if (is_lead && !is_trail) {
            # grab the indent at the start of a comment, but not for
            # single-line comments
            match(first, /^[ \t]*\/\*/)
            indent = RLENGTH - 2
            sp = substr(first, 1, indent)
        }

        # the regular expression filters out lone /*, /**, or */
        if (indent != -1 && !(first ~ /^[ \t]*(\/\*+|\*\/)[ \t]*$/)) {
            if (is_trail) {
                # split the trailing */ on a separate line
                match(first, /[ \t]*\*\//)
                first = substr(first, 1, RSTART - 1) "\n" sp " */"
            }
            if (is_lead) {
                # split the leading /* on a separate line
                match(first, /^[ \t]*\/\*+[ \t]*/)
                first = sp "/*\n" sp " *" substr(first, RLENGTH)
            }
        }
        if (is_trail) {
            indent = -1
        }
        print first
        first = second
    }
    print first
}' "$@"

Paolo

Peter Maydell Dec. 14, 2018, 11:20 a.m. UTC | #7

On Fri, 14 Dec 2018 at 06:29, Markus Armbruster <armbru@redhat.com> wrote:
>
> Paolo Bonzini <pbonzini@redhat.com> writes:
>
> > On 13/12/18 19:21, Peter Maydell wrote:
> >> On Thu, 13 Dec 2018 at 18:07, Paolo Bonzini <pbonzini@redhat.com> wrote:
> >>> On 13/12/18 19:01, Peter Maydell wrote:
> >>>> I sent a patch to do this a little while back:
> >>>>  https://patchwork.kernel.org/patch/10561557/

> >>>> Personally I think we should just commit my patch, and then
> >>>> we can stop having people manually pointing out where
> >>>> submitters' patches don't match CODING_STYLE.
>
> Concur.  It has my R-by, modulo a commit message tweak.

I have to admit I never really understood what tweak
you wanted making to the commit message. I'm happy
to make it clearer: do you want to suggest a rewording?

thanks
-- PMM

Wainer dos Santos Moschetta Dec. 14, 2018, 12:13 p.m. UTC | #8

On 12/13/2018 04:01 PM, Peter Maydell wrote:
> On Thu, 13 Dec 2018 at 17:57, Wainer dos Santos Moschetta
> <wainersm@redhat.com> wrote:
>> Eduardo Habkost pointed out a malformed block of comments on my
>> patch [1] that I had ran checkpatch.pl and no warn/error was
>> reported. Then I realized the script does not catch such as
>> case (or it had a bug).
>>
>> It turns out that checkpatch.pl does not parse comment blocks (If I understood
>> its code correctly...). So I implemented a checker that warns about:
>> 1. block doesn't begin on its own line.
>> Example:
>>    /* blah blah
>>     * and blah blah
>>     */
> I sent a patch to do this a little while back:
>   https://patchwork.kernel.org/patch/10561557/

Self-NACK my patch in favor of that, which has additional checks (e.g. * 
alignment).

>
> It didn't get applied because Paolo disagreed with having
> our tools enforcing what our style guide says.
>
> Personally I think we should just commit my patch, and then
> we can stop having people manually pointing out where
> submitters' patches don't match CODING_STYLE.

I am afraid that I can't give a firm option on this topic because it 
precedes my existence in this community, regardless I tend to agreed on 
Peter's reasoning.

Thanks!

- Wainer

T

>
> thanks
> -- PMM

Markus Armbruster Dec. 14, 2018, 12:31 p.m. UTC | #9

Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 14 Dec 2018 at 06:29, Markus Armbruster <armbru@redhat.com> wrote:
>>
>> Paolo Bonzini <pbonzini@redhat.com> writes:
>>
>> > On 13/12/18 19:21, Peter Maydell wrote:
>> >> On Thu, 13 Dec 2018 at 18:07, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> >>> On 13/12/18 19:01, Peter Maydell wrote:
>> >>>> I sent a patch to do this a little while back:
>> >>>>  https://patchwork.kernel.org/patch/10561557/
>
>> >>>> Personally I think we should just commit my patch, and then
>> >>>> we can stop having people manually pointing out where
>> >>>> submitters' patches don't match CODING_STYLE.
>>
>> Concur.  It has my R-by, modulo a commit message tweak.
>
> I have to admit I never really understood what tweak
> you wanted making to the commit message. I'm happy
> to make it clearer: do you want to suggest a rewording?

The commit message claims "(The only changes needed are that Linux's
checkpatch.pl WARN() function takes an extra argument that ours does
not.)"  This isn't the case.  You also dropped the kernel's "Networking
with an initial /*" special case.

The simplest way to fix an incorrect claim is to delete it :)

If you want to explain what you changed, you could say something like
"(The only changes needed are that Linux's checkpatch.pl WARN() function
takes an extra argument that ours does not, and the kernel has a special
case for networking code we don't want.)"

Peter Maydell Dec. 14, 2018, 12:57 p.m. UTC | #10

On Fri, 14 Dec 2018 at 12:31, Markus Armbruster <armbru@redhat.com> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> > On Fri, 14 Dec 2018 at 06:29, Markus Armbruster <armbru@redhat.com> wrote:
> > I have to admit I never really understood what tweak
> > you wanted making to the commit message. I'm happy
> > to make it clearer: do you want to suggest a rewording?
>
> The commit message claims "(The only changes needed are that Linux's
> checkpatch.pl WARN() function takes an extra argument that ours does
> not.)"  This isn't the case.  You also dropped the kernel's "Networking
> with an initial /*" special case.

The bit of the commit message you didn't quote says
"by backporting the relevant
parts of the Linux kernel's checkpatch.pl. (The only changes
needed are that Linux's checkpatch.pl WARN() function takes
an extra argument that ours does not.)"

The networking special case is not a "relevant part", which
is why it's not in the patch. The bracketed statement applies
to the code in the patch, not any lumps of code in the
kernel's checkpatch that are not in the patch.
Anyway, I've adjusted the commit message as you suggest.

Since we seem to now have consensus on the checkpatch patch,
I'm going to put it into the "misc" pull request I'm putting
together.

thanks
-- PMM

Markus Armbruster Dec. 14, 2018, 4:11 p.m. UTC | #11

Peter Maydell <peter.maydell@linaro.org> writes:

> On Fri, 14 Dec 2018 at 12:31, Markus Armbruster <armbru@redhat.com> wrote:
>> Peter Maydell <peter.maydell@linaro.org> writes:
>> > On Fri, 14 Dec 2018 at 06:29, Markus Armbruster <armbru@redhat.com> wrote:
>> > I have to admit I never really understood what tweak
>> > you wanted making to the commit message. I'm happy
>> > to make it clearer: do you want to suggest a rewording?
>>
>> The commit message claims "(The only changes needed are that Linux's
>> checkpatch.pl WARN() function takes an extra argument that ours does
>> not.)"  This isn't the case.  You also dropped the kernel's "Networking
>> with an initial /*" special case.
>
> The bit of the commit message you didn't quote says
> "by backporting the relevant
> parts of the Linux kernel's checkpatch.pl. (The only changes
> needed are that Linux's checkpatch.pl WARN() function takes
> an extra argument that ours does not.)"
>
> The networking special case is not a "relevant part", which
> is why it's not in the patch. The bracketed statement applies
> to the code in the patch, not any lumps of code in the
> kernel's checkpatch that are not in the patch.

I understand you logic now.

> Anyway, I've adjusted the commit message as you suggest.
>
> Since we seem to now have consensus on the checkpatch patch,
> I'm going to put it into the "misc" pull request I'm putting
> together.

Thanks!

[0/1] checkpatch: checker for comment block

Message

Comments