diff mbox

[v2] checkpatch: Detect newlines in error_report and other error functions

Message ID 1449858642-24267-1-git-send-email-jjherne@linux.vnet.ibm.com
State New
Headers show

Commit Message

Jason J. Herne Dec. 11, 2015, 6:30 p.m. UTC
We don't want newlines embedded in error messages. This seems to be a common
problem with new code so let's try to catch it with checkpatch.

This will not catch cases where newlines are inserted into the middle of an
existing multi-line statement. But those cases should be rare.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
---
 scripts/checkpatch.pl | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

Comments

Markus Armbruster Dec. 14, 2015, 12:47 p.m. UTC | #1
"Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:

> We don't want newlines embedded in error messages. This seems to be a common
> problem with new code so let's try to catch it with checkpatch.
>
> This will not catch cases where newlines are inserted into the middle of an
> existing multi-line statement. But those cases should be rare.
>
> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>

Awesome!

Ironically, checkpatch complains a lot about this patch: 31 "code indent
should never use tabs" and four "line over 80 characters".  Since the
script uses tabs pretty consistently, I guess we'll want to ignore the
former.  Long lines are also frequent, but three of the four new ones
are comments that ought to be wrapped.  Could be done on commit.

To test this patch, I fed it a revert of my series.  Score:

* Revert "error: Clean up errors with embedded newlines (again), part 2"

  1/6

  Pretty difficult cases.  The last one is flagged, perhaps because the
  string is split right after an embedded newline.

* Revert "error: Clean up errors with embedded newlines (again), part 1"

  2/2

* Revert "error: Strip trailing '\n' from error string arguments (again)"

  10/23

  Could you look into catching a few more of these?

> ---
>  scripts/checkpatch.pl | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index b0f6e11..51ea667 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2511,6 +2511,45 @@ sub process {
>  			WARN("use QEMU instead of Qemu or QEmu\n" . $herecurr);
>  		}
>  
> +# Qemu error function tests
> +
> +	# Find newlines in error function text
> +	my $qemu_error_funcs = qr{error_setg|
> +				error_setg_errno|
> +				error_setg_win32|
> +				error_set|
> +				error_vprintf|
> +				error_printf|
> +				error_printf_unless_qmp|
> +				error_vreport|
> +				error_report}x;
> +
> +	if ($rawline =~ /\b(?:$qemu_error_funcs)\s*\(\s*\".*\\n/) {
> +		WARN("Error function text should not contain newlines\n" . $herecurr);
> +	}
> +
> +	# Continue checking for error function text that contains newlines. This
> +	# check handles cases where string literals are spread over multiple lines.
> +	# Example:
> +	# error_report("Error msg line #1"
> +	#              "Error msg line #2\n");
> +	my $quoted_newline_regex = qr{\+\s*\".*\\n.*\"};
> +	my $continued_str_literal = qr{\+\s*\".*\"};
> +
> +	if ($rawline =~ /$quoted_newline_regex/) {
> +		# Backtrack to first line that does not contain only a quoted literal
> +		# and assume that it is the start of the statement.
> +		my $i = $linenr - 2;
> +
> +		while (($i >= 0) & $rawlines[$i] =~ /$continued_str_literal/) {
> +			$i--;
> +		}

I guess this fails to backtrack over lines that consisting of string
literals and macros (that expand into string literals), such as in this
example from my Revert "error: Strip trailing '\n' from error string
arguments (again)":

     if (sector_num > bs->total_sectors) {
         error_report("Wrong offset: sector_num=0x%" PRIx64
-                     " total_sectors=0x%" PRIx64,
-                     sector_num, bs->total_sectors);
+                " total_sectors=0x%" PRIx64 "\n",
+                sector_num, bs->total_sectors);
         return -EIO;
     }

> +
> +		if ($rawlines[$i] =~ /\b(?:$qemu_error_funcs)\s*\(/) {
> +			WARN("Error function text should not contain newlines\n" . $herecurr);
> +		}
> +	}
> +
>  # check for non-portable ffs() calls that have portable alternatives in QEMU
>  		if ($line =~ /\bffs\(/) {
>  			ERROR("use ctz32() instead of ffs()\n" . $herecurr);

Even in its current state, better than nothing.
Jason J. Herne Dec. 14, 2015, 2:45 p.m. UTC | #2
On 12/14/2015 07:47 AM, Markus Armbruster wrote:
> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>
>> We don't want newlines embedded in error messages. This seems to be a common
>> problem with new code so let's try to catch it with checkpatch.
>>
>> This will not catch cases where newlines are inserted into the middle of an
>> existing multi-line statement. But those cases should be rare.
>>
>> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
>
> Awesome!
>
> Ironically, checkpatch complains a lot about this patch: 31 "code indent
> should never use tabs" and four "line over 80 characters".  Since the
> script uses tabs pretty consistently, I guess we'll want to ignore the
> former.  Long lines are also frequent, but three of the four new ones
> are comments that ought to be wrapped.  Could be done on commit.
>

Yep, the whole file uses tabs. I think we should convert it to spaces to be
consistent with Qemu coding guidelines. I can work up that patch if it would
be accepted.

I'll fix the comments.

> To test this patch, I fed it a revert of my series.  Score:
>
> * Revert "error: Clean up errors with embedded newlines (again), part 2"
>
>    1/6
>
>    Pretty difficult cases.  The last one is flagged, perhaps because the
>    string is split right after an embedded newline.
>
> * Revert "error: Clean up errors with embedded newlines (again), part 1"
>
>    2/2
>
> * Revert "error: Strip trailing '\n' from error string arguments (again)"
>
>    10/23
>
>    Could you look into catching a few more of these?
>
>> ---
>>   scripts/checkpatch.pl | 39 +++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 39 insertions(+)
>>
>> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>> index b0f6e11..51ea667 100755
>> --- a/scripts/checkpatch.pl
>> +++ b/scripts/checkpatch.pl
>> @@ -2511,6 +2511,45 @@ sub process {
>>   			WARN("use QEMU instead of Qemu or QEmu\n" . $herecurr);
>>   		}
>>
>> +# Qemu error function tests
>> +
>> +	# Find newlines in error function text
>> +	my $qemu_error_funcs = qr{error_setg|
>> +				error_setg_errno|
>> +				error_setg_win32|
>> +				error_set|
>> +				error_vprintf|
>> +				error_printf|
>> +				error_printf_unless_qmp|
>> +				error_vreport|
>> +				error_report}x;
>> +
>> +	if ($rawline =~ /\b(?:$qemu_error_funcs)\s*\(\s*\".*\\n/) {
>> +		WARN("Error function text should not contain newlines\n" . $herecurr);
>> +	}
>> +
>> +	# Continue checking for error function text that contains newlines. This
>> +	# check handles cases where string literals are spread over multiple lines.
>> +	# Example:
>> +	# error_report("Error msg line #1"
>> +	#              "Error msg line #2\n");
>> +	my $quoted_newline_regex = qr{\+\s*\".*\\n.*\"};
>> +	my $continued_str_literal = qr{\+\s*\".*\"};
>> +
>> +	if ($rawline =~ /$quoted_newline_regex/) {
>> +		# Backtrack to first line that does not contain only a quoted literal
>> +		# and assume that it is the start of the statement.
>> +		my $i = $linenr - 2;
>> +
>> +		while (($i >= 0) & $rawlines[$i] =~ /$continued_str_literal/) {
>> +			$i--;
>> +		}
>
> I guess this fails to backtrack over lines that consisting of string
> literals and macros (that expand into string literals), such as in this
> example from my Revert "error: Strip trailing '\n' from error string
> arguments (again)":
>
>       if (sector_num > bs->total_sectors) {
>           error_report("Wrong offset: sector_num=0x%" PRIx64
> -                     " total_sectors=0x%" PRIx64,
> -                     sector_num, bs->total_sectors);
> +                " total_sectors=0x%" PRIx64 "\n",
> +                sector_num, bs->total_sectors);
>           return -EIO;
>       }
>
>> +

The problem here has more to do with how patches are structured. In 
particular, the - lines. I could try to add code to ignore the - lines. 
Really though, this is a limitation of checkpatch. We do not currently 
(not that I could see) have a good method of isolating a single 
multiline statement. But that is a problem for a different day I think :)

Ultimately, we'll never catch all of them with checkpatch. For example:
      "line 5"
      "line 6"
      "line 7"
+    "line 8\n"
     "line 7"
     "line 8"
     "line 9"
...
The patch only contains so much context info so in this case, because we 
do not have the call to error_report in the context we will never catch 
the error.

But I will take a look at this series and see if we can do better :).
Markus Armbruster Dec. 14, 2015, 3:40 p.m. UTC | #3
"Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:

> On 12/14/2015 07:47 AM, Markus Armbruster wrote:
>> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>>
>>> We don't want newlines embedded in error messages. This seems to be a common
>>> problem with new code so let's try to catch it with checkpatch.
>>>
>>> This will not catch cases where newlines are inserted into the middle of an
>>> existing multi-line statement. But those cases should be rare.
>>>
>>> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
>>
>> Awesome!
>>
>> Ironically, checkpatch complains a lot about this patch: 31 "code indent
>> should never use tabs" and four "line over 80 characters".  Since the
>> script uses tabs pretty consistently, I guess we'll want to ignore the
>> former.  Long lines are also frequent, but three of the four new ones
>> are comments that ought to be wrapped.  Could be done on commit.
>>
>
> Yep, the whole file uses tabs. I think we should convert it to spaces to be
> consistent with Qemu coding guidelines. I can work up that patch if it would
> be accepted.

We may want to keep the tabs to make continued stealing of code from the
kernel's checkpatch.pl easier.

> I'll fix the comments.
>
>> To test this patch, I fed it a revert of my series.  Score:
>>
>> * Revert "error: Clean up errors with embedded newlines (again), part 2"
>>
>>    1/6
>>
>>    Pretty difficult cases.  The last one is flagged, perhaps because the
>>    string is split right after an embedded newline.
>>
>> * Revert "error: Clean up errors with embedded newlines (again), part 1"
>>
>>    2/2
>>
>> * Revert "error: Strip trailing '\n' from error string arguments (again)"
>>
>>    10/23
>>
>>    Could you look into catching a few more of these?
>>
>>> ---
>>>   scripts/checkpatch.pl | 39 +++++++++++++++++++++++++++++++++++++++
>>>   1 file changed, 39 insertions(+)
>>>
>>> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
>>> index b0f6e11..51ea667 100755
>>> --- a/scripts/checkpatch.pl
>>> +++ b/scripts/checkpatch.pl
>>> @@ -2511,6 +2511,45 @@ sub process {
>>>   			WARN("use QEMU instead of Qemu or QEmu\n" . $herecurr);
>>>   		}
>>>
>>> +# Qemu error function tests
>>> +
>>> +	# Find newlines in error function text
>>> +	my $qemu_error_funcs = qr{error_setg|
>>> +				error_setg_errno|
>>> +				error_setg_win32|
>>> +				error_set|
>>> +				error_vprintf|
>>> +				error_printf|
>>> +				error_printf_unless_qmp|
>>> +				error_vreport|
>>> +				error_report}x;
>>> +
>>> +	if ($rawline =~ /\b(?:$qemu_error_funcs)\s*\(\s*\".*\\n/) {
>>> +		WARN("Error function text should not contain newlines\n" . $herecurr);
>>> +	}
>>> +
>>> +	# Continue checking for error function text that contains newlines. This
>>> +	# check handles cases where string literals are spread over multiple lines.
>>> +	# Example:
>>> +	# error_report("Error msg line #1"
>>> +	#              "Error msg line #2\n");
>>> +	my $quoted_newline_regex = qr{\+\s*\".*\\n.*\"};
>>> +	my $continued_str_literal = qr{\+\s*\".*\"};
>>> +
>>> +	if ($rawline =~ /$quoted_newline_regex/) {
>>> +		# Backtrack to first line that does not contain only a quoted literal
>>> +		# and assume that it is the start of the statement.
>>> +		my $i = $linenr - 2;
>>> +
>>> +		while (($i >= 0) & $rawlines[$i] =~ /$continued_str_literal/) {
>>> +			$i--;
>>> +		}
>>
>> I guess this fails to backtrack over lines that consisting of string
>> literals and macros (that expand into string literals), such as in this
>> example from my Revert "error: Strip trailing '\n' from error string
>> arguments (again)":
>>
>>       if (sector_num > bs->total_sectors) {
>>           error_report("Wrong offset: sector_num=0x%" PRIx64
>> -                     " total_sectors=0x%" PRIx64,
>> -                     sector_num, bs->total_sectors);
>> +                " total_sectors=0x%" PRIx64 "\n",
>> +                sector_num, bs->total_sectors);
>>           return -EIO;
>>       }
>>
>>> +
>
> The problem here has more to do with how patches are structured. In
> particular, the - lines. I could try to add code to ignore the -
> lines. Really though, this is a limitation of checkpatch. We do not
> currently (not that I could see) have a good method of isolating a
> single multiline statement. But that is a problem for a different day
> I think :)

You're right.

> Ultimately, we'll never catch all of them with checkpatch. For example:
>      "line 5"
>      "line 6"
>      "line 7"
> +    "line 8\n"
>     "line 7"
>     "line 8"
>     "line 9"
> ...
> The patch only contains so much context info so in this case, because
> we do not have the call to error_report in the context we will never
> catch the error.
>
> But I will take a look at this series and see if we can do better :).

Thanks!  If we can't, then I'm for taking this imperfect patch, because
flagging some of these mistakes is better than flagging none.
Jason J. Herne Dec. 17, 2015, 5:49 p.m. UTC | #4
On 12/14/2015 10:40 AM, Markus Armbruster wrote:
> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>
>> On 12/14/2015 07:47 AM, Markus Armbruster wrote:
>>> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>>>
>>>> We don't want newlines embedded in error messages. This seems to be a common
>>>> problem with new code so let's try to catch it with checkpatch.
>>>>
>>>> This will not catch cases where newlines are inserted into the middle of an
>>>> existing multi-line statement. But those cases should be rare.
>>>>
>>>> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
>>>
...
>> But I will take a look at this series and see if we can do better :).
>
> Thanks!  If we can't, then I'm for taking this imperfect patch, because
> flagging some of these mistakes is better than flagging none.
>
>

I won't get a chance to get back to this before the new year. So feel 
free to push this (unless someone objects) and I'll see what I can do 
about improving it in January.
Happy holidays! :)
Markus Armbruster Dec. 17, 2015, 6:28 p.m. UTC | #5
"Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:

> On 12/14/2015 10:40 AM, Markus Armbruster wrote:
>> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>>
>>> On 12/14/2015 07:47 AM, Markus Armbruster wrote:
>>>> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>>>>
>>>>> We don't want newlines embedded in error messages. This seems to
>>>>> be a common
>>>>> problem with new code so let's try to catch it with checkpatch.
>>>>>
>>>>> This will not catch cases where newlines are inserted into the middle of an
>>>>> existing multi-line statement. But those cases should be rare.
>>>>>
>>>>> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
>>>>
> ...
>>> But I will take a look at this series and see if we can do better :).
>>
>> Thanks!  If we can't, then I'm for taking this imperfect patch, because
>> flagging some of these mistakes is better than flagging none.
>>
>>
>
> I won't get a chance to get back to this before the new year. So feel
> free to push this (unless someone objects) and I'll see what I can do
> about improving it in January.

Makes sense.  Applied to my local error-next branch.

> Happy holidays! :)

Thanks, same to you!
Markus Armbruster Dec. 18, 2015, 3:06 p.m. UTC | #6
Markus Armbruster <armbru@redhat.com> writes:

> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>
>> On 12/14/2015 10:40 AM, Markus Armbruster wrote:
>>> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>>>
>>>> On 12/14/2015 07:47 AM, Markus Armbruster wrote:
>>>>> "Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:
>>>>>
>>>>>> We don't want newlines embedded in error messages. This seems to
>>>>>> be a common
>>>>>> problem with new code so let's try to catch it with checkpatch.
>>>>>>
>>>>>> This will not catch cases where newlines are inserted into the
>>>>>> middle of an
>>>>>> existing multi-line statement. But those cases should be rare.
>>>>>>
>>>>>> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
>>>>>
>> ...
>>>> But I will take a look at this series and see if we can do better :).
>>>
>>> Thanks!  If we can't, then I'm for taking this imperfect patch, because
>>> flagging some of these mistakes is better than flagging none.
>>>
>>>
>>
>> I won't get a chance to get back to this before the new year. So feel
>> free to push this (unless someone objects) and I'll see what I can do
>> about improving it in January.
>
> Makes sense.  Applied to my local error-next branch.

With the warning changed from

    "Error function text should not contain newlines\n"

to

    "Error message string should not contain newlines\n"

[...]
Markus Armbruster Jan. 11, 2016, 9:37 p.m. UTC | #7
"Jason J. Herne" <jjherne@linux.vnet.ibm.com> writes:

> We don't want newlines embedded in error messages. This seems to be a common
> problem with new code so let's try to catch it with checkpatch.
>
> This will not catch cases where newlines are inserted into the middle of an
> existing multi-line statement. But those cases should be rare.
>
> Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
> ---
>  scripts/checkpatch.pl | 39 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
>
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
> index b0f6e11..51ea667 100755
> --- a/scripts/checkpatch.pl
> +++ b/scripts/checkpatch.pl
> @@ -2511,6 +2511,45 @@ sub process {
>  			WARN("use QEMU instead of Qemu or QEmu\n" . $herecurr);
>  		}
>  
> +# Qemu error function tests
> +
> +	# Find newlines in error function text
> +	my $qemu_error_funcs = qr{error_setg|
> +				error_setg_errno|
> +				error_setg_win32|
> +				error_set|

The following three...

> +				error_vprintf|
> +				error_printf|
> +				error_printf_unless_qmp|

... may legitimately print newlines.  Dropped them in my tree.

> +				error_vreport|
> +				error_report}x;
> +
> +	if ($rawline =~ /\b(?:$qemu_error_funcs)\s*\(\s*\".*\\n/) {
> +		WARN("Error function text should not contain newlines\n" . $herecurr);
> +	}
[...]
diff mbox

Patch

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index b0f6e11..51ea667 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -2511,6 +2511,45 @@  sub process {
 			WARN("use QEMU instead of Qemu or QEmu\n" . $herecurr);
 		}
 
+# Qemu error function tests
+
+	# Find newlines in error function text
+	my $qemu_error_funcs = qr{error_setg|
+				error_setg_errno|
+				error_setg_win32|
+				error_set|
+				error_vprintf|
+				error_printf|
+				error_printf_unless_qmp|
+				error_vreport|
+				error_report}x;
+
+	if ($rawline =~ /\b(?:$qemu_error_funcs)\s*\(\s*\".*\\n/) {
+		WARN("Error function text should not contain newlines\n" . $herecurr);
+	}
+
+	# Continue checking for error function text that contains newlines. This
+	# check handles cases where string literals are spread over multiple lines.
+	# Example:
+	# error_report("Error msg line #1"
+	#              "Error msg line #2\n");
+	my $quoted_newline_regex = qr{\+\s*\".*\\n.*\"};
+	my $continued_str_literal = qr{\+\s*\".*\"};
+
+	if ($rawline =~ /$quoted_newline_regex/) {
+		# Backtrack to first line that does not contain only a quoted literal
+		# and assume that it is the start of the statement.
+		my $i = $linenr - 2;
+
+		while (($i >= 0) & $rawlines[$i] =~ /$continued_str_literal/) {
+			$i--;
+		}
+
+		if ($rawlines[$i] =~ /\b(?:$qemu_error_funcs)\s*\(/) {
+			WARN("Error function text should not contain newlines\n" . $herecurr);
+		}
+	}
+
 # check for non-portable ffs() calls that have portable alternatives in QEMU
 		if ($line =~ /\bffs\(/) {
 			ERROR("use ctz32() instead of ffs()\n" . $herecurr);