[v3,5/9] tests/style: check for commonly doubled up words

Message ID 20220707163720.1421716-6-berrange@redhat.com
State New
Series tests: introduce a tree-wide code style checking facility

Commit Message

Daniel P. Berrangé July 7, 2022, 4:37 p.m. UTC
This style check looks for cases where the words

  the then in an on if is it but for or at and do to can

are repeated in a sentence. It uses a multi-line match to catch the
especially common mistake in docs where the last word on a line is
repeated as the first word of the next line.
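
As a rough illustration of the multi-line match, here is a minimal
Python sketch. It assumes the checker joins the terms into a single
alternation wrapped in the prefix and suffix from the style.yml entry,
and that it searches the whole file contents rather than one line at a
time. The helper name find_doubled is hypothetical and is not part of
the patch.

```python
import re

# Word list and prefix/suffix taken from the style.yml entry in this patch.
WORDS = "the then in an on if is it but for or at and do to can".split()
PREFIX = r"""\b(?<!=|@|'|")"""
SUFFIX = r"""\b(?!=|@|'|")"""


def find_doubled(text):
    """Return every doubled-word match found in text.

    Assumption: terms are combined as PREFIX + (?:t1|t2|...) + SUFFIX.
    The real checker may combine them differently.
    """
    terms = "|".join(rf"{w}\s+{w}" for w in WORDS)
    pattern = re.compile(PREFIX + "(?:" + terms + ")" + SUFFIX)
    return [m.group(0) for m in pattern.finditer(text)]


# \s+ also matches a newline, so a word repeated across a line break is caught.
print(find_doubled("the last word on the\nthe next line"))
# A quoted string literal is skipped by the prefix lookbehind.
print(find_doubled('x = "to to"'))
```

Because \s+ spans newlines, the same pattern covers both the same-line
and the line-wrap case, provided the input is not split into lines
before matching.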

There will inevitably be some false positives with this check; for
example, some docs data tables have the same word in adjacent columns.

There are a few different ways to express this text as a regex, which
have wildly different execution times. This implementation was
carefully chosen to attempt to minimize matching time.
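
To make "different ways to express this" concrete, the sketch below
builds two equivalent forms in Python: one explicit alternative per
word, and a single capture group followed by a backreference. Which
forms were actually compared for this patch is not stated, so this is
only an assumed pair, and the names explicit, backref and spans are
hypothetical. The prefix and suffix lookarounds are left out to keep
the comparison short.

```python
import re

WORDS = "the then in an on if is it but for or at and do to".split()

# Form A: one explicit alternative per word, e.g. the\s+the|then\s+then|...
explicit = re.compile(
    r"\b(?:" + "|".join(rf"{w}\s+{w}" for w in WORDS) + r")\b"
)

# Form B: capture any listed word, then require it again via backreference.
backref = re.compile(r"\b(" + "|".join(WORDS) + r")\s+\1\b")


def spans(pattern, text):
    """Return the (start, end) offsets of every match of pattern in text."""
    return [m.span() for m in pattern.finditer(text)]


sample = "if it is in the\nthe list, then then do do it"

# Both forms should report the same three doubled words.
print(spans(explicit, sample) == spans(backref, sample))
print(len(spans(explicit, sample)))
```

Both forms find the same matches on this sample; their relative speed
depends on the regex engine and the input, so it would need to be
measured (for example with timeit) on the real source tree.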

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 tests/style.yml | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)
Patch

diff --git a/tests/style.yml b/tests/style.yml
index 704227d8e9..d06c55bb29 100644
--- a/tests/style.yml
+++ b/tests/style.yml
@@ -91,3 +91,33 @@  int_assign_bool:
   files: \.c$
   prohibit: \<int\>.*= *(true|false)\b
   message: use bool type for boolean values
+
+double_words:
+  multiline: true
+  prohibit:
+    terms:
+      - the\s+the
+      - then\s+then
+      - in\s+in
+      - an\s+an
+      - on\s+on
+      - if\s+if
+      - is\s+is
+      - it\s+it
+      - but\s+but
+      - for\s+for
+      - or\s+or
+      - at\s+at
+      - and\s+and
+      - do\s+do
+      - to\s+to
+      - can\s+can
+    prefix: \b(?<!=|@|'|")
+    suffix: \b(?!=|@|'|")
+  message: doubled words
+  ignore:
+    - disas/sparc\.c
+    - pc-bios/
+    - qemu-options\.hx
+    - scripts/checkpatch\.pl
+    - tests/qtest/arm-cpu-features\.c
\ No newline at end of file