mbox series

[RFC,v2,0/2] Libclang based analyzer

Message ID 20210604111434.21422-1-rpalethorpe@suse.com
Headers show
Series Libclang based analyzer | expand

Message

Richard Palethorpe June 4, 2021, 11:14 a.m. UTC
Hello,

This implements a TEST() check and integrates the check into the build
system.

Compared to the Coccinelle version it's very ugly. However I think
this will allow us to get all the low hanging fruit without creating
major problems for test developers.

I guess it could be run during CI if we either fix all the existing
TEST() usages in the library or add an ignore list. I already have a
Coccinelle script to help with the former.

V2:
* Consistently use singular form of 'check'
* Include missing clang-check.mk
* Add some more comments in main.c

Richard Palethorpe (2):
  Add 'make check' and clang-check to build system
  Start libclang based analyzer and TEST() check

 configure.ac                       |   2 +
 include/mk/clang-check.mk          |   9 ++
 include/mk/config.mk.in            |   5 +
 include/mk/env_post.mk             |   8 +
 include/mk/generic_leaf_target.inc |   5 +-
 include/mk/lib.mk                  |   3 +
 include/mk/rules.mk                |   9 ++
 include/mk/testcases.mk            |   1 +
 tools/clang-check/.gitignore       |   1 +
 tools/clang-check/Makefile         |  14 ++
 tools/clang-check/main.c           | 239 +++++++++++++++++++++++++++++
 11 files changed, 295 insertions(+), 1 deletion(-)
 create mode 100644 include/mk/clang-check.mk
 create mode 100644 tools/clang-check/.gitignore
 create mode 100644 tools/clang-check/Makefile
 create mode 100644 tools/clang-check/main.c

Comments

Cyril Hrubis June 4, 2021, 12:52 p.m. UTC | #1
Hi!
> Compared to the Coccinelle version it's very ugly. However I think
> this will allow us to get all the low hanging fruit without creating
> major problems for test developers.

I had a look at the code and it does not seem to be that bad. It's
longer due to some boiler plate and more explicit but not necessarily
ugly.

> I guess it could be run during CI if we either fix all the existing
> TEST() usages in the library or add an ignore list. I already have a
> Coccinelle script to help with the former.

I will have a look at the patches generated by coccinelle, I supposed
that we want to merge them regardless.
Richard Palethorpe June 4, 2021, 1:50 p.m. UTC | #2
Hello Cyril,

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
>> Compared to the Coccinelle version it's very ugly. However I think
>> this will allow us to get all the low hanging fruit without creating
>> major problems for test developers.
>
> I had a look at the code and it does not seem to be that bad. It's
> longer due to some boiler plate and more explicit but not necessarily
> ugly.

I agree, but "semantic patches" are a really elgant way of matching code
compared to scanning the AST with some ad-hoc logic for matching.

>
>> I guess it could be run during CI if we either fix all the existing
>> TEST() usages in the library or add an ignore list. I already have a
>> Coccinelle script to help with the former.
>
> I will have a look at the patches generated by coccinelle, I supposed
> that we want to merge them regardless.

OK, I just sent in a patch for the CGroups part.
Joerg Vehlow June 7, 2021, 8:37 a.m. UTC | #3
Hi,

just one quick remark. I guess the whole reason for using clang over 
coccinelle was availability of clang on developer systems.
I just wanted to quickly check your work, but had no clang installed. 
Build fail, even with cyril's patch for CHECK_TARGETS, because 
clang-c/Index.h is not found.

On ubuntu 20.04, this file is part of libclang-dev, but installing it 
did not help either, because it is installed to an include path not know 
to gcc (/usr/lib/llvm-10/include/clang-c).
I added it to the include path and it was found, but the next problem 
is, that some used functions (like clang_Cursor_getVarDeclInitializer) 
are only available starting with libclang 12.


So in conclusion, I do not think we can assume libclang to be available 
for all developers and installing it is probably more work, at least 
when newer functions from libclang are used, than installing coccinelle.
And very important for final setup: It must be possible to successfully 
compile ltp, without libclang/coccinelle available. There is no reason 
to force this libraries/tools for pure "users" of ltp.

Jörg
Cyril Hrubis June 7, 2021, 9:14 a.m. UTC | #4
Hi!
> So in conclusion, I do not think we can assume libclang to be available 
> for all developers and installing it is probably more work, at least 
> when newer functions from libclang are used, than installing coccinelle.
> And very important for final setup: It must be possible to successfully 
> compile ltp, without libclang/coccinelle available. There is no reason 
> to force this libraries/tools for pure "users" of ltp.

Indeed this is supposed to be tooling for developers and strictly
optional. If this breaks builds without libclang it has to be fixed.
Richard Palethorpe June 7, 2021, 10:20 a.m. UTC | #5
Hello Joerg,

Joerg Vehlow <lkml@jv-coder.de> writes:

> Hi,
>
> just one quick remark. I guess the whole reason for using clang over
> coccinelle was availability of clang on developer systems.
> I just wanted to quickly check your work, but had no clang
> installed. Build fail, even with cyril's patch for CHECK_TARGETS,
> because clang-c/Index.h is not found.

Thanks for testing it.

>
> On ubuntu 20.04, this file is part of libclang-dev, but installing it
> did not help either, because it is installed to an include path not
> know to gcc (/usr/lib/llvm-10/include/clang-c).

Is part of this path the same that 'clang -print-resource-dir' prints?

Either way I guess we can search for this during configuration. LLVM has
a CMake module (or w/e) which probably finds all this automatically.

> I added it to the include path and it was found, but the next problem
> is, that some used functions (like clang_Cursor_getVarDeclInitializer) 
> are only available starting with libclang 12.
>

I guess that we could replace that function by recursing further into
the AST to find the initializer ourselves.

Probably we can restrict ourselves to only use functions from before
libclang 11.

>
> So in conclusion, I do not think we can assume libclang to be
> available for all developers and installing it is probably more work,
> at least when newer functions from libclang are used, than installing
> coccinelle.

IIRC Cyril said the Coccinelle package on Gentoo is not maintained
anymore. AFAICT it exists, but it is on an old version. I don't think
many people are interested in or want to maintain Ocaml
stuff. LLVM/Clang OTOH looks to be very active.

> And very important for final setup: It must be possible to
> successfully compile ltp, without libclang/coccinelle available. There
> is no reason to force this libraries/tools for pure "users" of ltp.
>
> Jörg

100% agree. These checks only make sense during development. Even then I
want to ensure that everyone can run the checks with very little
effort.
Joerg Vehlow June 7, 2021, 11:34 a.m. UTC | #6
Hi Richard,

On 6/7/2021 12:20 PM, Richard Palethorpe wrote:
> Hello Joerg,
>
> On ubuntu 20.04, this file is part of libclang-dev, but installing it
> did not help either, because it is installed to an include path not
> know to gcc (/usr/lib/llvm-10/include/clang-c).
> Is part of this path the same that 'clang -print-resource-dir' prints?
>
> Either way I guess we can search for this during configuration. LLVM has
> a CMake module (or w/e) which probably finds all this automatically.
resource dir is /usr/lib/llvm-12/lib/clang/12.0.1.

The llvm-config tool can be used to find the locations of the include 
and lib directory.
On my ubuntu, I installed clang-12 from apt.llvm.org and clang-10 from 
ubuntu repos.
In the path there is the llvm-config tool from ubuntu pointing to 
/usr/lib/llvm-10/bin/llvm-config and llvm-config-10 and llvm-config-12 
pointing to the respective llvm-config tool.

I guess using llvm-config from the path to detect the correct include 
and library path would be the best way to go.
If someone wants to use a different version, he can still set prepend it 
to the path during configuration:

$ llvm-config --includedir
/usr/lib/llvm-10/include

$ llvm-config --libdir
/usr/lib/llvm-10/lib

$ PATH="/usr/lib/llvm-12/bin:$PATH" llvm-config --includedir
/usr/lib/llvm-12/include


Both includedir and libdir are required, to correctly link libclang. In 
the default library search paths, there are only versioned versiones of 
libclang (eg. libclang-12.so).

>> I added it to the include path and it was found, but the next problem
>> is, that some used functions (like clang_Cursor_getVarDeclInitializer)
>> are only available starting with libclang 12.
>>
> I guess that we could replace that function by recursing further into
> the AST to find the initializer ourselves.
>
> Probably we can restrict ourselves to only use functions from before
> libclang 11.
Sounds good, but how to force this? I don't think there is a "allow only 
libclang 10 symbols"...
>
>> So in conclusion, I do not think we can assume libclang to be
>> available for all developers and installing it is probably more work,
>> at least when newer functions from libclang are used, than installing
>> coccinelle.
> IIRC Cyril said the Coccinelle package on Gentoo is not maintained
> anymore. AFAICT it exists, but it is on an old version. I don't think
> many people are interested in or want to maintain Ocaml
> stuff. LLVM/Clang OTOH looks to be very active.
Right, it actually is removed now from gentoo portage tree ([1]). But is 
it used by the kernel developers?

Jörg

[1] 
https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=58395d3a0c06e060a0a40182fff4bf39f1910529
Cyril Hrubis June 7, 2021, 1:42 p.m. UTC | #7
Hi!
> > IIRC Cyril said the Coccinelle package on Gentoo is not maintained
> > anymore. AFAICT it exists, but it is on an old version. I don't think
> > many people are interested in or want to maintain Ocaml
> > stuff. LLVM/Clang OTOH looks to be very active.
> Right, it actually is removed now from gentoo portage tree ([1]). But is 
> it used by the kernel developers?

I guess that some people use it, however the choice of programming
language makes everthing much harder. On some distributions it's hard to
install and requires manual steps and when you hit a bug it's impossible
to debug unless you know how to debug ocaml. I've tried to fix a simple
bug in coccinelle once, but gave up after a few hours since I wasn't
even able to gasp how the source code is structured. And for the record
I used to read and write lisp and haskell just fine during my university
days. So even my skills in functional programming are rusty now I did
not expect that ocaml would be so alien to me...
Richard Palethorpe June 7, 2021, 1:49 p.m. UTC | #8
Hello Joerg,

Joerg Vehlow <lkml@jv-coder.de> writes:

> Hi Richard,
>
> On 6/7/2021 12:20 PM, Richard Palethorpe wrote:
>> Hello Joerg,
>>
>> On ubuntu 20.04, this file is part of libclang-dev, but installing it
>> did not help either, because it is installed to an include path not
>> know to gcc (/usr/lib/llvm-10/include/clang-c).
>> Is part of this path the same that 'clang -print-resource-dir' prints?
>>
>> Either way I guess we can search for this during configuration. LLVM has
>> a CMake module (or w/e) which probably finds all this automatically.
> resource dir is /usr/lib/llvm-12/lib/clang/12.0.1.
>
> The llvm-config tool can be used to find the locations of the include
> and lib directory.
> On my ubuntu, I installed clang-12 from apt.llvm.org and clang-10 from
> ubuntu repos.
> In the path there is the llvm-config tool from ubuntu pointing to
> /usr/lib/llvm-10/bin/llvm-config and llvm-config-10 and llvm-config-12 
> pointing to the respective llvm-config tool.
>
> I guess using llvm-config from the path to detect the correct include
> and library path would be the best way to go.
> If someone wants to use a different version, he can still set prepend
> it to the path during configuration:
>
> $ llvm-config --includedir
> /usr/lib/llvm-10/include
>
> $ llvm-config --libdir
> /usr/lib/llvm-10/lib
>
> $ PATH="/usr/lib/llvm-12/bin:$PATH" llvm-config --includedir
> /usr/lib/llvm-12/include
>
>
> Both includedir and libdir are required, to correctly link
> libclang. In the default library search paths, there are only
> versioned versiones of libclang (eg. libclang-12.so).

OK.

>
>>> I added it to the include path and it was found, but the next problem
>>> is, that some used functions (like clang_Cursor_getVarDeclInitializer)
>>> are only available starting with libclang 12.
>>>
>> I guess that we could replace that function by recursing further into
>> the AST to find the initializer ourselves.
>>
>> Probably we can restrict ourselves to only use functions from before
>> libclang 11.
> Sounds good, but how to force this? I don't think there is a "allow
> only libclang 10 symbols"...

Make a list of the symbols exported by libclang 10. Then check that
anything which starts with clang_ or CX is in that list.

Or just compile it against libclang 10 in CI.

>>
>>> So in conclusion, I do not think we can assume libclang to be
>>> available for all developers and installing it is probably more work,
>>> at least when newer functions from libclang are used, than installing
>>> coccinelle.
>> IIRC Cyril said the Coccinelle package on Gentoo is not maintained
>> anymore. AFAICT it exists, but it is on an old version. I don't think
>> many people are interested in or want to maintain Ocaml
>> stuff. LLVM/Clang OTOH looks to be very active.
> Right, it actually is removed now from gentoo portage tree ([1]). But
> is it used by the kernel developers?

Some kernel developers use it. There are a number of checkers and some
maintainers care about them while others do not.

>
> Jörg
>
> [1]
> https://gitweb.gentoo.org/repo/gentoo.git/commit/?id=58395d3a0c06e060a0a40182fff4bf39f1910529