diff mbox series

[ovs-dev,v4] ci: Add clang-analyze to GitHub actions.

Message ID 170306750432.1723144.5957827069398112901.stgit@ebuild
State Changes Requested
Headers show
Series [ovs-dev,v4] ci: Add clang-analyze to GitHub actions. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/intel-ovs-compilation success test: success

Commit Message

Eelco Chaudron Dec. 20, 2023, 10:19 a.m. UTC
This patch identifies new static analysis issues during a GitHub action
run and reports them. The process involves analyzing the changes introduced
in the current commit and comparing them to those in the preceding commit.

However, there are two cases when the GitHub push action runner does not
provide enough details to determen the preceding commit. These cases are
a new branch or a forced push. The strategy for these exceptions is to
select the first commit not done by the current author as the base commit.

An example error output might look like this:

  error level: +0 -0 no changes
  warning level: +2 +0
    New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
      file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
    New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
      file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
  note level: +0 -0 no changes
  all levels: +2 +0

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
---

changes in v2:
  - When it's a new branch, it compares it to the HEAD of the default branch.

changes in v3:
  - Include the clang version as part of the cache
  - Change the way it looks for the 'default' branch so it will work
    for patch branches.
  - Also compare to the base branch for forced commits.

changes in v4:
  - No longer look for a default branch, but consume all patches
    from the current author.

 .ci/linux-build.sh                   |   29 ++++++++++
 .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
 2 files changed, 132 insertions(+)

Comments

Simon Horman Dec. 20, 2023, 1:46 p.m. UTC | #1
On Wed, Dec 20, 2023 at 11:19:21AM +0100, Eelco Chaudron wrote:
> This patch identifies new static analysis issues during a GitHub action
> run and reports them. The process involves analyzing the changes introduced
> in the current commit and comparing them to those in the preceding commit.
> 
> However, there are two cases when the GitHub push action runner does not
> provide enough details to determen the preceding commit. These cases are
> a new branch or a forced push. The strategy for these exceptions is to
> select the first commit not done by the current author as the base commit.
> 
> An example error output might look like this:
> 
>   error level: +0 -0 no changes
>   warning level: +2 +0
>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>       file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>       file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>   note level: +0 -0 no changes
>   all levels: +2 +0
> 
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>

Acked-by: Simon Horman <horms@ovn.org>
Eelco Chaudron Dec. 20, 2023, 3:07 p.m. UTC | #2
On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:

> This patch identifies new static analysis issues during a GitHub action
> run and reports them. The process involves analyzing the changes introduced
> in the current commit and comparing them to those in the preceding commit.
>
> However, there are two cases when the GitHub push action runner does not
> provide enough details to determen the preceding commit. These cases are
> a new branch or a forced push. The strategy for these exceptions is to
> select the first commit not done by the current author as the base commit.

Including Aarons comment on v3 into v4.

+++ As an alternative we can look at the author (go back in time until
+++ we find a different author), so if you’re lucky and it’s one of your
+++ older patches you will be reminded again and again ;) Not sure if
+++ this is easaly possible with a git command, but it might be worth
+++ trying out.

++ I did some experiments with the above approach, and it seems to work
++  in all the use cases. The only problem could be that if you are also
++ the owner of a previous patch series, it might use a reference branch
++ too deep. This could potentially show issues not introduced by your
++ latest patches. But as you are still the author, you should be
++ “punished” for your earlier mistakes ;)
++
++ I’ll send out a v4, and we can continue the discussion there if needed :)

+ I'll take a look after Jan 1, but I guess we might be able to check
+ against an upstream reference (for example, we can reference towards the
+ 'upstream' repository - such a reference does require some kind of
+ history being pulled) which we know should be stable and 'good'.  IE:
+ the script could add the OVS github remote if it doesn't already exist.
+ I think we could use that to find the common ancestor.  But I need to
+ dive into this more.

I guess we can determine the origin branch going back, and then do something like `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.

Both solutions will work, and I have no real preference.

//Eelco

> An example error output might look like this:
>
>   error level: +0 -0 no changes
>   warning level: +2 +0
>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>      file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>      file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>   note level: +0 -0 no changes
>   all levels: +2 +0
>
> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
> ---
>
> changes in v2:
>   - When it's a new branch, it compares it to the HEAD of the default branch.
>
> changes in v3:
>   - Include the clang version as part of the cache
>   - Change the way it looks for the 'default' branch so it will work
>     for patch branches.
>   - Also compare to the base branch for forced commits.
>
> changes in v4:
>   - No longer look for a default branch, but consume all patches
>     from the current author.
>
>  .ci/linux-build.sh                   |   29 ++++++++++
>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>  2 files changed, 132 insertions(+)
>
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index aa2ecc505..fedf1398a 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -49,6 +49,30 @@ function build_ovs()
>      make -j4
>  }
>
> +function clang_analyze()
> +{
> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
> +                                               || cache_build=true
> +    if [ "$cache_build" = true ]; then
> +        # If this is a cache build, proceed to the base branch's directory.
> +        cd base_ovs_main
> +    fi;
> +
> +    configure_ovs $OPTS
> +    make clean
> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
> +
> +    if [ "$cache_build" = true ]; then
> +        # Move results, so it will be picked up by the cache.
> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
> +        cd ..
> +    else
> +        # Only do the compare on the none cache builds.
> +        sarif --check note diff ./base-clang-analyzer-results \
> +                                ./clang-analyzer-results
> +    fi;
> +}
> +
>  if [ "$DEB_PACKAGE" ]; then
>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>      mk-build-deps --install --root-cmd sudo --remove debian/control
> @@ -116,6 +140,11 @@ fi
>
>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>
> +if [ "$CLANG_ANALYZE" ]; then
> +    clang_analyze
> +    exit 0
> +fi
> +
>  if [ "$TESTSUITE" = 'test' ]; then
>      # 'distcheck' will reconfigure with required options.
>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
> index 09654205e..cb277ff43 100644
> --- a/.github/workflows/build-and-test.yml
> +++ b/.github/workflows/build-and-test.yml
> @@ -223,6 +223,109 @@ jobs:
>          name: logs-linux-${{ join(matrix.*, '-') }}
>          path: logs.tgz
>
> +  build-clang-analyze:
> +    needs: build-dpdk
> +    env:
> +      dependencies: |
> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
> +        python3-unbound
> +      CC:   clang
> +      DPDK: dpdk
> +      CLANG_ANALYZE: true
> +    name: clang-analyze
> +    runs-on: ubuntu-22.04
> +    timeout-minutes: 30
> +
> +    steps:
> +    - name: checkout
> +      uses: actions/checkout@v3
> +      with:
> +        fetch-depth: 0
> +
> +    - name: get base branch sha
> +      id: base_branch
> +      env:
> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
> +        EVENT_BEFORE: ${{ github.event.before }}
> +        FORCED_PUSH: ${{ github.event.forced }}
> +      run: |
> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
> +        else
> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
> +             || [ "$FORCED_PUSH" = true ]; then
> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
> +          else
> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
> +          fi
> +        fi
> +
> +    - name: checkout base branch
> +      uses: actions/checkout@v3
> +      with:
> +        ref: ${{ steps.base_branch.outputs.sha }}
> +        path: base_ovs_main
> +
> +    - name: update PATH
> +      run: |
> +        echo "$HOME/bin"        >> $GITHUB_PATH
> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
> +
> +    - name: generate cache key
> +      id: cache_key
> +      run: |
> +        ver=$(clang -v 2>&1 | grep version | \
> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
> +          >> $GITHUB_OUTPUT
> +
> +    - name: check for analyzer result cache
> +      id: clang_cache
> +      uses: actions/cache@v3
> +      with:
> +        path: base-clang-analyzer-results
> +        key:  ${{ steps.cache_key.outputs.key }}
> +
> +    - name: set up python
> +      uses: actions/setup-python@v4
> +      with:
> +        python-version: '3.9'
> +
> +    - name: get cached dpdk-dir
> +      uses: actions/cache/restore@v3
> +      with:
> +        path: dpdk-dir
> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
> +
> +    - name: update APT cache
> +      run:  sudo apt update || true
> +
> +    - name: install common dependencies
> +      run:  sudo apt install -y ${{ env.dependencies }}
> +
> +    - name: install sarif tools
> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
> +
> +    - name: prepare
> +      run:  ./.ci/linux-prepare.sh
> +
> +    - name: build base reference
> +      if: steps.clang_cache.outputs.cache-hit != 'true'
> +      run:  ./.ci/linux-build.sh
> +
> +    - name: build
> +      run:  ./.ci/linux-build.sh
> +
> +    - name: save cache
> +      uses: actions/cache/save@v3
> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
> +      with:
> +        path: base-clang-analyzer-results
> +        key:  ${{ steps.cache_key.outputs.key }}
> +
>    build-osx:
>      env:
>        CC:    clang
>
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Ilya Maximets Jan. 2, 2024, 12:43 p.m. UTC | #3
On 12/20/23 16:07, Eelco Chaudron wrote:
> 
> 
> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
> 
>> This patch identifies new static analysis issues during a GitHub action
>> run and reports them. The process involves analyzing the changes introduced
>> in the current commit and comparing them to those in the preceding commit.
>>
>> However, there are two cases when the GitHub push action runner does not
>> provide enough details to determen the preceding commit. These cases are
>> a new branch or a forced push. The strategy for these exceptions is to
>> select the first commit not done by the current author as the base commit.
> 
> Including Aarons comment on v3 into v4.
> 
> +++ As an alternative we can look at the author (go back in time until
> +++ we find a different author), so if you’re lucky and it’s one of your
> +++ older patches you will be reminded again and again ;) Not sure if
> +++ this is easaly possible with a git command, but it might be worth
> +++ trying out.
> 
> ++ I did some experiments with the above approach, and it seems to work
> ++  in all the use cases. The only problem could be that if you are also
> ++ the owner of a previous patch series, it might use a reference branch
> ++ too deep. This could potentially show issues not introduced by your
> ++ latest patches. But as you are still the author, you should be
> ++ “punished” for your earlier mistakes ;)
> ++
> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
> 
> + I'll take a look after Jan 1, but I guess we might be able to check
> + against an upstream reference (for example, we can reference towards the
> + 'upstream' repository - such a reference does require some kind of
> + history being pulled) which we know should be stable and 'good'.  IE:
> + the script could add the OVS github remote if it doesn't already exist.
> + I think we could use that to find the common ancestor.  But I need to
> + dive into this more.
> 
> I guess we can determine the origin branch going back, and then do something like
> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.


The 'determine the origin branch' is still tricky and requires some
guesswork in case of a new branch or a force push.  Maybe something
like listing all the heads in upstream repo and finding the one that
has the closest merge-base may be an option.  Then use that merge base
as a base commit to compare against.

The merge-base with upstream/master will be a branching point for a
release branch, and that is way too far.

> 
> Both solutions will work, and I have no real preference.
> 
> //Eelco
> 
>> An example error output might look like this:
>>
>>   error level: +0 -0 no changes
>>   warning level: +2 +0
>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>      file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>      file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>   note level: +0 -0 no changes
>>   all levels: +2 +0
>>
>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>> ---
>>
>> changes in v2:
>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>
>> changes in v3:
>>   - Include the clang version as part of the cache
>>   - Change the way it looks for the 'default' branch so it will work
>>     for patch branches.
>>   - Also compare to the base branch for forced commits.
>>
>> changes in v4:
>>   - No longer look for a default branch, but consume all patches
>>     from the current author.
>>
>>  .ci/linux-build.sh                   |   29 ++++++++++
>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>  2 files changed, 132 insertions(+)
>>
>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>> index aa2ecc505..fedf1398a 100755
>> --- a/.ci/linux-build.sh
>> +++ b/.ci/linux-build.sh
>> @@ -49,6 +49,30 @@ function build_ovs()
>>      make -j4
>>  }
>>
>> +function clang_analyze()
>> +{
>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>> +                                               || cache_build=true
>> +    if [ "$cache_build" = true ]; then
>> +        # If this is a cache build, proceed to the base branch's directory.
>> +        cd base_ovs_main

pushd/popd might be better here.

>> +    fi;
>> +
>> +    configure_ovs $OPTS
>> +    make clean
>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4

We have a make target for clang-analyze and it depends on the clean already.
Maybe we can extend it to accept custom arguments via env variable and pass
the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".

Also, -j4 should be changed to ${JOBS} since that patch got merged.

And we have CC defined in the job description, so should not use clang directly.

>> +
>> +    if [ "$cache_build" = true ]; then
>> +        # Move results, so it will be picked up by the cache.
>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>> +        cd ..
>> +    else
>> +        # Only do the compare on the none cache builds.
>> +        sarif --check note diff ./base-clang-analyzer-results \
>> +                                ./clang-analyzer-results
>> +    fi;
>> +}
>> +
>>  if [ "$DEB_PACKAGE" ]; then
>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>> @@ -116,6 +140,11 @@ fi
>>
>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>
>> +if [ "$CLANG_ANALYZE" ]; then
>> +    clang_analyze
>> +    exit 0
>> +fi
>> +
>>  if [ "$TESTSUITE" = 'test' ]; then
>>      # 'distcheck' will reconfigure with required options.
>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>> index 09654205e..cb277ff43 100644
>> --- a/.github/workflows/build-and-test.yml
>> +++ b/.github/workflows/build-and-test.yml
>> @@ -223,6 +223,109 @@ jobs:
>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>          path: logs.tgz
>>
>> +  build-clang-analyze:
>> +    needs: build-dpdk
>> +    env:
>> +      dependencies: |
>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>> +        python3-unbound

Do we need a python3-unbound?

>> +      CC:   clang
>> +      DPDK: dpdk
>> +      CLANG_ANALYZE: true
>> +    name: clang-analyze
>> +    runs-on: ubuntu-22.04
>> +    timeout-minutes: 30
>> +
>> +    steps:
>> +    - name: checkout
>> +      uses: actions/checkout@v3
>> +      with:
>> +        fetch-depth: 0
>> +
>> +    - name: get base branch sha
>> +      id: base_branch
>> +      env:
>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>> +        EVENT_BEFORE: ${{ github.event.before }}
>> +        FORCED_PUSH: ${{ github.event.forced }}
>> +      run: |
>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>> +        else
>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>> +             || [ "$FORCED_PUSH" = true ]; then
>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>> +          else
>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>> +          fi
>> +        fi
>> +
>> +    - name: checkout base branch
>> +      uses: actions/checkout@v3
>> +      with:
>> +        ref: ${{ steps.base_branch.outputs.sha }}
>> +        path: base_ovs_main
>> +
>> +    - name: update PATH
>> +      run: |
>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>> +
>> +    - name: generate cache key
>> +      id: cache_key
>> +      run: |
>> +        ver=$(clang -v 2>&1 | grep version | \

s/clang/${CC}/ ?

And if we grep for ' version ', the check will work for gcc as well,
even if we do not really care about this case. :)

>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>> +          >> $GITHUB_OUTPUT
>> +
>> +    - name: check for analyzer result cache
>> +      id: clang_cache
>> +      uses: actions/cache@v3
>> +      with:
>> +        path: base-clang-analyzer-results
>> +        key:  ${{ steps.cache_key.outputs.key }}
>> +
>> +    - name: set up python
>> +      uses: actions/setup-python@v4
>> +      with:
>> +        python-version: '3.9'
>> +
>> +    - name: get cached dpdk-dir
>> +      uses: actions/cache/restore@v3
>> +      with:
>> +        path: dpdk-dir
>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>> +
>> +    - name: update APT cache
>> +      run:  sudo apt update || true
>> +
>> +    - name: install common dependencies
>> +      run:  sudo apt install -y ${{ env.dependencies }}
>> +
>> +    - name: install sarif tools
>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools

'--user' to avoid sudo.  Might also be better to add this to the
existing command in the linux-prepare script instead to allow pip
to choose compatible versions of all the packages.

>> +
>> +    - name: prepare
>> +      run:  ./.ci/linux-prepare.sh
>> +
>> +    - name: build base reference
>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>> +      run:  ./.ci/linux-build.sh
>> +
>> +    - name: build
>> +      run:  ./.ci/linux-build.sh
>> +
>> +    - name: save cache
>> +      uses: actions/cache/save@v3
>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'

So, we will save the cache even if the build of the base reference failed?
Maybe move this before the 'build' stage and remove the 'always()' check?

>> +      with:
>> +        path: base-clang-analyzer-results
>> +        key:  ${{ steps.cache_key.outputs.key }}
>> +
>>    build-osx:
>>      env:
>>        CC:    clang
>>

Best regards, Ilya Maximets.
Eelco Chaudron Jan. 2, 2024, 3:19 p.m. UTC | #4
On 2 Jan 2024, at 13:43, Ilya Maximets wrote:

> On 12/20/23 16:07, Eelco Chaudron wrote:
>>
>>
>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>
>>> This patch identifies new static analysis issues during a GitHub action
>>> run and reports them. The process involves analyzing the changes introduced
>>> in the current commit and comparing them to those in the preceding commit.
>>>
>>> However, there are two cases when the GitHub push action runner does not
>>> provide enough details to determen the preceding commit. These cases are
>>> a new branch or a forced push. The strategy for these exceptions is to
>>> select the first commit not done by the current author as the base commit.
>>
>> Including Aarons comment on v3 into v4.
>>
>> +++ As an alternative we can look at the author (go back in time until
>> +++ we find a different author), so if you’re lucky and it’s one of your
>> +++ older patches you will be reminded again and again ;) Not sure if
>> +++ this is easaly possible with a git command, but it might be worth
>> +++ trying out.
>>
>> ++ I did some experiments with the above approach, and it seems to work
>> ++  in all the use cases. The only problem could be that if you are also
>> ++ the owner of a previous patch series, it might use a reference branch
>> ++ too deep. This could potentially show issues not introduced by your
>> ++ latest patches. But as you are still the author, you should be
>> ++ “punished” for your earlier mistakes ;)
>> ++
>> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
>>
>> + I'll take a look after Jan 1, but I guess we might be able to check
>> + against an upstream reference (for example, we can reference towards the
>> + 'upstream' repository - such a reference does require some kind of
>> + history being pulled) which we know should be stable and 'good'.  IE:
>> + the script could add the OVS github remote if it doesn't already exist.
>> + I think we could use that to find the common ancestor.  But I need to
>> + dive into this more.
>>
>> I guess we can determine the origin branch going back, and then do something like
>> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.
>
>
> The 'determine the origin branch' is still tricky and requires some
> guesswork in case of a new branch or a force push.  Maybe something
> like listing all the heads in upstream repo and finding the one that
> has the closest merge-base may be an option.  Then use that merge base
> as a base commit to compare against.
>
> The merge-base with upstream/master will be a branching point for a
> release branch, and that is way too far.

ACK, I think going back in time till a different author will solve all this for now. And this will even work in Ilya’s and my force commit workflow. So for now I’ll keep it as it.

>>
>> Both solutions will work, and I have no real preference.
>>
>> //Eelco
>>
>>> An example error output might look like this:
>>>
>>>   error level: +0 -0 no changes
>>>   warning level: +2 +0
>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>>     file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>>     file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>   note level: +0 -0 no changes
>>>   all levels: +2 +0
>>>
>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>> ---
>>>
>>> changes in v2:
>>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>>
>>> changes in v3:
>>>   - Include the clang version as part of the cache
>>>   - Change the way it looks for the 'default' branch so it will work
>>>     for patch branches.
>>>   - Also compare to the base branch for forced commits.
>>>
>>> changes in v4:
>>>   - No longer look for a default branch, but consume all patches
>>>     from the current author.
>>>
>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>>  2 files changed, 132 insertions(+)
>>>
>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>> index aa2ecc505..fedf1398a 100755
>>> --- a/.ci/linux-build.sh
>>> +++ b/.ci/linux-build.sh
>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>      make -j4
>>>  }
>>>
>>> +function clang_analyze()
>>> +{
>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>> +                                               || cache_build=true
>>> +    if [ "$cache_build" = true ]; then
>>> +        # If this is a cache build, proceed to the base branch's directory.
>>> +        cd base_ovs_main
>
> pushd/popd might be better here.

ACK will do this next revision.
>
>>> +    fi;
>>> +
>>> +    configure_ovs $OPTS
>>> +    make clean
>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>
> We have a make target for clang-analyze and it depends on the clean already.
> Maybe we can extend it to accept custom arguments via env variable and pass
> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".

Let me look into this, as I like the idea.

> Also, -j4 should be changed to ${JOBS} since that patch got merged.

ACK

> And we have CC defined in the job description, so should not use clang directly.

ACK

>>> +
>>> +    if [ "$cache_build" = true ]; then
>>> +        # Move results, so it will be picked up by the cache.
>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>> +        cd ..
>>> +    else
>>> +        # Only do the compare on the none cache builds.
>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>> +                                ./clang-analyzer-results
>>> +    fi;
>>> +}
>>> +
>>>  if [ "$DEB_PACKAGE" ]; then
>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>> @@ -116,6 +140,11 @@ fi
>>>
>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>
>>> +if [ "$CLANG_ANALYZE" ]; then
>>> +    clang_analyze
>>> +    exit 0
>>> +fi
>>> +
>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>      # 'distcheck' will reconfigure with required options.
>>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>>> index 09654205e..cb277ff43 100644
>>> --- a/.github/workflows/build-and-test.yml
>>> +++ b/.github/workflows/build-and-test.yml
>>> @@ -223,6 +223,109 @@ jobs:
>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>          path: logs.tgz
>>>
>>> +  build-clang-analyze:
>>> +    needs: build-dpdk
>>> +    env:
>>> +      dependencies: |
>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>> +        python3-unbound
>
> Do we need a python3-unbound?

Added it, as I removed the 32-bit conditional install, but can verify if it fails without.

>>> +      CC:   clang
>>> +      DPDK: dpdk
>>> +      CLANG_ANALYZE: true
>>> +    name: clang-analyze
>>> +    runs-on: ubuntu-22.04
>>> +    timeout-minutes: 30
>>> +
>>> +    steps:
>>> +    - name: checkout
>>> +      uses: actions/checkout@v3
>>> +      with:
>>> +        fetch-depth: 0
>>> +
>>> +    - name: get base branch sha
>>> +      id: base_branch
>>> +      env:
>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>> +      run: |
>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>> +        else
>>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>>> +             || [ "$FORCED_PUSH" = true ]; then
>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>> +          else
>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>> +          fi
>>> +        fi
>>> +
>>> +    - name: checkout base branch
>>> +      uses: actions/checkout@v3
>>> +      with:
>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>> +        path: base_ovs_main
>>> +
>>> +    - name: update PATH
>>> +      run: |
>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>> +
>>> +    - name: generate cache key
>>> +      id: cache_key
>>> +      run: |
>>> +        ver=$(clang -v 2>&1 | grep version | \
>
> s/clang/${CC}/ ?
>
> And if we grep for ' version ', the check will work for gcc as well,
> even if we do not really care about this case. :)

ACK, will make the change below as well.

>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>>> +          >> $GITHUB_OUTPUT
>>> +
>>> +    - name: check for analyzer result cache
>>> +      id: clang_cache
>>> +      uses: actions/cache@v3
>>> +      with:
>>> +        path: base-clang-analyzer-results
>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>> +
>>> +    - name: set up python
>>> +      uses: actions/setup-python@v4
>>> +      with:
>>> +        python-version: '3.9'
>>> +
>>> +    - name: get cached dpdk-dir
>>> +      uses: actions/cache/restore@v3
>>> +      with:
>>> +        path: dpdk-dir
>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>> +
>>> +    - name: update APT cache
>>> +      run:  sudo apt update || true
>>> +
>>> +    - name: install common dependencies
>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>> +
>>> +    - name: install sarif tools
>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>
> '--user' to avoid sudo.  Might also be better to add this to the
> existing command in the linux-prepare script instead to allow pip
> to choose compatible versions of all the packages.

Let me move it to see if this will work.

>>> +
>>> +    - name: prepare
>>> +      run:  ./.ci/linux-prepare.sh
>>> +
>>> +    - name: build base reference
>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>> +      run:  ./.ci/linux-build.sh
>>> +
>>> +    - name: build
>>> +      run:  ./.ci/linux-build.sh
>>> +
>>> +    - name: save cache
>>> +      uses: actions/cache/save@v3
>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>
> So, we will save the cache even if the build of the base reference failed?
> Maybe move this before the 'build' stage and remove the 'always()' check?

Good catch, will move it up.

>>> +      with:
>>> +        path: base-clang-analyzer-results
>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>> +
>>>    build-osx:
>>>      env:
>>>        CC:    clang
>>>
>
> Best regards, Ilya Maximets.

Thanks for the comments, will prep a v5.

//Eelco
Ilya Maximets Jan. 2, 2024, 3:55 p.m. UTC | #5
On 1/2/24 16:19, Eelco Chaudron wrote:
> 
> 
> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
> 
>> On 12/20/23 16:07, Eelco Chaudron wrote:
>>>
>>>
>>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>>
>>>> This patch identifies new static analysis issues during a GitHub action
>>>> run and reports them. The process involves analyzing the changes introduced
>>>> in the current commit and comparing them to those in the preceding commit.
>>>>
>>>> However, there are two cases when the GitHub push action runner does not
>>>> provide enough details to determen the preceding commit. These cases are
>>>> a new branch or a forced push. The strategy for these exceptions is to
>>>> select the first commit not done by the current author as the base commit.
>>>
>>> Including Aarons comment on v3 into v4.
>>>
>>> +++ As an alternative we can look at the author (go back in time until
>>> +++ we find a different author), so if you’re lucky and it’s one of your
>>> +++ older patches you will be reminded again and again ;) Not sure if
>>> +++ this is easaly possible with a git command, but it might be worth
>>> +++ trying out.
>>>
>>> ++ I did some experiments with the above approach, and it seems to work
>>> ++  in all the use cases. The only problem could be that if you are also
>>> ++ the owner of a previous patch series, it might use a reference branch
>>> ++ too deep. This could potentially show issues not introduced by your
>>> ++ latest patches. But as you are still the author, you should be
>>> ++ “punished” for your earlier mistakes ;)
>>> ++
>>> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
>>>
>>> + I'll take a look after Jan 1, but I guess we might be able to check
>>> + against an upstream reference (for example, we can reference towards the
>>> + 'upstream' repository - such a reference does require some kind of
>>> + history being pulled) which we know should be stable and 'good'.  IE:
>>> + the script could add the OVS github remote if it doesn't already exist.
>>> + I think we could use that to find the common ancestor.  But I need to
>>> + dive into this more.
>>>
>>> I guess we can determine the origin branch going back, and then do something like
>>> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.
>>
>>
>> The 'determine the origin branch' is still tricky and requires some
>> guesswork in case of a new branch or a force push.  Maybe something
>> like listing all the heads in upstream repo and finding the one that
>> has the closest merge-base may be an option.  Then use that merge base
>> as a base commit to compare against.
>>
>> The merge-base with upstream/master will be a branching point for a
>> release branch, and that is way too far.
> 
> ACK, I think going back in time till a different author will solve all this for now.
> And this will even work in Ilya’s and my force commit workflow. So for now I’ll keep
> it as it.

It will not work in case you're incorporating someone else's patches in
your patch set, unless you're pushing them one by one.  Robot will catch
such problems, but people may miss.  Same problem if you are accumulating
multiple different patches before batch-applying them.

> 
>>>
>>> Both solutions will work, and I have no real preference.
>>>
>>> //Eelco
>>>
>>>> An example error output might look like this:
>>>>
>>>>   error level: +0 -0 no changes
>>>>   warning level: +2 +0
>>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>>>     file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>>>     file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>>   note level: +0 -0 no changes
>>>>   all levels: +2 +0
>>>>
>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>> ---
>>>>
>>>> changes in v2:
>>>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>>>
>>>> changes in v3:
>>>>   - Include the clang version as part of the cache
>>>>   - Change the way it looks for the 'default' branch so it will work
>>>>     for patch branches.
>>>>   - Also compare to the base branch for forced commits.
>>>>
>>>> changes in v4:
>>>>   - No longer look for a default branch, but consume all patches
>>>>     from the current author.
>>>>
>>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>>>  2 files changed, 132 insertions(+)
>>>>
>>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>>> index aa2ecc505..fedf1398a 100755
>>>> --- a/.ci/linux-build.sh
>>>> +++ b/.ci/linux-build.sh
>>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>>      make -j4
>>>>  }
>>>>
>>>> +function clang_analyze()
>>>> +{
>>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>>> +                                               || cache_build=true
>>>> +    if [ "$cache_build" = true ]; then
>>>> +        # If this is a cache build, proceed to the base branch's directory.
>>>> +        cd base_ovs_main
>>
>> pushd/popd might be better here.
> 
> ACK will do this next revision.
>>
>>>> +    fi;
>>>> +
>>>> +    configure_ovs $OPTS
>>>> +    make clean
>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>>
>> We have a make target for clang-analyze and it depends on the clean already.
>> Maybe we can extend it to accept custom arguments via env variable and pass
>> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".
> 
> Let me look into this, as I like the idea.
> 
>> Also, -j4 should be changed to ${JOBS} since that patch got merged.
> 
> ACK
> 
>> And we have CC defined in the job description, so should not use clang directly.
> 
> ACK
> 
>>>> +
>>>> +    if [ "$cache_build" = true ]; then
>>>> +        # Move results, so it will be picked up by the cache.
>>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>>> +        cd ..
>>>> +    else
>>>> +        # Only do the compare on the none cache builds.
>>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>>> +                                ./clang-analyzer-results
>>>> +    fi;
>>>> +}
>>>> +
>>>>  if [ "$DEB_PACKAGE" ]; then
>>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>>> @@ -116,6 +140,11 @@ fi
>>>>
>>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>>
>>>> +if [ "$CLANG_ANALYZE" ]; then
>>>> +    clang_analyze
>>>> +    exit 0
>>>> +fi
>>>> +
>>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>>      # 'distcheck' will reconfigure with required options.
>>>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>>>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>>>> index 09654205e..cb277ff43 100644
>>>> --- a/.github/workflows/build-and-test.yml
>>>> +++ b/.github/workflows/build-and-test.yml
>>>> @@ -223,6 +223,109 @@ jobs:
>>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>>          path: logs.tgz
>>>>
>>>> +  build-clang-analyze:
>>>> +    needs: build-dpdk
>>>> +    env:
>>>> +      dependencies: |
>>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>>> +        python3-unbound
>>
>> Do we need a python3-unbound?
> 
> Added it, as I removed the 32-bit conditional install, but can verify if it fails without.
> 
>>>> +      CC:   clang
>>>> +      DPDK: dpdk
>>>> +      CLANG_ANALYZE: true
>>>> +    name: clang-analyze
>>>> +    runs-on: ubuntu-22.04
>>>> +    timeout-minutes: 30
>>>> +
>>>> +    steps:
>>>> +    - name: checkout
>>>> +      uses: actions/checkout@v3
>>>> +      with:
>>>> +        fetch-depth: 0
>>>> +
>>>> +    - name: get base branch sha
>>>> +      id: base_branch
>>>> +      env:
>>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>>> +      run: |
>>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>>> +        else
>>>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>>>> +             || [ "$FORCED_PUSH" = true ]; then
>>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>>> +          else
>>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>>> +          fi
>>>> +        fi
>>>> +
>>>> +    - name: checkout base branch
>>>> +      uses: actions/checkout@v3
>>>> +      with:
>>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>>> +        path: base_ovs_main
>>>> +
>>>> +    - name: update PATH
>>>> +      run: |
>>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>>> +
>>>> +    - name: generate cache key
>>>> +      id: cache_key
>>>> +      run: |
>>>> +        ver=$(clang -v 2>&1 | grep version | \
>>
>> s/clang/${CC}/ ?
>>
>> And if we grep for ' version ', the check will work for gcc as well,
>> even if we do not really care about this case. :)
> 
> ACK, will make the change below as well.
> 
>>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>>>> +          >> $GITHUB_OUTPUT
>>>> +
>>>> +    - name: check for analyzer result cache
>>>> +      id: clang_cache
>>>> +      uses: actions/cache@v3
>>>> +      with:
>>>> +        path: base-clang-analyzer-results
>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>> +
>>>> +    - name: set up python
>>>> +      uses: actions/setup-python@v4
>>>> +      with:
>>>> +        python-version: '3.9'
>>>> +
>>>> +    - name: get cached dpdk-dir
>>>> +      uses: actions/cache/restore@v3
>>>> +      with:
>>>> +        path: dpdk-dir
>>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>>> +
>>>> +    - name: update APT cache
>>>> +      run:  sudo apt update || true
>>>> +
>>>> +    - name: install common dependencies
>>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>>> +
>>>> +    - name: install sarif tools
>>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>>
>> '--user' to avoid sudo.  Might also be better to add this to the
>> existing command in the linux-prepare script instead to allow pip
>> to choose compatible versions of all the packages.
> 
> Let me move it to see if this will work.
> 
>>>> +
>>>> +    - name: prepare
>>>> +      run:  ./.ci/linux-prepare.sh
>>>> +
>>>> +    - name: build base reference
>>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>>> +      run:  ./.ci/linux-build.sh
>>>> +
>>>> +    - name: build
>>>> +      run:  ./.ci/linux-build.sh
>>>> +
>>>> +    - name: save cache
>>>> +      uses: actions/cache/save@v3
>>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>>
>> So, we will save the cache even if the build of the base reference failed?
>> Maybe move this before the 'build' stage and remove the 'always()' check?
> 
> Good catch, will move it up.
> 
>>>> +      with:
>>>> +        path: base-clang-analyzer-results
>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>> +
>>>>    build-osx:
>>>>      env:
>>>>        CC:    clang
>>>>
>>
>> Best regards, Ilya Maximets.
> 
> Thanks for the comments, will prep a v5.
> 
> //Eelco
>
Eelco Chaudron Jan. 2, 2024, 4:14 p.m. UTC | #6
On 2 Jan 2024, at 16:55, Ilya Maximets wrote:

> On 1/2/24 16:19, Eelco Chaudron wrote:
>>
>>
>> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
>>
>>> On 12/20/23 16:07, Eelco Chaudron wrote:
>>>>
>>>>
>>>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>>>
>>>>> This patch identifies new static analysis issues during a GitHub action
>>>>> run and reports them. The process involves analyzing the changes introduced
>>>>> in the current commit and comparing them to those in the preceding commit.
>>>>>
>>>>> However, there are two cases when the GitHub push action runner does not
>>>>> provide enough details to determen the preceding commit. These cases are
>>>>> a new branch or a forced push. The strategy for these exceptions is to
>>>>> select the first commit not done by the current author as the base commit.
>>>>
>>>> Including Aarons comment on v3 into v4.
>>>>
>>>> +++ As an alternative we can look at the author (go back in time until
>>>> +++ we find a different author), so if you’re lucky and it’s one of your
>>>> +++ older patches you will be reminded again and again ;) Not sure if
>>>> +++ this is easaly possible with a git command, but it might be worth
>>>> +++ trying out.
>>>>
>>>> ++ I did some experiments with the above approach, and it seems to work
>>>> ++  in all the use cases. The only problem could be that if you are also
>>>> ++ the owner of a previous patch series, it might use a reference branch
>>>> ++ too deep. This could potentially show issues not introduced by your
>>>> ++ latest patches. But as you are still the author, you should be
>>>> ++ “punished” for your earlier mistakes ;)
>>>> ++
>>>> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
>>>>
>>>> + I'll take a look after Jan 1, but I guess we might be able to check
>>>> + against an upstream reference (for example, we can reference towards the
>>>> + 'upstream' repository - such a reference does require some kind of
>>>> + history being pulled) which we know should be stable and 'good'.  IE:
>>>> + the script could add the OVS github remote if it doesn't already exist.
>>>> + I think we could use that to find the common ancestor.  But I need to
>>>> + dive into this more.
>>>>
>>>> I guess we can determine the origin branch going back, and then do something like
>>>> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.
>>>
>>>
>>> The 'determine the origin branch' is still tricky and requires some
>>> guesswork in case of a new branch or a force push.  Maybe something
>>> like listing all the heads in upstream repo and finding the one that
>>> has the closest merge-base may be an option.  Then use that merge base
>>> as a base commit to compare against.
>>>
>>> The merge-base with upstream/master will be a branching point for a
>>> release branch, and that is way too far.
>>
>> ACK, I think going back in time till a different author will solve all this for now.
>> And this will even work in Ilya’s and my force commit workflow. So for now I’ll keep
>> it as it.
>
> It will not work in case you're incorporating someone else's patches in
> your patch set, unless you're pushing them one by one.  Robot will catch
> such problems, but people may miss.  Same problem if you are accumulating
> multiple different patches before batch-applying them.

ACK, this is true. But with this patch in place, the robot should have captured this for you ;) And your actual push to the main branch will also fail. Guess the previously suggested patch -1 will have the same problem.

Don’t think there will be a ‘full’ solution that will work in all cases 100% of the time. This approach works for individual developers and the robot. For maintainers, we can push the new branch, and then the actual patches, although making changes in the middle might require deleting the remote branch, re-creating it, and re-apply. Tried this with stgit, and this seems easy.

//Eelco

>>>> Both solutions will work, and I have no real preference.
>>>>
>>>> //Eelco
>>>>
>>>>> An example error output might look like this:
>>>>>
>>>>>   error level: +0 -0 no changes
>>>>>   warning level: +2 +0
>>>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>>>>    file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>>>>    file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>>>   note level: +0 -0 no changes
>>>>>   all levels: +2 +0
>>>>>
>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>> ---
>>>>>
>>>>> changes in v2:
>>>>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>>>>
>>>>> changes in v3:
>>>>>   - Include the clang version as part of the cache
>>>>>   - Change the way it looks for the 'default' branch so it will work
>>>>>     for patch branches.
>>>>>   - Also compare to the base branch for forced commits.
>>>>>
>>>>> changes in v4:
>>>>>   - No longer look for a default branch, but consume all patches
>>>>>     from the current author.
>>>>>
>>>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>>>>  2 files changed, 132 insertions(+)
>>>>>
>>>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>>>> index aa2ecc505..fedf1398a 100755
>>>>> --- a/.ci/linux-build.sh
>>>>> +++ b/.ci/linux-build.sh
>>>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>>>      make -j4
>>>>>  }
>>>>>
>>>>> +function clang_analyze()
>>>>> +{
>>>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>>>> +                                               || cache_build=true
>>>>> +    if [ "$cache_build" = true ]; then
>>>>> +        # If this is a cache build, proceed to the base branch's directory.
>>>>> +        cd base_ovs_main
>>>
>>> pushd/popd might be better here.
>>
>> ACK will do this next revision.
>>>
>>>>> +    fi;
>>>>> +
>>>>> +    configure_ovs $OPTS
>>>>> +    make clean
>>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>>>
>>> We have a make target for clang-analyze and it depends on the clean already.
>>> Maybe we can extend it to accept custom arguments via env variable and pass
>>> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".
>>
>> Let me look into this, as I like the idea.
>>
>>> Also, -j4 should be changed to ${JOBS} since that patch got merged.
>>
>> ACK
>>
>>> And we have CC defined in the job description, so should not use clang directly.
>>
>> ACK
>>
>>>>> +
>>>>> +    if [ "$cache_build" = true ]; then
>>>>> +        # Move results, so it will be picked up by the cache.
>>>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>>>> +        cd ..
>>>>> +    else
>>>>> +        # Only do the compare on the none cache builds.
>>>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>>>> +                                ./clang-analyzer-results
>>>>> +    fi;
>>>>> +}
>>>>> +
>>>>>  if [ "$DEB_PACKAGE" ]; then
>>>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>>>> @@ -116,6 +140,11 @@ fi
>>>>>
>>>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>>>
>>>>> +if [ "$CLANG_ANALYZE" ]; then
>>>>> +    clang_analyze
>>>>> +    exit 0
>>>>> +fi
>>>>> +
>>>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>>>      # 'distcheck' will reconfigure with required options.
>>>>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>>>>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>>>>> index 09654205e..cb277ff43 100644
>>>>> --- a/.github/workflows/build-and-test.yml
>>>>> +++ b/.github/workflows/build-and-test.yml
>>>>> @@ -223,6 +223,109 @@ jobs:
>>>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>>>          path: logs.tgz
>>>>>
>>>>> +  build-clang-analyze:
>>>>> +    needs: build-dpdk
>>>>> +    env:
>>>>> +      dependencies: |
>>>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>>>> +        python3-unbound
>>>
>>> Do we need a python3-unbound?
>>
>> Added it, as I removed the 32-bit conditional install, but can verify if it fails without.
>>
>>>>> +      CC:   clang
>>>>> +      DPDK: dpdk
>>>>> +      CLANG_ANALYZE: true
>>>>> +    name: clang-analyze
>>>>> +    runs-on: ubuntu-22.04
>>>>> +    timeout-minutes: 30
>>>>> +
>>>>> +    steps:
>>>>> +    - name: checkout
>>>>> +      uses: actions/checkout@v3
>>>>> +      with:
>>>>> +        fetch-depth: 0
>>>>> +
>>>>> +    - name: get base branch sha
>>>>> +      id: base_branch
>>>>> +      env:
>>>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>>>> +      run: |
>>>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>>>> +        else
>>>>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>>>>> +             || [ "$FORCED_PUSH" = true ]; then
>>>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>>>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>>>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>>>> +          else
>>>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>>>> +          fi
>>>>> +        fi
>>>>> +
>>>>> +    - name: checkout base branch
>>>>> +      uses: actions/checkout@v3
>>>>> +      with:
>>>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>>>> +        path: base_ovs_main
>>>>> +
>>>>> +    - name: update PATH
>>>>> +      run: |
>>>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>>>> +
>>>>> +    - name: generate cache key
>>>>> +      id: cache_key
>>>>> +      run: |
>>>>> +        ver=$(clang -v 2>&1 | grep version | \
>>>
>>> s/clang/${CC}/ ?
>>>
>>> And if we grep for ' version ', the check will work for gcc as well,
>>> even if we do not really care about this case. :)
>>
>> ACK, will make the change below as well.
>>
>>>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>>>>> +          >> $GITHUB_OUTPUT
>>>>> +
>>>>> +    - name: check for analyzer result cache
>>>>> +      id: clang_cache
>>>>> +      uses: actions/cache@v3
>>>>> +      with:
>>>>> +        path: base-clang-analyzer-results
>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>> +
>>>>> +    - name: set up python
>>>>> +      uses: actions/setup-python@v4
>>>>> +      with:
>>>>> +        python-version: '3.9'
>>>>> +
>>>>> +    - name: get cached dpdk-dir
>>>>> +      uses: actions/cache/restore@v3
>>>>> +      with:
>>>>> +        path: dpdk-dir
>>>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>>>> +
>>>>> +    - name: update APT cache
>>>>> +      run:  sudo apt update || true
>>>>> +
>>>>> +    - name: install common dependencies
>>>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>>>> +
>>>>> +    - name: install sarif tools
>>>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>>>
>>> '--user' to avoid sudo.  Might also be better to add this to the
>>> existing command in the linux-prepare script instead to allow pip
>>> to choose compatible versions of all the packages.
>>
>> Let me move it to see if this will work.
>>
>>>>> +
>>>>> +    - name: prepare
>>>>> +      run:  ./.ci/linux-prepare.sh
>>>>> +
>>>>> +    - name: build base reference
>>>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>>>> +      run:  ./.ci/linux-build.sh
>>>>> +
>>>>> +    - name: build
>>>>> +      run:  ./.ci/linux-build.sh
>>>>> +
>>>>> +    - name: save cache
>>>>> +      uses: actions/cache/save@v3
>>>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>>>
>>> So, we will save the cache even if the build of the base reference failed?
>>> Maybe move this before the 'build' stage and remove the 'always()' check?
>>
>> Good catch, will move it up.
>>
>>>>> +      with:
>>>>> +        path: base-clang-analyzer-results
>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>> +
>>>>>    build-osx:
>>>>>      env:
>>>>>        CC:    clang
>>>>>
>>>
>>> Best regards, Ilya Maximets.
>>
>> Thanks for the comments, will prep a v5.
>>
>> //Eelco
>>
Ilya Maximets Jan. 2, 2024, 4:31 p.m. UTC | #7
On 1/2/24 17:14, Eelco Chaudron wrote:
> 
> 
> On 2 Jan 2024, at 16:55, Ilya Maximets wrote:
> 
>> On 1/2/24 16:19, Eelco Chaudron wrote:
>>>
>>>
>>> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
>>>
>>>> On 12/20/23 16:07, Eelco Chaudron wrote:
>>>>>
>>>>>
>>>>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>>>>
>>>>>> This patch identifies new static analysis issues during a GitHub action
>>>>>> run and reports them. The process involves analyzing the changes introduced
>>>>>> in the current commit and comparing them to those in the preceding commit.
>>>>>>
>>>>>> However, there are two cases when the GitHub push action runner does not
>>>>>> provide enough details to determen the preceding commit. These cases are
>>>>>> a new branch or a forced push. The strategy for these exceptions is to
>>>>>> select the first commit not done by the current author as the base commit.
>>>>>
>>>>> Including Aarons comment on v3 into v4.
>>>>>
>>>>> +++ As an alternative we can look at the author (go back in time until
>>>>> +++ we find a different author), so if you’re lucky and it’s one of your
>>>>> +++ older patches you will be reminded again and again ;) Not sure if
>>>>> +++ this is easaly possible with a git command, but it might be worth
>>>>> +++ trying out.
>>>>>
>>>>> ++ I did some experiments with the above approach, and it seems to work
>>>>> ++  in all the use cases. The only problem could be that if you are also
>>>>> ++ the owner of a previous patch series, it might use a reference branch
>>>>> ++ too deep. This could potentially show issues not introduced by your
>>>>> ++ latest patches. But as you are still the author, you should be
>>>>> ++ “punished” for your earlier mistakes ;)
>>>>> ++
>>>>> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
>>>>>
>>>>> + I'll take a look after Jan 1, but I guess we might be able to check
>>>>> + against an upstream reference (for example, we can reference towards the
>>>>> + 'upstream' repository - such a reference does require some kind of
>>>>> + history being pulled) which we know should be stable and 'good'.  IE:
>>>>> + the script could add the OVS github remote if it doesn't already exist.
>>>>> + I think we could use that to find the common ancestor.  But I need to
>>>>> + dive into this more.
>>>>>
>>>>> I guess we can determine the origin branch going back, and then do something like
>>>>> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.
>>>>
>>>>
>>>> The 'determine the origin branch' is still tricky and requires some
>>>> guesswork in case of a new branch or a force push.  Maybe something
>>>> like listing all the heads in upstream repo and finding the one that
>>>> has the closest merge-base may be an option.  Then use that merge base
>>>> as a base commit to compare against.
>>>>
>>>> The merge-base with upstream/master will be a branching point for a
>>>> release branch, and that is way too far.
>>>
>>> ACK, I think going back in time till a different author will solve all this for now.
>>> And this will even work in Ilya’s and my force commit workflow. So for now I’ll keep
>>> it as it.
>>
>> It will not work in case you're incorporating someone else's patches in
>> your patch set, unless you're pushing them one by one.  Robot will catch
>> such problems, but people may miss.  Same problem if you are accumulating
>> multiple different patches before batch-applying them.
> 
> ACK, this is true. But with this patch in place, the robot should have captured this for you ;)
> And your actual push to the main branch will also fail. Guess the previously suggested patch -1
> will have the same problem.
> 
> Don’t think there will be a ‘full’ solution that will work in all cases 100% of the time.

But why don't you like "closest merge-base against upstream heads" approach?
It seems less error prone to me.  At least I can't right away pinpoint a
simple scenario where it will do something unexpected, unless the branch is
completely diverged from anything upstream.

> This approach works for individual developers and the robot. For maintainers, we can push the
> new branch, and then the actual patches, although making changes in the middle might require
> deleting the remote branch, re-creating it, and re-apply. Tried this with stgit, and this seems easy.
> 
> //Eelco
> 
>>>>> Both solutions will work, and I have no real preference.
>>>>>
>>>>> //Eelco
>>>>>
>>>>>> An example error output might look like this:
>>>>>>
>>>>>>   error level: +0 -0 no changes
>>>>>>   warning level: +2 +0
>>>>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>>>>>    file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>>>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>>>>>    file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>>>>   note level: +0 -0 no changes
>>>>>>   all levels: +2 +0
>>>>>>
>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>> ---
>>>>>>
>>>>>> changes in v2:
>>>>>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>>>>>
>>>>>> changes in v3:
>>>>>>   - Include the clang version as part of the cache
>>>>>>   - Change the way it looks for the 'default' branch so it will work
>>>>>>     for patch branches.
>>>>>>   - Also compare to the base branch for forced commits.
>>>>>>
>>>>>> changes in v4:
>>>>>>   - No longer look for a default branch, but consume all patches
>>>>>>     from the current author.
>>>>>>
>>>>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>>>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>>>>>  2 files changed, 132 insertions(+)
>>>>>>
>>>>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>>>>> index aa2ecc505..fedf1398a 100755
>>>>>> --- a/.ci/linux-build.sh
>>>>>> +++ b/.ci/linux-build.sh
>>>>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>>>>      make -j4
>>>>>>  }
>>>>>>
>>>>>> +function clang_analyze()
>>>>>> +{
>>>>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>>>>> +                                               || cache_build=true
>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>> +        # If this is a cache build, proceed to the base branch's directory.
>>>>>> +        cd base_ovs_main
>>>>
>>>> pushd/popd might be better here.
>>>
>>> ACK will do this next revision.
>>>>
>>>>>> +    fi;
>>>>>> +
>>>>>> +    configure_ovs $OPTS
>>>>>> +    make clean
>>>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>>>>
>>>> We have a make target for clang-analyze and it depends on the clean already.
>>>> Maybe we can extend it to accept custom arguments via env variable and pass
>>>> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".
>>>
>>> Let me look into this, as I like the idea.
>>>
>>>> Also, -j4 should be changed to ${JOBS} since that patch got merged.
>>>
>>> ACK
>>>
>>>> And we have CC defined in the job description, so should not use clang directly.
>>>
>>> ACK
>>>
>>>>>> +
>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>> +        # Move results, so it will be picked up by the cache.
>>>>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>>>>> +        cd ..
>>>>>> +    else
>>>>>> +        # Only do the compare on the none cache builds.
>>>>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>>>>> +                                ./clang-analyzer-results
>>>>>> +    fi;
>>>>>> +}
>>>>>> +
>>>>>>  if [ "$DEB_PACKAGE" ]; then
>>>>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>>>>> @@ -116,6 +140,11 @@ fi
>>>>>>
>>>>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>>>>
>>>>>> +if [ "$CLANG_ANALYZE" ]; then
>>>>>> +    clang_analyze
>>>>>> +    exit 0
>>>>>> +fi
>>>>>> +
>>>>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>>>>      # 'distcheck' will reconfigure with required options.
>>>>>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>>>>>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>>>>>> index 09654205e..cb277ff43 100644
>>>>>> --- a/.github/workflows/build-and-test.yml
>>>>>> +++ b/.github/workflows/build-and-test.yml
>>>>>> @@ -223,6 +223,109 @@ jobs:
>>>>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>>>>          path: logs.tgz
>>>>>>
>>>>>> +  build-clang-analyze:
>>>>>> +    needs: build-dpdk
>>>>>> +    env:
>>>>>> +      dependencies: |
>>>>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>>>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>>>>> +        python3-unbound
>>>>
>>>> Do we need a python3-unbound?
>>>
>>> Added it, as I removed the 32-bit conditional install, but can verify if it fails without.
>>>
>>>>>> +      CC:   clang
>>>>>> +      DPDK: dpdk
>>>>>> +      CLANG_ANALYZE: true
>>>>>> +    name: clang-analyze
>>>>>> +    runs-on: ubuntu-22.04
>>>>>> +    timeout-minutes: 30
>>>>>> +
>>>>>> +    steps:
>>>>>> +    - name: checkout
>>>>>> +      uses: actions/checkout@v3
>>>>>> +      with:
>>>>>> +        fetch-depth: 0
>>>>>> +
>>>>>> +    - name: get base branch sha
>>>>>> +      id: base_branch
>>>>>> +      env:
>>>>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>>>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>>>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>>>>> +      run: |
>>>>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>>>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>>>>> +        else
>>>>>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>>>>>> +             || [ "$FORCED_PUSH" = true ]; then
>>>>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>>>>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>>>>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>>>>> +          else
>>>>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>>>>> +          fi
>>>>>> +        fi
>>>>>> +
>>>>>> +    - name: checkout base branch
>>>>>> +      uses: actions/checkout@v3
>>>>>> +      with:
>>>>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>>>>> +        path: base_ovs_main
>>>>>> +
>>>>>> +    - name: update PATH
>>>>>> +      run: |
>>>>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>>>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>>>>> +
>>>>>> +    - name: generate cache key
>>>>>> +      id: cache_key
>>>>>> +      run: |
>>>>>> +        ver=$(clang -v 2>&1 | grep version | \
>>>>
>>>> s/clang/${CC}/ ?
>>>>
>>>> And if we grep for ' version ', the check will work for gcc as well,
>>>> even if we do not really care about this case. :)
>>>
>>> ACK, will make the change below as well.
>>>
>>>>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>>>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>>>>>> +          >> $GITHUB_OUTPUT
>>>>>> +
>>>>>> +    - name: check for analyzer result cache
>>>>>> +      id: clang_cache
>>>>>> +      uses: actions/cache@v3
>>>>>> +      with:
>>>>>> +        path: base-clang-analyzer-results
>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>> +
>>>>>> +    - name: set up python
>>>>>> +      uses: actions/setup-python@v4
>>>>>> +      with:
>>>>>> +        python-version: '3.9'
>>>>>> +
>>>>>> +    - name: get cached dpdk-dir
>>>>>> +      uses: actions/cache/restore@v3
>>>>>> +      with:
>>>>>> +        path: dpdk-dir
>>>>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>>>>> +
>>>>>> +    - name: update APT cache
>>>>>> +      run:  sudo apt update || true
>>>>>> +
>>>>>> +    - name: install common dependencies
>>>>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>>>>> +
>>>>>> +    - name: install sarif tools
>>>>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>>>>
>>>> '--user' to avoid sudo.  Might also be better to add this to the
>>>> existing command in the linux-prepare script instead to allow pip
>>>> to choose compatible versions of all the packages.
>>>
>>> Let me move it to see if this will work.
>>>
>>>>>> +
>>>>>> +    - name: prepare
>>>>>> +      run:  ./.ci/linux-prepare.sh
>>>>>> +
>>>>>> +    - name: build base reference
>>>>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>> +
>>>>>> +    - name: build
>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>> +
>>>>>> +    - name: save cache
>>>>>> +      uses: actions/cache/save@v3
>>>>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>>>>
>>>> So, we will save the cache even if the build of the base reference failed?
>>>> Maybe move this before the 'build' stage and remove the 'always()' check?
>>>
>>> Good catch, will move it up.
>>>
>>>>>> +      with:
>>>>>> +        path: base-clang-analyzer-results
>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>> +
>>>>>>    build-osx:
>>>>>>      env:
>>>>>>        CC:    clang
>>>>>>
>>>>
>>>> Best regards, Ilya Maximets.
>>>
>>> Thanks for the comments, will prep a v5.
>>>
>>> //Eelco
>>>
>
Eelco Chaudron Jan. 2, 2024, 4:40 p.m. UTC | #8
On 2 Jan 2024, at 17:31, Ilya Maximets wrote:

> On 1/2/24 17:14, Eelco Chaudron wrote:
>>
>>
>> On 2 Jan 2024, at 16:55, Ilya Maximets wrote:
>>
>>> On 1/2/24 16:19, Eelco Chaudron wrote:
>>>>
>>>>
>>>> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
>>>>
>>>>> On 12/20/23 16:07, Eelco Chaudron wrote:
>>>>>>
>>>>>>
>>>>>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>>>>>
>>>>>>> This patch identifies new static analysis issues during a GitHub action
>>>>>>> run and reports them. The process involves analyzing the changes introduced
>>>>>>> in the current commit and comparing them to those in the preceding commit.
>>>>>>>
>>>>>>> However, there are two cases when the GitHub push action runner does not
>>>>>>> provide enough details to determen the preceding commit. These cases are
>>>>>>> a new branch or a forced push. The strategy for these exceptions is to
>>>>>>> select the first commit not done by the current author as the base commit.
>>>>>>
>>>>>> Including Aarons comment on v3 into v4.
>>>>>>
>>>>>> +++ As an alternative we can look at the author (go back in time until
>>>>>> +++ we find a different author), so if you’re lucky and it’s one of your
>>>>>> +++ older patches you will be reminded again and again ;) Not sure if
>>>>>> +++ this is easaly possible with a git command, but it might be worth
>>>>>> +++ trying out.
>>>>>>
>>>>>> ++ I did some experiments with the above approach, and it seems to work
>>>>>> ++  in all the use cases. The only problem could be that if you are also
>>>>>> ++ the owner of a previous patch series, it might use a reference branch
>>>>>> ++ too deep. This could potentially show issues not introduced by your
>>>>>> ++ latest patches. But as you are still the author, you should be
>>>>>> ++ “punished” for your earlier mistakes ;)
>>>>>> ++
>>>>>> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
>>>>>>
>>>>>> + I'll take a look after Jan 1, but I guess we might be able to check
>>>>>> + against an upstream reference (for example, we can reference towards the
>>>>>> + 'upstream' repository - such a reference does require some kind of
>>>>>> + history being pulled) which we know should be stable and 'good'.  IE:
>>>>>> + the script could add the OVS github remote if it doesn't already exist.
>>>>>> + I think we could use that to find the common ancestor.  But I need to
>>>>>> + dive into this more.
>>>>>>
>>>>>> I guess we can determine the origin branch going back, and then do something like
>>>>>> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.
>>>>>
>>>>>
>>>>> The 'determine the origin branch' is still tricky and requires some
>>>>> guesswork in case of a new branch or a force push.  Maybe something
>>>>> like listing all the heads in upstream repo and finding the one that
>>>>> has the closest merge-base may be an option.  Then use that merge base
>>>>> as a base commit to compare against.
>>>>>
>>>>> The merge-base with upstream/master will be a branching point for a
>>>>> release branch, and that is way too far.
>>>>
>>>> ACK, I think going back in time till a different author will solve all this for now.
>>>> And this will even work in Ilya’s and my force commit workflow. So for now I’ll keep
>>>> it as it.
>>>
>>> It will not work in case you're incorporating someone else's patches in
>>> your patch set, unless you're pushing them one by one.  Robot will catch
>>> such problems, but people may miss.  Same problem if you are accumulating
>>> multiple different patches before batch-applying them.
>>
>> ACK, this is true. But with this patch in place, the robot should have captured this for you ;)
>> And your actual push to the main branch will also fail. Guess the previously suggested patch -1
>> will have the same problem.
>>
>> Don’t think there will be a ‘full’ solution that will work in all cases 100% of the time.
>
> But why don't you like "closest merge-base against upstream heads" approach?
> It seems less error prone to me.  At least I can't right away pinpoint a
> simple scenario where it will do something unexpected, unless the branch is
> completely diverged from anything upstream.

The issue only exists for new branches and forced commits. So how would we determine the branch to use for the ‘git merge-base upstream/<BRANCH> HEAD’? Check if there is a branch-XX in the timeline use it, and if not assume master/main? Or is there a smarter way?

I’m fine with this too, I can do some experimentation…

>> This approach works for individual developers and the robot. For maintainers, we can push the
>> new branch, and then the actual patches, although making changes in the middle might require
>> deleting the remote branch, re-creating it, and re-apply. Tried this with stgit, and this seems easy.
>>
>> //Eelco
>>
>>>>>> Both solutions will work, and I have no real preference.
>>>>>>
>>>>>> //Eelco
>>>>>>
>>>>>>> An example error output might look like this:
>>>>>>>
>>>>>>>   error level: +0 -0 no changes
>>>>>>>   warning level: +2 +0
>>>>>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>>>>>>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>>>>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>>>>>>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>>>>>   note level: +0 -0 no changes
>>>>>>>   all levels: +2 +0
>>>>>>>
>>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>>> ---
>>>>>>>
>>>>>>> changes in v2:
>>>>>>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>>>>>>
>>>>>>> changes in v3:
>>>>>>>   - Include the clang version as part of the cache
>>>>>>>   - Change the way it looks for the 'default' branch so it will work
>>>>>>>     for patch branches.
>>>>>>>   - Also compare to the base branch for forced commits.
>>>>>>>
>>>>>>> changes in v4:
>>>>>>>   - No longer look for a default branch, but consume all patches
>>>>>>>     from the current author.
>>>>>>>
>>>>>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>>>>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>>>>>>  2 files changed, 132 insertions(+)
>>>>>>>
>>>>>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>>>>>> index aa2ecc505..fedf1398a 100755
>>>>>>> --- a/.ci/linux-build.sh
>>>>>>> +++ b/.ci/linux-build.sh
>>>>>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>>>>>      make -j4
>>>>>>>  }
>>>>>>>
>>>>>>> +function clang_analyze()
>>>>>>> +{
>>>>>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>>>>>> +                                               || cache_build=true
>>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>>> +        # If this is a cache build, proceed to the base branch's directory.
>>>>>>> +        cd base_ovs_main
>>>>>
>>>>> pushd/popd might be better here.
>>>>
>>>> ACK will do this next revision.
>>>>>
>>>>>>> +    fi;
>>>>>>> +
>>>>>>> +    configure_ovs $OPTS
>>>>>>> +    make clean
>>>>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>>>>>
>>>>> We have a make target for clang-analyze and it depends on the clean already.
>>>>> Maybe we can extend it to accept custom arguments via env variable and pass
>>>>> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".
>>>>
>>>> Let me look into this, as I like the idea.
>>>>
>>>>> Also, -j4 should be changed to ${JOBS} since that patch got merged.
>>>>
>>>> ACK
>>>>
>>>>> And we have CC defined in the job description, so should not use clang directly.
>>>>
>>>> ACK
>>>>
>>>>>>> +
>>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>>> +        # Move results, so it will be picked up by the cache.
>>>>>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>>>>>> +        cd ..
>>>>>>> +    else
>>>>>>> +        # Only do the compare on the none cache builds.
>>>>>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>>>>>> +                                ./clang-analyzer-results
>>>>>>> +    fi;
>>>>>>> +}
>>>>>>> +
>>>>>>>  if [ "$DEB_PACKAGE" ]; then
>>>>>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>>>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>>>>>> @@ -116,6 +140,11 @@ fi
>>>>>>>
>>>>>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>>>>>
>>>>>>> +if [ "$CLANG_ANALYZE" ]; then
>>>>>>> +    clang_analyze
>>>>>>> +    exit 0
>>>>>>> +fi
>>>>>>> +
>>>>>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>>>>>      # 'distcheck' will reconfigure with required options.
>>>>>>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>>>>>>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>>>>>>> index 09654205e..cb277ff43 100644
>>>>>>> --- a/.github/workflows/build-and-test.yml
>>>>>>> +++ b/.github/workflows/build-and-test.yml
>>>>>>> @@ -223,6 +223,109 @@ jobs:
>>>>>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>>>>>          path: logs.tgz
>>>>>>>
>>>>>>> +  build-clang-analyze:
>>>>>>> +    needs: build-dpdk
>>>>>>> +    env:
>>>>>>> +      dependencies: |
>>>>>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>>>>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>>>>>> +        python3-unbound
>>>>>
>>>>> Do we need a python3-unbound?
>>>>
>>>> Added it, as I removed the 32-bit conditional install, but can verify if it fails without.
>>>>
>>>>>>> +      CC:   clang
>>>>>>> +      DPDK: dpdk
>>>>>>> +      CLANG_ANALYZE: true
>>>>>>> +    name: clang-analyze
>>>>>>> +    runs-on: ubuntu-22.04
>>>>>>> +    timeout-minutes: 30
>>>>>>> +
>>>>>>> +    steps:
>>>>>>> +    - name: checkout
>>>>>>> +      uses: actions/checkout@v3
>>>>>>> +      with:
>>>>>>> +        fetch-depth: 0
>>>>>>> +
>>>>>>> +    - name: get base branch sha
>>>>>>> +      id: base_branch
>>>>>>> +      env:
>>>>>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>>>>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>>>>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>>>>>> +      run: |
>>>>>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>>>>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>>>>>> +        else
>>>>>>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>>>>>>> +             || [ "$FORCED_PUSH" = true ]; then
>>>>>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>>>>>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>>>>>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>>>>>> +          else
>>>>>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>>>>>> +          fi
>>>>>>> +        fi
>>>>>>> +
>>>>>>> +    - name: checkout base branch
>>>>>>> +      uses: actions/checkout@v3
>>>>>>> +      with:
>>>>>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>>>>>> +        path: base_ovs_main
>>>>>>> +
>>>>>>> +    - name: update PATH
>>>>>>> +      run: |
>>>>>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>>>>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>>>>>> +
>>>>>>> +    - name: generate cache key
>>>>>>> +      id: cache_key
>>>>>>> +      run: |
>>>>>>> +        ver=$(clang -v 2>&1 | grep version | \
>>>>>
>>>>> s/clang/${CC}/ ?
>>>>>
>>>>> And if we grep for ' version ', the check will work for gcc as well,
>>>>> even if we do not really care about this case. :)
>>>>
>>>> ACK, will make the change below as well.
>>>>
>>>>>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>>>>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>>>>>>> +          >> $GITHUB_OUTPUT
>>>>>>> +
>>>>>>> +    - name: check for analyzer result cache
>>>>>>> +      id: clang_cache
>>>>>>> +      uses: actions/cache@v3
>>>>>>> +      with:
>>>>>>> +        path: base-clang-analyzer-results
>>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>>> +
>>>>>>> +    - name: set up python
>>>>>>> +      uses: actions/setup-python@v4
>>>>>>> +      with:
>>>>>>> +        python-version: '3.9'
>>>>>>> +
>>>>>>> +    - name: get cached dpdk-dir
>>>>>>> +      uses: actions/cache/restore@v3
>>>>>>> +      with:
>>>>>>> +        path: dpdk-dir
>>>>>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>>>>>> +
>>>>>>> +    - name: update APT cache
>>>>>>> +      run:  sudo apt update || true
>>>>>>> +
>>>>>>> +    - name: install common dependencies
>>>>>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>>>>>> +
>>>>>>> +    - name: install sarif tools
>>>>>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>>>>>
>>>>> '--user' to avoid sudo.  Might also be better to add this to the
>>>>> existing command in the linux-prepare script instead to allow pip
>>>>> to choose compatible versions of all the packages.
>>>>
>>>> Let me move it to see if this will work.
>>>>
>>>>>>> +
>>>>>>> +    - name: prepare
>>>>>>> +      run:  ./.ci/linux-prepare.sh
>>>>>>> +
>>>>>>> +    - name: build base reference
>>>>>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>>> +
>>>>>>> +    - name: build
>>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>>> +
>>>>>>> +    - name: save cache
>>>>>>> +      uses: actions/cache/save@v3
>>>>>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>>>>>
>>>>> So, we will save the cache even if the build of the base reference failed?
>>>>> Maybe move this before the 'build' stage and remove the 'always()' check?
>>>>
>>>> Good catch, will move it up.
>>>>
>>>>>>> +      with:
>>>>>>> +        path: base-clang-analyzer-results
>>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>>> +
>>>>>>>    build-osx:
>>>>>>>      env:
>>>>>>>        CC:    clang
>>>>>>>
>>>>>
>>>>> Best regards, Ilya Maximets.
>>>>
>>>> Thanks for the comments, will prep a v5.
>>>>
>>>> //Eelco
>>>>
>>
Ilya Maximets Jan. 2, 2024, 5:33 p.m. UTC | #9
On 1/2/24 17:40, Eelco Chaudron wrote:
> 
> 
> On 2 Jan 2024, at 17:31, Ilya Maximets wrote:
> 
>> On 1/2/24 17:14, Eelco Chaudron wrote:
>>>
>>>
>>> On 2 Jan 2024, at 16:55, Ilya Maximets wrote:
>>>
>>>> On 1/2/24 16:19, Eelco Chaudron wrote:
>>>>>
>>>>>
>>>>> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
>>>>>
>>>>>> On 12/20/23 16:07, Eelco Chaudron wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 20 Dec 2023, at 11:19, Eelco Chaudron wrote:
>>>>>>>
>>>>>>>> This patch identifies new static analysis issues during a GitHub action
>>>>>>>> run and reports them. The process involves analyzing the changes introduced
>>>>>>>> in the current commit and comparing them to those in the preceding commit.
>>>>>>>>
>>>>>>>> However, there are two cases when the GitHub push action runner does not
>>>>>>>> provide enough details to determen the preceding commit. These cases are
>>>>>>>> a new branch or a forced push. The strategy for these exceptions is to
>>>>>>>> select the first commit not done by the current author as the base commit.
>>>>>>>
>>>>>>> Including Aarons comment on v3 into v4.
>>>>>>>
>>>>>>> +++ As an alternative we can look at the author (go back in time until
>>>>>>> +++ we find a different author), so if you’re lucky and it’s one of your
>>>>>>> +++ older patches you will be reminded again and again ;) Not sure if
>>>>>>> +++ this is easaly possible with a git command, but it might be worth
>>>>>>> +++ trying out.
>>>>>>>
>>>>>>> ++ I did some experiments with the above approach, and it seems to work
>>>>>>> ++  in all the use cases. The only problem could be that if you are also
>>>>>>> ++ the owner of a previous patch series, it might use a reference branch
>>>>>>> ++ too deep. This could potentially show issues not introduced by your
>>>>>>> ++ latest patches. But as you are still the author, you should be
>>>>>>> ++ “punished” for your earlier mistakes ;)
>>>>>>> ++
>>>>>>> ++ I’ll send out a v4, and we can continue the discussion there if needed :)
>>>>>>>
>>>>>>> + I'll take a look after Jan 1, but I guess we might be able to check
>>>>>>> + against an upstream reference (for example, we can reference towards the
>>>>>>> + 'upstream' repository - such a reference does require some kind of
>>>>>>> + history being pulled) which we know should be stable and 'good'.  IE:
>>>>>>> + the script could add the OVS github remote if it doesn't already exist.
>>>>>>> + I think we could use that to find the common ancestor.  But I need to
>>>>>>> + dive into this more.
>>>>>>>
>>>>>>> I guess we can determine the origin branch going back, and then do something like
>>>>>>> `git merge-base upstream/master HEAD` assuming we fetched the upstream repository.
>>>>>>
>>>>>>
>>>>>> The 'determine the origin branch' is still tricky and requires some
>>>>>> guesswork in case of a new branch or a force push.  Maybe something
>>>>>> like listing all the heads in upstream repo and finding the one that
>>>>>> has the closest merge-base may be an option.  Then use that merge base
>>>>>> as a base commit to compare against.
>>>>>>
>>>>>> The merge-base with upstream/master will be a branching point for a
>>>>>> release branch, and that is way too far.
>>>>>
>>>>> ACK, I think going back in time till a different author will solve all this for now.
>>>>> And this will even work in Ilya’s and my force commit workflow. So for now I’ll keep
>>>>> it as it.
>>>>
>>>> It will not work in case you're incorporating someone else's patches in
>>>> your patch set, unless you're pushing them one by one.  Robot will catch
>>>> such problems, but people may miss.  Same problem if you are accumulating
>>>> multiple different patches before batch-applying them.
>>>
>>> ACK, this is true. But with this patch in place, the robot should have captured this for you ;)
>>> And your actual push to the main branch will also fail. Guess the previously suggested patch -1
>>> will have the same problem.
>>>
>>> Don’t think there will be a ‘full’ solution that will work in all cases 100% of the time.
>>
>> But why don't you like "closest merge-base against upstream heads" approach?
>> It seems less error prone to me.  At least I can't right away pinpoint a
>> simple scenario where it will do something unexpected, unless the branch is
>> completely diverged from anything upstream.
> 
> The issue only exists for new branches and forced commits. So how would we determine the branch
> to use for the ‘git merge-base upstream/<BRANCH> HEAD’? Check if there is a branch-XX in the
> timeline use it, and if not assume master/main? Or is there a smarter way?

Not smarter, but a bit dumber instead. :)  I was thinking about something
like this:

 BASE=HEAD~1 # In case no mergeable heads found.
 MIN_DISTANCE=1000000

 for upstream_head in $(git ls-remote --heads upstream | cut -f 1); do
     CURR_BASE=$(git merge-base ${upstream_head} HEAD)
     if [ ${CURR_BASE} ]; then
       DISTANCE=$(git log --oneline ${CURR_BASE}..HEAD | wc -l);
       echo $(git name-rev --refs='remotes/upstream/*' ${CURR_BASE}) ${DISTANCE}

       if test ${MIN_DISTANCE} -gt ${DISTANCE}; then
           BASE=${CURR_BASE}
           MIN_DISTANCE=${DISTANCE}
       fi
     fi
 done
 echo $(git name-rev --refs='remotes/upstream/*' ${BASE}) ${MIN_DISTANCE}

> 
> I’m fine with this too, I can do some experimentation…
> 
>>> This approach works for individual developers and the robot. For maintainers, we can push the
>>> new branch, and then the actual patches, although making changes in the middle might require
>>> deleting the remote branch, re-creating it, and re-apply. Tried this with stgit, and this seems easy.
>>>
>>> //Eelco
>>>
>>>>>>> Both solutions will work, and I have no real preference.
>>>>>>>
>>>>>>> //Eelco
>>>>>>>
>>>>>>>> An example error output might look like this:
>>>>>>>>
>>>>>>>>   error level: +0 -0 no changes
>>>>>>>>   warning level: +2 +0
>>>>>>>>     New issue "deadcode.DeadStores Value stored to 'remote' is never read" (1 occurrence)
>>>>>>>>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:86
>>>>>>>>     New issue "unix.Malloc Potential leak of memory pointed to by 'remote'" (1 occurrence)
>>>>>>>>   file:///home/runner/work/ovs/ovs/vswitchd/ovs-vswitchd.c:95
>>>>>>>>   note level: +0 -0 no changes
>>>>>>>>   all levels: +2 +0
>>>>>>>>
>>>>>>>> Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
>>>>>>>> ---
>>>>>>>>
>>>>>>>> changes in v2:
>>>>>>>>   - When it's a new branch, it compares it to the HEAD of the default branch.
>>>>>>>>
>>>>>>>> changes in v3:
>>>>>>>>   - Include the clang version as part of the cache
>>>>>>>>   - Change the way it looks for the 'default' branch so it will work
>>>>>>>>     for patch branches.
>>>>>>>>   - Also compare to the base branch for forced commits.
>>>>>>>>
>>>>>>>> changes in v4:
>>>>>>>>   - No longer look for a default branch, but consume all patches
>>>>>>>>     from the current author.
>>>>>>>>
>>>>>>>>  .ci/linux-build.sh                   |   29 ++++++++++
>>>>>>>>  .github/workflows/build-and-test.yml |  103 ++++++++++++++++++++++++++++++++++
>>>>>>>>  2 files changed, 132 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
>>>>>>>> index aa2ecc505..fedf1398a 100755
>>>>>>>> --- a/.ci/linux-build.sh
>>>>>>>> +++ b/.ci/linux-build.sh
>>>>>>>> @@ -49,6 +49,30 @@ function build_ovs()
>>>>>>>>      make -j4
>>>>>>>>  }
>>>>>>>>
>>>>>>>> +function clang_analyze()
>>>>>>>> +{
>>>>>>>> +    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
>>>>>>>> +                                               || cache_build=true
>>>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>>>> +        # If this is a cache build, proceed to the base branch's directory.
>>>>>>>> +        cd base_ovs_main
>>>>>>
>>>>>> pushd/popd might be better here.
>>>>>
>>>>> ACK will do this next revision.
>>>>>>
>>>>>>>> +    fi;
>>>>>>>> +
>>>>>>>> +    configure_ovs $OPTS
>>>>>>>> +    make clean
>>>>>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>>>>>>
>>>>>> We have a make target for clang-analyze and it depends on the clean already.
>>>>>> Maybe we can extend it to accept custom arguments via env variable and pass
>>>>>> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".
>>>>>
>>>>> Let me look into this, as I like the idea.
>>>>>
>>>>>> Also, -j4 should be changed to ${JOBS} since that patch got merged.
>>>>>
>>>>> ACK
>>>>>
>>>>>> And we have CC defined in the job description, so should not use clang directly.
>>>>>
>>>>> ACK
>>>>>
>>>>>>>> +
>>>>>>>> +    if [ "$cache_build" = true ]; then
>>>>>>>> +        # Move results, so it will be picked up by the cache.
>>>>>>>> +        mv ./clang-analyzer-results ../base-clang-analyzer-results
>>>>>>>> +        cd ..
>>>>>>>> +    else
>>>>>>>> +        # Only do the compare on the none cache builds.
>>>>>>>> +        sarif --check note diff ./base-clang-analyzer-results \
>>>>>>>> +                                ./clang-analyzer-results
>>>>>>>> +    fi;
>>>>>>>> +}
>>>>>>>> +
>>>>>>>>  if [ "$DEB_PACKAGE" ]; then
>>>>>>>>      ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
>>>>>>>>      mk-build-deps --install --root-cmd sudo --remove debian/control
>>>>>>>> @@ -116,6 +140,11 @@ fi
>>>>>>>>
>>>>>>>>  OPTS="${EXTRA_OPTS} ${OPTS} $*"
>>>>>>>>
>>>>>>>> +if [ "$CLANG_ANALYZE" ]; then
>>>>>>>> +    clang_analyze
>>>>>>>> +    exit 0
>>>>>>>> +fi
>>>>>>>> +
>>>>>>>>  if [ "$TESTSUITE" = 'test' ]; then
>>>>>>>>      # 'distcheck' will reconfigure with required options.
>>>>>>>>      # Now we only need to prepare the Makefile without sparse-wrapped CC.
>>>>>>>> diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
>>>>>>>> index 09654205e..cb277ff43 100644
>>>>>>>> --- a/.github/workflows/build-and-test.yml
>>>>>>>> +++ b/.github/workflows/build-and-test.yml
>>>>>>>> @@ -223,6 +223,109 @@ jobs:
>>>>>>>>          name: logs-linux-${{ join(matrix.*, '-') }}
>>>>>>>>          path: logs.tgz
>>>>>>>>
>>>>>>>> +  build-clang-analyze:
>>>>>>>> +    needs: build-dpdk
>>>>>>>> +    env:
>>>>>>>> +      dependencies: |
>>>>>>>> +        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
>>>>>>>> +        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
>>>>>>>> +        python3-unbound
>>>>>>
>>>>>> Do we need a python3-unbound?
>>>>>
>>>>> Added it, as I removed the 32-bit conditional install, but can verify if it fails without.
>>>>>
>>>>>>>> +      CC:   clang
>>>>>>>> +      DPDK: dpdk
>>>>>>>> +      CLANG_ANALYZE: true
>>>>>>>> +    name: clang-analyze
>>>>>>>> +    runs-on: ubuntu-22.04
>>>>>>>> +    timeout-minutes: 30
>>>>>>>> +
>>>>>>>> +    steps:
>>>>>>>> +    - name: checkout
>>>>>>>> +      uses: actions/checkout@v3
>>>>>>>> +      with:
>>>>>>>> +        fetch-depth: 0
>>>>>>>> +
>>>>>>>> +    - name: get base branch sha
>>>>>>>> +      id: base_branch
>>>>>>>> +      env:
>>>>>>>> +        BASE_SHA: ${{ github.event.pull_request.base.sha }}
>>>>>>>> +        EVENT_BEFORE: ${{ github.event.before }}
>>>>>>>> +        FORCED_PUSH: ${{ github.event.forced }}
>>>>>>>> +      run: |
>>>>>>>> +        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
>>>>>>>> +          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
>>>>>>>> +        else
>>>>>>>> +          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
>>>>>>>> +             || [ "$FORCED_PUSH" = true ]; then
>>>>>>>> +            echo "sha=$(git log --pretty=format:"%H %ae" | \
>>>>>>>> +                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
>>>>>>>> +                        awk '{print $1}')" >> $GITHUB_OUTPUT
>>>>>>>> +          else
>>>>>>>> +            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
>>>>>>>> +          fi
>>>>>>>> +        fi
>>>>>>>> +
>>>>>>>> +    - name: checkout base branch
>>>>>>>> +      uses: actions/checkout@v3
>>>>>>>> +      with:
>>>>>>>> +        ref: ${{ steps.base_branch.outputs.sha }}
>>>>>>>> +        path: base_ovs_main
>>>>>>>> +
>>>>>>>> +    - name: update PATH
>>>>>>>> +      run: |
>>>>>>>> +        echo "$HOME/bin"        >> $GITHUB_PATH
>>>>>>>> +        echo "$HOME/.local/bin" >> $GITHUB_PATH
>>>>>>>> +
>>>>>>>> +    - name: generate cache key
>>>>>>>> +      id: cache_key
>>>>>>>> +      run: |
>>>>>>>> +        ver=$(clang -v 2>&1 | grep version | \
>>>>>>
>>>>>> s/clang/${CC}/ ?
>>>>>>
>>>>>> And if we grep for ' version ', the check will work for gcc as well,
>>>>>> even if we do not really care about this case. :)
>>>>>
>>>>> ACK, will make the change below as well.
>>>>>
>>>>>>>> +              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
>>>>>>>> +        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
>>>>>>>> +          >> $GITHUB_OUTPUT
>>>>>>>> +
>>>>>>>> +    - name: check for analyzer result cache
>>>>>>>> +      id: clang_cache
>>>>>>>> +      uses: actions/cache@v3
>>>>>>>> +      with:
>>>>>>>> +        path: base-clang-analyzer-results
>>>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>>>> +
>>>>>>>> +    - name: set up python
>>>>>>>> +      uses: actions/setup-python@v4
>>>>>>>> +      with:
>>>>>>>> +        python-version: '3.9'
>>>>>>>> +
>>>>>>>> +    - name: get cached dpdk-dir
>>>>>>>> +      uses: actions/cache/restore@v3
>>>>>>>> +      with:
>>>>>>>> +        path: dpdk-dir
>>>>>>>> +        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
>>>>>>>> +
>>>>>>>> +    - name: update APT cache
>>>>>>>> +      run:  sudo apt update || true
>>>>>>>> +
>>>>>>>> +    - name: install common dependencies
>>>>>>>> +      run:  sudo apt install -y ${{ env.dependencies }}
>>>>>>>> +
>>>>>>>> +    - name: install sarif tools
>>>>>>>> +      run: sudo pip3 install --disable-pip-version-check sarif-tools
>>>>>>
>>>>>> '--user' to avoid sudo.  Might also be better to add this to the
>>>>>> existing command in the linux-prepare script instead to allow pip
>>>>>> to choose compatible versions of all the packages.
>>>>>
>>>>> Let me move it to see if this will work.
>>>>>
>>>>>>>> +
>>>>>>>> +    - name: prepare
>>>>>>>> +      run:  ./.ci/linux-prepare.sh
>>>>>>>> +
>>>>>>>> +    - name: build base reference
>>>>>>>> +      if: steps.clang_cache.outputs.cache-hit != 'true'
>>>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>>>> +
>>>>>>>> +    - name: build
>>>>>>>> +      run:  ./.ci/linux-build.sh
>>>>>>>> +
>>>>>>>> +    - name: save cache
>>>>>>>> +      uses: actions/cache/save@v3
>>>>>>>> +      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
>>>>>>
>>>>>> So, we will save the cache even if the build of the base reference failed?
>>>>>> Maybe move this before the 'build' stage and remove the 'always()' check?
>>>>>
>>>>> Good catch, will move it up.
>>>>>
>>>>>>>> +      with:
>>>>>>>> +        path: base-clang-analyzer-results
>>>>>>>> +        key:  ${{ steps.cache_key.outputs.key }}
>>>>>>>> +
>>>>>>>>    build-osx:
>>>>>>>>      env:
>>>>>>>>        CC:    clang
>>>>>>>>
>>>>>>
>>>>>> Best regards, Ilya Maximets.
>>>>>
>>>>> Thanks for the comments, will prep a v5.
>>>>>
>>>>> //Eelco
>>>>>
>>>
>
Eelco Chaudron Jan. 10, 2024, 10:36 p.m. UTC | #10
On 2 Jan 2024, at 16:19, Eelco Chaudron wrote:

> On 2 Jan 2024, at 13:43, Ilya Maximets wrote:
>
>

<SNIP>

>>>> +    fi;
>>>> +
>>>> +    configure_ovs $OPTS
>>>> +    make clean
>>>> +    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
>>
>> We have a make target for clang-analyze and it depends on the clean already.
>> Maybe we can extend it to accept custom arguments via env variable and pass
>> the '-sarif' this way?  E.g. make clang-analyze SCAN_BUILD_OPTIONS="-sarif".
>
> Let me look into this, as I like the idea.
>

I implemented this, but this resulted in a chicken and the egg problem. The reference build does not have this (for this commit) and it will fail the run. So will re-submit it as is.

//Eelco

<SNIP>
diff mbox series

Patch

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index aa2ecc505..fedf1398a 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -49,6 +49,30 @@  function build_ovs()
     make -j4
 }
 
+function clang_analyze()
+{
+    [ -d "./base-clang-analyzer-results" ] && cache_build=false \
+                                               || cache_build=true
+    if [ "$cache_build" = true ]; then
+        # If this is a cache build, proceed to the base branch's directory.
+        cd base_ovs_main
+    fi;
+
+    configure_ovs $OPTS
+    make clean
+    scan-build -o ./clang-analyzer-results -sarif --use-cc=clang make -j4
+
+    if [ "$cache_build" = true ]; then
+        # Move results, so it will be picked up by the cache.
+        mv ./clang-analyzer-results ../base-clang-analyzer-results
+        cd ..
+    else
+        # Only do the compare on the none cache builds.
+        sarif --check note diff ./base-clang-analyzer-results \
+                                ./clang-analyzer-results
+    fi;
+}
+
 if [ "$DEB_PACKAGE" ]; then
     ./boot.sh && ./configure --with-dpdk=$DPDK && make debian
     mk-build-deps --install --root-cmd sudo --remove debian/control
@@ -116,6 +140,11 @@  fi
 
 OPTS="${EXTRA_OPTS} ${OPTS} $*"
 
+if [ "$CLANG_ANALYZE" ]; then
+    clang_analyze
+    exit 0
+fi
+
 if [ "$TESTSUITE" = 'test' ]; then
     # 'distcheck' will reconfigure with required options.
     # Now we only need to prepare the Makefile without sparse-wrapped CC.
diff --git a/.github/workflows/build-and-test.yml b/.github/workflows/build-and-test.yml
index 09654205e..cb277ff43 100644
--- a/.github/workflows/build-and-test.yml
+++ b/.github/workflows/build-and-test.yml
@@ -223,6 +223,109 @@  jobs:
         name: logs-linux-${{ join(matrix.*, '-') }}
         path: logs.tgz
 
+  build-clang-analyze:
+    needs: build-dpdk
+    env:
+      dependencies: |
+        automake bc clang-tools libbpf-dev libnuma-dev libpcap-dev \
+        libunbound-dev libunwind-dev libssl-dev libtool llvm-dev \
+        python3-unbound
+      CC:   clang
+      DPDK: dpdk
+      CLANG_ANALYZE: true
+    name: clang-analyze
+    runs-on: ubuntu-22.04
+    timeout-minutes: 30
+
+    steps:
+    - name: checkout
+      uses: actions/checkout@v3
+      with:
+        fetch-depth: 0
+
+    - name: get base branch sha
+      id: base_branch
+      env:
+        BASE_SHA: ${{ github.event.pull_request.base.sha }}
+        EVENT_BEFORE: ${{ github.event.before }}
+        FORCED_PUSH: ${{ github.event.forced }}
+      run: |
+        if [ "$GITHUB_EVENT_NAME" = "pull_request" ]; then
+          echo "sha=$BASE_SHA" >> $GITHUB_OUTPUT
+        else
+          if [ "$EVENT_BEFORE" = "0000000000000000000000000000000000000000" ] \
+             || [ "$FORCED_PUSH" = true ]; then
+            echo "sha=$(git log --pretty=format:"%H %ae" | \
+                        grep -m1 -v $(git log -1 --pretty=format:"%ae") | \
+                        awk '{print $1}')" >> $GITHUB_OUTPUT
+          else
+            echo "sha=$EVENT_BEFORE" >> $GITHUB_OUTPUT
+          fi
+        fi
+
+    - name: checkout base branch
+      uses: actions/checkout@v3
+      with:
+        ref: ${{ steps.base_branch.outputs.sha }}
+        path: base_ovs_main
+
+    - name: update PATH
+      run: |
+        echo "$HOME/bin"        >> $GITHUB_PATH
+        echo "$HOME/.local/bin" >> $GITHUB_PATH
+
+    - name: generate cache key
+      id: cache_key
+      run: |
+        ver=$(clang -v 2>&1 | grep version | \
+              sed 's/.*version \([0-9]*\.[0-9]*\.[0-9]*\).*/\1/g')
+        echo "key=clang-${ver}-analyze-$(git -C base_ovs_main rev-parse HEAD)" \
+          >> $GITHUB_OUTPUT
+
+    - name: check for analyzer result cache
+      id: clang_cache
+      uses: actions/cache@v3
+      with:
+        path: base-clang-analyzer-results
+        key:  ${{ steps.cache_key.outputs.key }}
+
+    - name: set up python
+      uses: actions/setup-python@v4
+      with:
+        python-version: '3.9'
+
+    - name: get cached dpdk-dir
+      uses: actions/cache/restore@v3
+      with:
+        path: dpdk-dir
+        key:  ${{ needs.build-dpdk.outputs.dpdk_key }}
+
+    - name: update APT cache
+      run:  sudo apt update || true
+
+    - name: install common dependencies
+      run:  sudo apt install -y ${{ env.dependencies }}
+
+    - name: install sarif tools
+      run: sudo pip3 install --disable-pip-version-check sarif-tools
+
+    - name: prepare
+      run:  ./.ci/linux-prepare.sh
+
+    - name: build base reference
+      if: steps.clang_cache.outputs.cache-hit != 'true'
+      run:  ./.ci/linux-build.sh
+
+    - name: build
+      run:  ./.ci/linux-build.sh
+
+    - name: save cache
+      uses: actions/cache/save@v3
+      if: always() && steps.clang_cache.outputs.cache-hit != 'true'
+      with:
+        path: base-clang-analyzer-results
+        key:  ${{ steps.cache_key.outputs.key }}
+
   build-osx:
     env:
       CC:    clang