Message ID | 20201110160140.2859904-11-philmd@redhat.com |
---|---|
State | New |
Headers | show |
Series | gitlab-ci: Introduce "CI job maintainer" concept, mark jobs maintained | expand |
Philippe Mathieu-Daudé <philmd@redhat.com> writes: > When a job fails, someone has to take care of it. As we can > not wait indefinitively of volunteers good will, introduce the > concept of "job maintainers". A job maintainer is reponsible > of keeping it working, or contact the developers having broken > it to fix it. > > When a job is added, it must have a maintainer. A job without > maintainer is not run automatically. It can however be run > manually from the WebUI. > > To declare a maintainer, it is as easy as defining the > JOB_MAINTAINER_NAME / JOB_MAINTAINER_EMAIL environment variables. So I think the problem here is the CI jobs are orthogonal to the actual tests. And the tests should be associated via MAINTAINERS with the relevant sub-systems. That is not to say that the test environments don't need some care and attention. So I'm quite happy to track updates needed to tests/docker/dockerfiles for example but just because check-block failed on an Ubuntu system doesn't mean I'm best placed to diagnose the problem. In the first instance it shouldn't happen (not merging code that regresses a test) and the second instance probably requires a block maintainer to look at the output. I think a better solution is to improve our test reporting so we can quickly point the failing tests. I notice GitLab gets nice test output from check-acceptance. What would we need to do to improve it from check, check-block and check-tcg? > > Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> > --- > Ideally the runner will notify the maintainer by mail that > a job fails. But we are not quite there yet. > > It would be nice if someone document this properly. > --- > .gitlab-ci.d/containers.yml | 4 ++++ > .gitlab-ci.d/crossbuilds.yml | 6 ++++++ > .gitlab-ci.yml | 6 ++++++ > 3 files changed, 16 insertions(+) > > diff --git a/.gitlab-ci.d/containers.yml b/.gitlab-ci.d/containers.yml > index 7e664878cab..bd2a05008d1 100644 > --- a/.gitlab-ci.d/containers.yml > +++ b/.gitlab-ci.d/containers.yml > @@ -21,6 +21,10 @@ > after_script: > - docker logout > rules: > + # Skip unmaintained jobs > + - if: $JOB_MAINTAINER_NAME == null || $JOB_MAINTAINER_EMAIL == null > + when: manual > + allow_failure: true > - changes: > - .gitlab-ci.d/containers.yml > - tests/docker/* > diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml > index 701550f028c..aecdb2a38f1 100644 > --- a/.gitlab-ci.d/crossbuilds.yml > +++ b/.gitlab-ci.d/crossbuilds.yml > @@ -1,6 +1,12 @@ > .cross_common_job: > stage: build > image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest > + rules: > + # Skip unmaintained jobs > + - if: $JOB_MAINTAINER_NAME == null || $JOB_MAINTAINER_EMAIL == null > + when: manual > + allow_failure: true > + - when: always > > .cross_system_build_job: > extends: .cross_common_job > diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml > index 2e631d4f160..ded4f0bdd18 100644 > --- a/.gitlab-ci.yml > +++ b/.gitlab-ci.yml > @@ -19,6 +19,12 @@ include: > > .native_common_job: > image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest > + rules: > + # Skip unmaintained jobs > + - if: $JOB_MAINTAINER_NAME == null || $JOB_MAINTAINER_EMAIL == null > + when: manual > + allow_failure: true > + - when: always > > .native_build_job: > extends: .native_common_job
On Tue, Nov 10, 2020 at 05:01:34PM +0100, Philippe Mathieu-Daudé wrote: > When a job fails, someone has to take care of it. As we can > not wait indefinitively of volunteers good will, introduce the > concept of "job maintainers". A job maintainer is reponsible > of keeping it working, or contact the developers having broken > it to fix it. > > When a job is added, it must have a maintainer. A job without > maintainer is not run automatically. It can however be run > manually from the WebUI. > > To declare a maintainer, it is as easy as defining the > JOB_MAINTAINER_NAME / JOB_MAINTAINER_EMAIL environment variables. I don't think we really want/need todo this. The big problem we're facing is that there is no incentive right now for maintainers to make sure their code works with GitLab CI before they send a pull request. Adding job maintainers is just a band-aid, and not a very good one either, because each job covers features across many subsystems which should each be dealt with by their existing maintainers. The primary solution to this tragedy is to make all the jobs gating on all pull requests. If a maintainer wants their pull requrst to get merged they then have no choice but to ensure it doesn't break any CI jobs. The main blocker for this right now IIUC is the git repo sync from qemu to gitlab. Once we switch to gitlab as primary, we need to start enforcing GitLab as gating for all jobs. At this point making sure GitLab CI passes is every maintainer's job. We'll still have some failures periodically from non-deterministic bugs. If a test shows itself to be flaky though, it should just be disabled in a very short time frame. Whichever maintainer owned the test has the job for fixing the flakyness before it can be renabled. Regards, Daniel
On Wed, Nov 11, 2020 at 09:37:53AM +0000, Alex Bennée wrote: > > Philippe Mathieu-Daudé <philmd@redhat.com> writes: > > > When a job fails, someone has to take care of it. As we can > > not wait indefinitively of volunteers good will, introduce the > > concept of "job maintainers". A job maintainer is reponsible > > of keeping it working, or contact the developers having broken > > it to fix it. > > > > When a job is added, it must have a maintainer. A job without > > maintainer is not run automatically. It can however be run > > manually from the WebUI. > > > > To declare a maintainer, it is as easy as defining the > > JOB_MAINTAINER_NAME / JOB_MAINTAINER_EMAIL environment variables. > > So I think the problem here is the CI jobs are orthogonal to the actual > tests. And the tests should be associated via MAINTAINERS with the > relevant sub-systems. > > That is not to say that the test environments don't need some care and > attention. So I'm quite happy to track updates needed to > tests/docker/dockerfiles for example but just because check-block failed > on an Ubuntu system doesn't mean I'm best placed to diagnose the > problem. In the first instance it shouldn't happen (not merging code > that regresses a test) and the second instance probably requires a block > maintainer to look at the output. > > I think a better solution is to improve our test reporting so we can > quickly point the failing tests. I notice GitLab gets nice test output > from check-acceptance. What would we need to do to improve it from > check, check-block and check-tcg? That presentation is from the artifacts publishing test results in junit format. IOW, we need to enhance the build system in some way such that it can generate a junit report of other tests too. Regards, Daniel
On 11/11/20 10:45 AM, Daniel P. Berrangé wrote: > On Tue, Nov 10, 2020 at 05:01:34PM +0100, Philippe Mathieu-Daudé wrote: >> When a job fails, someone has to take care of it. As we can >> not wait indefinitively of volunteers good will, introduce the >> concept of "job maintainers". A job maintainer is reponsible >> of keeping it working, or contact the developers having broken >> it to fix it. >> >> When a job is added, it must have a maintainer. A job without >> maintainer is not run automatically. It can however be run >> manually from the WebUI. >> >> To declare a maintainer, it is as easy as defining the >> JOB_MAINTAINER_NAME / JOB_MAINTAINER_EMAIL environment variables. > > I don't think we really want/need todo this. > > The big problem we're facing is that there is no incentive right now > for maintainers to make sure their code works with GitLab CI before > they send a pull request. Adding job maintainers is just a band-aid, > and not a very good one either, because each job covers features across > many subsystems which should each be dealt with by their existing > maintainers. > > The primary solution to this tragedy is to make all the jobs gating > on all pull requests. If a maintainer wants their pull requrst to > get merged they then have no choice but to ensure it doesn't break > any CI jobs. > > The main blocker for this right now IIUC is the git repo sync from > qemu to gitlab. Once we switch to gitlab as primary, we need to start > enforcing GitLab as gating for all jobs. At this point making sure > GitLab CI passes is every maintainer's job. > > We'll still have some failures periodically from non-deterministic > bugs. If a test shows itself to be flaky though, it should just be > disabled in a very short time frame. Whichever maintainer owned the > test has the job for fixing the flakyness before it can be renabled. At this point I'd rather remove everything we have in CI and restart from scratch. So if someone is really interested in having CI jobs / runner, this someone will step in and contribute / maintain. Not like the current state when some are happy to use the result, but nobody cares about maintenance or fixing bugs (as the last year experience clearly show). I don't know if Alex / Thomas are willing to keep doing that. I personally don't have the the energy to do this in my spare time. Thanks.
diff --git a/.gitlab-ci.d/containers.yml b/.gitlab-ci.d/containers.yml index 7e664878cab..bd2a05008d1 100644 --- a/.gitlab-ci.d/containers.yml +++ b/.gitlab-ci.d/containers.yml @@ -21,6 +21,10 @@ after_script: - docker logout rules: + # Skip unmaintained jobs + - if: $JOB_MAINTAINER_NAME == null || $JOB_MAINTAINER_EMAIL == null + when: manual + allow_failure: true - changes: - .gitlab-ci.d/containers.yml - tests/docker/* diff --git a/.gitlab-ci.d/crossbuilds.yml b/.gitlab-ci.d/crossbuilds.yml index 701550f028c..aecdb2a38f1 100644 --- a/.gitlab-ci.d/crossbuilds.yml +++ b/.gitlab-ci.d/crossbuilds.yml @@ -1,6 +1,12 @@ .cross_common_job: stage: build image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest + rules: + # Skip unmaintained jobs + - if: $JOB_MAINTAINER_NAME == null || $JOB_MAINTAINER_EMAIL == null + when: manual + allow_failure: true + - when: always .cross_system_build_job: extends: .cross_common_job diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index 2e631d4f160..ded4f0bdd18 100644 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -19,6 +19,12 @@ include: .native_common_job: image: $CI_REGISTRY_IMAGE/qemu/$IMAGE:latest + rules: + # Skip unmaintained jobs + - if: $JOB_MAINTAINER_NAME == null || $JOB_MAINTAINER_EMAIL == null + when: manual + allow_failure: true + - when: always .native_build_job: extends: .native_common_job
When a job fails, someone has to take care of it. As we can not wait indefinitively of volunteers good will, introduce the concept of "job maintainers". A job maintainer is reponsible of keeping it working, or contact the developers having broken it to fix it. When a job is added, it must have a maintainer. A job without maintainer is not run automatically. It can however be run manually from the WebUI. To declare a maintainer, it is as easy as defining the JOB_MAINTAINER_NAME / JOB_MAINTAINER_EMAIL environment variables. Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> --- Ideally the runner will notify the maintainer by mail that a job fails. But we are not quite there yet. It would be nice if someone document this properly. --- .gitlab-ci.d/containers.yml | 4 ++++ .gitlab-ci.d/crossbuilds.yml | 6 ++++++ .gitlab-ci.yml | 6 ++++++ 3 files changed, 16 insertions(+)