
[RFC,14/15] gitlab-ci: Allow forks to use different set of jobs

Message ID 20210418233448.1267991-15-f4bug@amsat.org
State New
Series gitlab-ci: Allow forks to use different pipelines than mainstream

Commit Message

Philippe Mathieu-Daudé April 18, 2021, 11:34 p.m. UTC
Forks run the same jobs as mainstream, which might be overkill.
Allow them to easily rebase their custom set of jobs, while still using
the mainstream templates and keeping the ability to pick specific jobs
from the mainstream set.

To switch to your set, simply add your .gitlab-ci.yml as
.gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
is your gitlab 'namespace', usually username). This file will be
used instead of the default mainstream set.

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
 .gitlab-ci.yml | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Thomas Huth April 19, 2021, 5:48 a.m. UTC | #1
On 19/04/2021 01.34, Philippe Mathieu-Daudé wrote:
> Forks run the same jobs than mainstream, which might be overkill.
> Allow them to easily rebase their custom set, while keeping using
> the mainstream templates, and ability to pick specific jobs from
> the mainstream set.
> 
> To switch to your set, simply add your .gitlab-ci.yml as
> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> is your gitlab 'namespace', usually username). This file will be
> used instead of the default mainstream set.
> 
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
>   .gitlab-ci.yml | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index 718c8e004be..35fd35075db 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -9,7 +9,12 @@ generate-config:
>       paths:
>         - generated-config.yml
>     script:
> -    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
> +    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
> +      then
> +        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
> +      else
> +        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
> +      fi

I think you could merge this with the previous patch, since the previous 
patch is not very useful on its own.

Anyway, I like the idea, that could be useful for downstream, indeed!

  Thomas
Daniel P. Berrangé April 19, 2021, 9:40 a.m. UTC | #2
On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
> Forks run the same jobs than mainstream, which might be overkill.
> Allow them to easily rebase their custom set, while keeping using
> the mainstream templates, and ability to pick specific jobs from
> the mainstream set.
> 
> To switch to your set, simply add your .gitlab-ci.yml as
> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> is your gitlab 'namespace', usually username). This file will be
> used instead of the default mainstream set.

I find this approach undesirable, because AFAICT, it means you have
to commit this extra file to any of your downstream branches that
you want this to be used for.  Then you have to either delete it
again before sending patches upstream, or tell git-publish to
exclude the commit that adds this.

IMHO any per-contributor overhead needs to not involve committing
stuff to their git branches, that isn't intended to go upstream.


Regards,
Daniel
Philippe Mathieu-Daudé April 19, 2021, 10:09 a.m. UTC | #3
On 4/19/21 11:40 AM, Daniel P. Berrangé wrote:
> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
>> Forks run the same jobs than mainstream, which might be overkill.
>> Allow them to easily rebase their custom set, while keeping using
>> the mainstream templates, and ability to pick specific jobs from
>> the mainstream set.
>>
>> To switch to your set, simply add your .gitlab-ci.yml as
>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>> is your gitlab 'namespace', usually username). This file will be
>> used instead of the default mainstream set.
> 
> I find this approach undesirable, because AFAICT, it means you have
> to commit this extra file to any of your downstream branches that
> you want this to be used for.  Then you have to be either delete it
> again before sending patches upstream, or tell git-publish to
> exclude the commit that adds this.

Good point. What I'm looking for is a way to allow forks to keep
following mainstream development.

> IMHO any per-contributor overhead needs to not involve committing
> stuff to their git branches, that isn't intended to go upstream.

But why am I forced to run the upstream overhead stuff in my fork?
I find it counter-productive for the limited set of topics I'm modifying.
Also, why should I wait >2h for a pipeline when I know exactly which
area I'm modifying? This is a waste of time and resources.

Gitlab suggested an alternative 3 months ago, it is still fresh:
https://docs.gitlab.com/ee/ci/yaml/README.html#variables-with-include
combined with
https://docs.gitlab.com/ee/ci/yaml/README.html#includeremote
and
https://docs.gitlab.com/ee/ci/yaml/README.html#multiple-files-from-a-project
we could have forks include their gitlab-ci.yml from a specific branch
of their repository.

Example, if I push a branch named project-specific-ci, and we add
that to mainstream:

  include:
  - project: '$CI_PROJECT_PATH'
    ref: project-specific-ci
    file:
      - '/.gitlab-ci.d/project-specific.yml'

Then it would include
project-specific-ci:/.gitlab-ci.d/project-specific.yml in all
branches/tags I push.

In that case we could rename qemu-project.yml -> project-specific.yml
(patch 12).

The problem is I couldn't get it to work optionally (i.e. when there
is no 'project-specific-ci' branch).

Still room for investigation...
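
One possible direction, sketched here under assumptions, is to do the check in a job script instead of a static include. The function name has_remote_branch is hypothetical; git ls-remote --exit-code is an existing git option that exits with status 2 when no matching ref is found.

```shell
#!/bin/sh
# Return 0 if branch $2 exists in the repository at $1 (a remote name,
# URL or local path), non-zero otherwise.
has_remote_branch() {
    remote=$1
    branch=$2
    git ls-remote --exit-code --heads "$remote" "refs/heads/$branch" \
        >/dev/null 2>&1
}
```

A job script could then call "has_remote_branch origin project-specific-ci" and fall back to the mainstream file when it fails. Whether this fits the generated-config approach has not been verified here.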

Thanks for the feedback,

Phil.
Erik Skultety April 19, 2021, 10:10 a.m. UTC | #4
On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
> > Forks run the same jobs than mainstream, which might be overkill.
> > Allow them to easily rebase their custom set, while keeping using
> > the mainstream templates, and ability to pick specific jobs from
> > the mainstream set.
> > 
> > To switch to your set, simply add your .gitlab-ci.yml as
> > .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> > is your gitlab 'namespace', usually username). This file will be
> > used instead of the default mainstream set.
> 
> I find this approach undesirable, because AFAICT, it means you have
> to commit this extra file to any of your downstream branches that
> you want this to be used for.  Then you have to be either delete it
> again before sending patches upstream, or tell git-publish to
> exclude the commit that adds this.
> 
> IMHO any per-contributor overhead needs to not involve committing
> stuff to their git branches, that isn't intended to go upstream.

Not just that, ideally, they should also run all the upstream workloads before
submitting a PR or posting patches because they'd have to respin because of a
potential failure in upstream pipelines anyway.

Erik
Thomas Huth April 19, 2021, 10:20 a.m. UTC | #5
On 19/04/2021 12.10, Erik Skultety wrote:
> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
>>> Forks run the same jobs than mainstream, which might be overkill.
>>> Allow them to easily rebase their custom set, while keeping using
>>> the mainstream templates, and ability to pick specific jobs from
>>> the mainstream set.
>>>
>>> To switch to your set, simply add your .gitlab-ci.yml as
>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>>> is your gitlab 'namespace', usually username). This file will be
>>> used instead of the default mainstream set.
>>
>> I find this approach undesirable, because AFAICT, it means you have
>> to commit this extra file to any of your downstream branches that
>> you want this to be used for.  Then you have to be either delete it
>> again before sending patches upstream, or tell git-publish to
>> exclude the commit that adds this.
>>
>> IMHO any per-contributor overhead needs to not involve committing
>> stuff to their git branches, that isn't intended to go upstream.
> 
> Not just that, ideally, they should also run all the upstream workloads before
> submitting a PR or posting patches because they'd have to respin because of a
> potential failure in upstream pipelines anyway.

It's pretty clear that you want to run the full QEMU CI before submitting 
patches to the QEMU project, but I think we are rather talking about forks 
here that are not meant for immediately contributing to upstream
again, like RHEL where we only build the KVM-related targets and certainly 
do not want to test other things like CPUs that are not capable of KVM, or a 
branch where Philippe only wants to check his MIPS-related work during 
development.
For contributing patches to upstream, you certainly have to run the full CI, 
but for other things, it's sometimes really useful to cut down the CI 
machinery (I'm also doing this in my development branches manually
sometimes to speed up the CI), so I think this series makes sense, indeed.

  Thomas
Daniel P. Berrangé April 19, 2021, 10:36 a.m. UTC | #6
On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
> On 19/04/2021 12.10, Erik Skultety wrote:
> > On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
> > > On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
> > > > Forks run the same jobs than mainstream, which might be overkill.
> > > > Allow them to easily rebase their custom set, while keeping using
> > > > the mainstream templates, and ability to pick specific jobs from
> > > > the mainstream set.
> > > > 
> > > > To switch to your set, simply add your .gitlab-ci.yml as
> > > > .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> > > > is your gitlab 'namespace', usually username). This file will be
> > > > used instead of the default mainstream set.
> > > 
> > > I find this approach undesirable, because AFAICT, it means you have
> > > to commit this extra file to any of your downstream branches that
> > > you want this to be used for.  Then you have to be either delete it
> > > again before sending patches upstream, or tell git-publish to
> > > exclude the commit that adds this.
> > > 
> > > IMHO any per-contributor overhead needs to not involve committing
> > > stuff to their git branches, that isn't intended to go upstream.
> > 
> > Not just that, ideally, they should also run all the upstream workloads before
> > submitting a PR or posting patches because they'd have to respin because of a
> > potential failure in upstream pipelines anyway.
> 
> It's pretty clear that you want to run the full QEMU CI before submitting
> patches to the QEMU project, but I think we are rather talking about forks
> here that are meant not meant for immediately contributing to upstream
> again, like RHEL where we only build the KVM-related targets and certainly
> do not want to test other things like CPUs that are not capable of KVM, or a
> branch where Philippe only wants to check his MIPS-related work during
> development.
> For contributing patches to upstream, you certainly have to run the full CI,
> but for other things, it's sometimes really useful to cut down the CI
> machinery (I'm also doing this in my development branches manually some
> times to speed up the CI), so I think this series make sense, indeed.

For a downstream like RHEL, I'd just expect them to replace the main
.gitlab-ci.yml entirely to suit their downstream needs.

Regards,
Daniel
Philippe Mathieu-Daudé April 19, 2021, 10:44 a.m. UTC | #7
On 4/19/21 12:10 PM, Erik Skultety wrote:
> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
>>> Forks run the same jobs than mainstream, which might be overkill.
>>> Allow them to easily rebase their custom set, while keeping using
>>> the mainstream templates, and ability to pick specific jobs from
>>> the mainstream set.
>>>
>>> To switch to your set, simply add your .gitlab-ci.yml as
>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>>> is your gitlab 'namespace', usually username). This file will be
>>> used instead of the default mainstream set.
>>
>> I find this approach undesirable, because AFAICT, it means you have
>> to commit this extra file to any of your downstream branches that
>> you want this to be used for.  Then you have to be either delete it
>> again before sending patches upstream, or tell git-publish to
>> exclude the commit that adds this.
>>
>> IMHO any per-contributor overhead needs to not involve committing
>> stuff to their git branches, that isn't intended to go upstream.
> 
> Not just that, ideally, they should also run all the upstream workloads before
> submitting a PR or posting patches because they'd have to respin because of a
> potential failure in upstream pipelines anyway.

Working a patch series on your fork could take days/weeks/months before
you post it to mainstream... I believe forks are only interested
in running mainstream pipelines when they are ready to post their work,
not at every push to their repository.
Daniel P. Berrangé April 19, 2021, 10:47 a.m. UTC | #8
On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
> On 19/04/2021 12.10, Erik Skultety wrote:
> > On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
> > > On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
> > > > Forks run the same jobs than mainstream, which might be overkill.
> > > > Allow them to easily rebase their custom set, while keeping using
> > > > the mainstream templates, and ability to pick specific jobs from
> > > > the mainstream set.
> > > > 
> > > > To switch to your set, simply add your .gitlab-ci.yml as
> > > > .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> > > > is your gitlab 'namespace', usually username). This file will be
> > > > used instead of the default mainstream set.
> > > 
> > > I find this approach undesirable, because AFAICT, it means you have
> > > to commit this extra file to any of your downstream branches that
> > > you want this to be used for.  Then you have to be either delete it
> > > again before sending patches upstream, or tell git-publish to
> > > exclude the commit that adds this.
> > > 
> > > IMHO any per-contributor overhead needs to not involve committing
> > > stuff to their git branches, that isn't intended to go upstream.
> > 
> > Not just that, ideally, they should also run all the upstream workloads before
> > submitting a PR or posting patches because they'd have to respin because of a
> > potential failure in upstream pipelines anyway.
> 
> It's pretty clear that you want to run the full QEMU CI before submitting
> patches to the QEMU project, but I think we are rather talking about forks
> here that are meant not meant for immediately contributing to upstream
> again, like RHEL where we only build the KVM-related targets and certainly
> do not want to test other things like CPUs that are not capable of KVM, or a
> branch where Philippe only wants to check his MIPS-related work during
> development.
> For contributing patches to upstream, you certainly have to run the full CI,
> but for other things, it's sometimes really useful to cut down the CI
> machinery (I'm also doing this in my development branches manually some
> times to speed up the CI), so I think this series make sense, indeed.

In the case of a permanent fork like RHEL, I'd expect them to just
replace the existing .gitlab-ci.yml entirely in their git repo.

I don't think we need to care about doing anything special for downstream
forks, but just focus on what's beneficial to upstream contributors
like the scenario you describe for Philippe only wanting to check
MIPS jobs.


Regards,
Daniel
Thomas Huth April 19, 2021, 10:48 a.m. UTC | #9
On 19/04/2021 12.36, Daniel P. Berrangé wrote:
> On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
>> On 19/04/2021 12.10, Erik Skultety wrote:
>>> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
>>>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
>>>>> Forks run the same jobs than mainstream, which might be overkill.
>>>>> Allow them to easily rebase their custom set, while keeping using
>>>>> the mainstream templates, and ability to pick specific jobs from
>>>>> the mainstream set.
>>>>>
>>>>> To switch to your set, simply add your .gitlab-ci.yml as
>>>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>>>>> is your gitlab 'namespace', usually username). This file will be
>>>>> used instead of the default mainstream set.
>>>>
>>>> I find this approach undesirable, because AFAICT, it means you have
>>>> to commit this extra file to any of your downstream branches that
>>>> you want this to be used for.  Then you have to be either delete it
>>>> again before sending patches upstream, or tell git-publish to
>>>> exclude the commit that adds this.
>>>>
>>>> IMHO any per-contributor overhead needs to not involve committing
>>>> stuff to their git branches, that isn't intended to go upstream.
>>>
>>> Not just that, ideally, they should also run all the upstream workloads before
>>> submitting a PR or posting patches because they'd have to respin because of a
>>> potential failure in upstream pipelines anyway.
>>
>> It's pretty clear that you want to run the full QEMU CI before submitting
>> patches to the QEMU project, but I think we are rather talking about forks
>> here that are meant not meant for immediately contributing to upstream
>> again, like RHEL where we only build the KVM-related targets and certainly
>> do not want to test other things like CPUs that are not capable of KVM, or a
>> branch where Philippe only wants to check his MIPS-related work during
>> development.
>> For contributing patches to upstream, you certainly have to run the full CI,
>> but for other things, it's sometimes really useful to cut down the CI
>> machinery (I'm also doing this in my development branches manually some
>> times to speed up the CI), so I think this series make sense, indeed.
> 
> For a downstream like RHEL, I'd just expect them to replace the main
> .gitlab-ci.yml entirely to suit their downstream needs.

But that still means that we should clean up the main .gitlab-ci.yml file
anyway, like it is done in this series, so that you do not always get
conflicts in this big file with your downstream-only modifications. So at
least up to patch 11 or 12, I think this is very valuable work that
Philippe is doing here.

  Thomas
Daniel P. Berrangé April 19, 2021, 10:51 a.m. UTC | #10
On Mon, Apr 19, 2021 at 12:48:25PM +0200, Thomas Huth wrote:
> On 19/04/2021 12.36, Daniel P. Berrangé wrote:
> > On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
> > > On 19/04/2021 12.10, Erik Skultety wrote:
> > > > On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
> > > > > On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
> > > > > > Forks run the same jobs than mainstream, which might be overkill.
> > > > > > Allow them to easily rebase their custom set, while keeping using
> > > > > > the mainstream templates, and ability to pick specific jobs from
> > > > > > the mainstream set.
> > > > > > 
> > > > > > To switch to your set, simply add your .gitlab-ci.yml as
> > > > > > .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> > > > > > is your gitlab 'namespace', usually username). This file will be
> > > > > > used instead of the default mainstream set.
> > > > > 
> > > > > I find this approach undesirable, because AFAICT, it means you have
> > > > > to commit this extra file to any of your downstream branches that
> > > > > you want this to be used for.  Then you have to be either delete it
> > > > > again before sending patches upstream, or tell git-publish to
> > > > > exclude the commit that adds this.
> > > > > 
> > > > > IMHO any per-contributor overhead needs to not involve committing
> > > > > stuff to their git branches, that isn't intended to go upstream.
> > > > 
> > > > Not just that, ideally, they should also run all the upstream workloads before
> > > > submitting a PR or posting patches because they'd have to respin because of a
> > > > potential failure in upstream pipelines anyway.
> > > 
> > > It's pretty clear that you want to run the full QEMU CI before submitting
> > > patches to the QEMU project, but I think we are rather talking about forks
> > > here that are meant not meant for immediately contributing to upstream
> > > again, like RHEL where we only build the KVM-related targets and certainly
> > > do not want to test other things like CPUs that are not capable of KVM, or a
> > > branch where Philippe only wants to check his MIPS-related work during
> > > development.
> > > For contributing patches to upstream, you certainly have to run the full CI,
> > > but for other things, it's sometimes really useful to cut down the CI
> > > machinery (I'm also doing this in my development branches manually some
> > > times to speed up the CI), so I think this series make sense, indeed.
> > 
> > For a downstream like RHEL, I'd just expect them to replace the main
> > .gitlab-ci.yml entirely to suit their downstream needs.
> 
> But that still means that we should clean up the main .gitlab-ci.yml file
> anyway, like it is done in this series, to avoid that you always get
> conflicts in this big file with your downstream-only modifications. So at
> least up to patch 11 or 12, I think this is a very valuable work that
> Philippe is doing here.

I don't see a real issue with downstream conflicts. They'll just
periodically pick a release to base themselves off and change once
every 6 months or more.

Regards,
Daniel
Thomas Huth April 19, 2021, 10:59 a.m. UTC | #11
On 19/04/2021 12.51, Daniel P. Berrangé wrote:
> On Mon, Apr 19, 2021 at 12:48:25PM +0200, Thomas Huth wrote:
>> On 19/04/2021 12.36, Daniel P. Berrangé wrote:
>>> On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
>>>> On 19/04/2021 12.10, Erik Skultety wrote:
>>>>> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
>>>>>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé wrote:
>>>>>>> Forks run the same jobs than mainstream, which might be overkill.
>>>>>>> Allow them to easily rebase their custom set, while keeping using
>>>>>>> the mainstream templates, and ability to pick specific jobs from
>>>>>>> the mainstream set.
>>>>>>>
>>>>>>> To switch to your set, simply add your .gitlab-ci.yml as
>>>>>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>>>>>>> is your gitlab 'namespace', usually username). This file will be
>>>>>>> used instead of the default mainstream set.
>>>>>>
>>>>>> I find this approach undesirable, because AFAICT, it means you have
>>>>>> to commit this extra file to any of your downstream branches that
>>>>>> you want this to be used for.  Then you have to be either delete it
>>>>>> again before sending patches upstream, or tell git-publish to
>>>>>> exclude the commit that adds this.
>>>>>>
>>>>>> IMHO any per-contributor overhead needs to not involve committing
>>>>>> stuff to their git branches, that isn't intended to go upstream.
>>>>>
>>>>> Not just that, ideally, they should also run all the upstream workloads before
>>>>> submitting a PR or posting patches because they'd have to respin because of a
>>>>> potential failure in upstream pipelines anyway.
>>>>
>>>> It's pretty clear that you want to run the full QEMU CI before submitting
>>>> patches to the QEMU project, but I think we are rather talking about forks
>>>> here that are meant not meant for immediately contributing to upstream
>>>> again, like RHEL where we only build the KVM-related targets and certainly
>>>> do not want to test other things like CPUs that are not capable of KVM, or a
>>>> branch where Philippe only wants to check his MIPS-related work during
>>>> development.
>>>> For contributing patches to upstream, you certainly have to run the full CI,
>>>> but for other things, it's sometimes really useful to cut down the CI
>>>> machinery (I'm also doing this in my development branches manually some
>>>> times to speed up the CI), so I think this series make sense, indeed.
>>>
>>> For a downstream like RHEL, I'd just expect them to replace the main
>>> .gitlab-ci.yml entirely to suit their downstream needs.
>>
>> But that still means that we should clean up the main .gitlab-ci.yml file
>> anyway, like it is done in this series, to avoid that you always get
>> conflicts in this big file with your downstream-only modifications. So at
>> least up to patch 11 or 12, I think this is a very valuable work that
>> Philippe is doing here.
> 
> I don't see a real issue with downstream conflicts. They'll just
> periodically pick a release to base themselves off and change once
> every 6 months or more.

It's not only downstream distros that rebase every 6 months. Like Philippe,
I'm sometimes hacking the .gitlab-ci.yml of my development branch to speed up
the CI during my development cycles (i.e. I'm removing the jobs that I do
not need). And I'm regularly rebasing my development branches. Conflicts in
.gitlab-ci.yml are then always painful, so a leaner main .gitlab-ci.yml file
would be helpful for me, too, indeed.

  Thomas
Alex Bennée April 19, 2021, 3:57 p.m. UTC | #12
Philippe Mathieu-Daudé <f4bug@amsat.org> writes:

> Forks run the same jobs than mainstream, which might be overkill.
> Allow them to easily rebase their custom set, while keeping using
> the mainstream templates, and ability to pick specific jobs from
> the mainstream set.
>
> To switch to your set, simply add your .gitlab-ci.yml as
> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> is your gitlab 'namespace', usually username). This file will be
> used instead of the default mainstream set.
>
> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> ---
>  .gitlab-ci.yml | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> index 718c8e004be..35fd35075db 100644
> --- a/.gitlab-ci.yml
> +++ b/.gitlab-ci.yml
> @@ -9,7 +9,12 @@ generate-config:
>      paths:
>        - generated-config.yml
>    script:
> -    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
> +    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
> +      then
> +        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
> +      else
> +        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
> +      fi

This is going to be a little clunky. I can see a use for the static
forks that Daniel proposes but I guess what is needed is a little
expressiveness. So how to express things like:

 - I've only touched stuff in linux-user, so run only linux-user tests

 - I'm working on KVM, run all KVM enabled builds and tests

 - I've changed the core TCG code, run everything that exercises that

 - I'm working on ARM, only build and run jobs that have ARM targets

This sounds like tags I guess but the documentation indicates they are
used for runner selection. Could we come up with a subset that could be
used to select from all our build fragments when constructing the
generated-config? I could even imagine a script analysing a diffstat and
guessing the tags based on that.
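
A minimal sketch of such a script, assuming a hand-written mapping from source directories to tags. The tag names and the function name guess_ci_tags are hypothetical, and the mapping is deliberately incomplete; anything unrecognised falls back to "all".

```shell
#!/bin/sh
# Read changed paths (one per line) on stdin and print a sorted,
# de-duplicated list of hypothetical job tags.
guess_ci_tags() {
    while IFS= read -r path; do
        case $path in
            linux-user/*)           echo linux-user ;;
            accel/kvm/*)            echo kvm ;;
            tcg/*|accel/tcg/*)      echo tcg ;;
            target/arm/*|hw/arm/*)  echo arm ;;
            *)                      echo all ;;  # unknown area: run everything
        esac
    done | sort -u
}
```

It could be fed from something like "git diff --name-only <base>...HEAD | guess_ci_tags", where choosing the right base commit remains an open question.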

I think we should define a minimum set of lightweight smoke tests that
get the most bang for buck for catching sillies. I think checkpatch and
dco checking should probably be in there - and maybe one of the bog
standard build everything builds (maybe a random ../configure; make;
make check on one of the supported LTS targets).

Then there is the question of defaults. Should we default to a minimised
set unless asked or should the default be the full fat run everything?
We could I guess only switch to running everything for the staging
branch and anything that is associated with a tag or a branch that has
pull in the name?

>  
>  generate-pipeline:
>    stage: test
Daniel P. Berrangé April 19, 2021, 4:22 p.m. UTC | #13
On Mon, Apr 19, 2021 at 04:57:55PM +0100, Alex Bennée wrote:
> 
> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
> 
> > Forks run the same jobs than mainstream, which might be overkill.
> > Allow them to easily rebase their custom set, while keeping using
> > the mainstream templates, and ability to pick specific jobs from
> > the mainstream set.
> >
> > To switch to your set, simply add your .gitlab-ci.yml as
> > .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> > is your gitlab 'namespace', usually username). This file will be
> > used instead of the default mainstream set.
> >
> > Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> > ---
> >  .gitlab-ci.yml | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> > index 718c8e004be..35fd35075db 100644
> > --- a/.gitlab-ci.yml
> > +++ b/.gitlab-ci.yml
> > @@ -9,7 +9,12 @@ generate-config:
> >      paths:
> >        - generated-config.yml
> >    script:
> > -    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
> > +    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
> > +      then
> > +        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
> > +      else
> > +        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
> > +      fi
> 
> This is going to be a little clunky. I can see a use for the static
> forks that Danial proposes but I guess what is needed is a little
> expressiveness. So how to express things like:
> 
>  - I've only touched stuff in linux-user, so run only linux-user tests

This can be done with "rules" matching on files, but *only* if the
pipeline trigger is a merge request - specifically not a git branch
push, as the latter doesn't have the semantics you expect wrt
determining the "ancestor" to compare against. It only looks at commits
in the push, not those which may already have previously been pushed
on the branch.

>  - I'm working on KVM, run all KVM enabled builds and tests
> 
>  - I've changed the core TCG code, run everything that exercises that
> 
>  - I'm working on ARM, only build and run jobs that have ARM targets

If the stuff you work on is fairly static, we could potentially
allow env variables to be set by the user in their fork, which
the CI jobs use to filter jobs.

> I think we should define a minimum set of lightweight smoke tests that
> get the most bang for buck for catching sillies. I think checkpatch and
> dco checking should probably be in there - and maybe one of the bog
> standard build everything builds (maybe a random ../configure; make;
> make check on one of the supported LTS targets).

Could we allow an env var "QEMU_CI_SMOKE_TEST=1" which can be
set when pushing:

   git push  -o ci.variable="QEMU_CI_SMOKE_TEST=1"


which causes it to only do the minimum necessary.

Alternatively, invert this, so do minimum smoke test by default
and require an env to run the full test. QEMU_CI_MAX=1

Potentially also allow "QEMU_CI_EXTRA_JOBS=foo,bar,wizz"
to match against job names?
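
To make the first variant concrete, here is a rough sketch of the filtering logic as a shell function. It assumes QEMU_CI_SMOKE_TEST=1 restricts the pipeline to a fixed smoke-test list and QEMU_CI_EXTRA_JOBS adds jobs back by exact name; the function name should_run_job and the smoke-test list passed to it are hypothetical.

```shell
#!/bin/sh
# Return 0 if job $1 should run.
# $2: space-separated list of smoke-test job names (illustrative only).
should_run_job() {
    job=$1
    smoke_jobs=$2
    # Without the smoke-test switch, every job runs.
    [ "${QEMU_CI_SMOKE_TEST:-0}" = 1 ] || return 0
    # Jobs in the smoke-test list always run.
    case " $smoke_jobs " in *" $job "*) return 0 ;; esac
    # Extra jobs are matched by exact name in the comma-separated list.
    case ",${QEMU_CI_EXTRA_JOBS:-}," in *",$job,"*) return 0 ;; esac
    return 1
}
```

The inverted variant (smoke test by default, QEMU_CI_MAX=1 for the full run) would only change the first check.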

> Then there is the question of defaults. Should we default to a minimised
> set unless asked or should the default be the full fat run everything?

With the direction gitlab is taking towards limiting CI minutes, it is
probably a safer bet to do a minimal smoke test by default and only
do the full test when definitely needed.

> We could I guess only switch to running everything for the staging
> branch and anything that is associated with a tag or a branch that has
> pull in the name?

Regards,
Daniel
Philippe Mathieu-Daudé April 19, 2021, 4:39 p.m. UTC | #14
On 4/19/21 5:57 PM, Alex Bennée wrote:
> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
> 
>> Forks run the same jobs than mainstream, which might be overkill.
>> Allow them to easily rebase their custom set, while keeping using
>> the mainstream templates, and ability to pick specific jobs from
>> the mainstream set.
>>
>> To switch to your set, simply add your .gitlab-ci.yml as
>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>> is your gitlab 'namespace', usually username). This file will be
>> used instead of the default mainstream set.
>>
>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>> ---
>>  .gitlab-ci.yml | 7 ++++++-
>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>
>> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
>> index 718c8e004be..35fd35075db 100644
>> --- a/.gitlab-ci.yml
>> +++ b/.gitlab-ci.yml
>> @@ -9,7 +9,12 @@ generate-config:
>>      paths:
>>        - generated-config.yml
>>    script:
>> -    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
>> +    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
>> +      then
>> +        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
>> +      else
>> +        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
>> +      fi
> 
> This is going to be a little clunky. I can see a use for the static
> forks that Danial proposes but I guess what is needed is a little
> expressiveness. So how to express things like:
> 
>  - I've only touched stuff in linux-user, so run only linux-user tests
> 
>  - I'm working on KVM, run all KVM enabled builds and tests
> 
>  - I've changed the core TCG code, run everything that exercises that
> 
>  - I'm working on ARM, only build and run jobs that have ARM targets
> 
> This sounds like tags I guess but the documentation indicates they are
> used for runner selection. Could we come up with a subset that could be
> used to select from all our build fragments when constructing the
> generated-config? I could even imagine a script analysing a diffstat and
> guessing the tags based on that.

Ahah this is just what we were discussing with Willian 2h ago after
looking again at stefanha's analysis
(https://www.mail-archive.com/qemu-devel@nongnu.org/msg795905.html).

. diff-stat -> files modified
. files modified | get_maintainers -> subsystem maintained sections

I suggested that Willian add support for 'tags' entries to MAINTAINERS,
so we could have:

./get_maintainer --tags file1 file2 ...
-> virtio, migration, kvm

Then we could run all the tests tagged 'virtio, migration, kvm'
(unit tests, iotests, qtests, integration tests).


The converse use: when a test fails, we can list its tags, look up
the MAINTAINERS sections tracking these tags, and notify those
maintainers that a test using their subsystem failed.

> I think we should define a minimum set of lightweight smoke tests that
> get the most bang for buck for catching sillies. I think checkpatch and
> dco checking should probably be in there - and maybe one of the bog
> standard build everything builds (maybe a random ../configure; make;
> make check on one of the supported LTS targets).
> 
> Then there is the question of defaults. Should we default to a minimised
> set unless asked or should the default be the full fat run everything?
> We could I guess only switch to running everything for the staging
> branch and anything that is associated with a tag or a branch that has
> pull in the name?

Yes, this is a community problem that needs to be discussed. Not all
community members have the same requirements and expectations.

What I'm trying to do here is ease the fork workflow of a random
contributor, not optimize the mainstream /master gating CI, which is
supposed to have way more resources than a random contributor.

Also I don't believe one set of CI jobs will ever make all users
happy. We all have different needs. I'm looking for a solution
which includes every contributor from the community.

I'm brainstorming about a setup where a maintainer might have extra
resources provided by the project (such as access to dedicated
hardware). Let's use 'virtio' as an example. The maintainer might want
to use 2 different pipelines:
- one to run all the 'virtio' tagged tests each time patches from the
  subsystem contributors are queued (this is the subsystem "gating"
  side).
- one to run an extra, more complex set of tests, only before sending
  a pull request.

Regards,

Phil.
Philippe Mathieu-Daudé April 19, 2021, 4:46 p.m. UTC | #15
On 4/19/21 6:22 PM, Daniel P. Berrangé wrote:
> On Mon, Apr 19, 2021 at 04:57:55PM +0100, Alex Bennée wrote:
>>
>> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
>>
>>> Forks run the same jobs than mainstream, which might be overkill.
>>> Allow them to easily rebase their custom set, while keeping using
>>> the mainstream templates, and ability to pick specific jobs from
>>> the mainstream set.
>>>
>>> To switch to your set, simply add your .gitlab-ci.yml as
>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
>>> is your gitlab 'namespace', usually username). This file will be
>>> used instead of the default mainstream set.
>>>
>>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
>>> ---
>>>  .gitlab-ci.yml | 7 ++++++-
>>>  1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
>>> index 718c8e004be..35fd35075db 100644
>>> --- a/.gitlab-ci.yml
>>> +++ b/.gitlab-ci.yml
>>> @@ -9,7 +9,12 @@ generate-config:
>>>      paths:
>>>        - generated-config.yml
>>>    script:
>>> -    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
>>> +    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
>>> +      then
>>> +        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
>>> +      else
>>> +        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
>>> +      fi
>>
>> This is going to be a little clunky. I can see a use for the static
>> forks that Danial proposes but I guess what is needed is a little
>> expressiveness. So how to express things like:
>>
>>  - I've only touched stuff in linux-user, so run only linux-user tests
> 
> This can be done with "rules" matching on files, but *only* if the
> pipeline trigger is a merge request - specifically not a git branch
> push, as the latter doesn't have the semantics you expect wrt
> determining the "ancestor" to compare against. It only looks at commits
> in the push, not those which may already have previously been pushed
> on the branch.
> 
>>  - I'm working on KVM, run all KVM enabled builds and tests
>>
>>  - I've changed the core TCG code, run everything that exercises that
>>
>>  - I'm working on ARM, only build and run jobs that have ARM targets
> 
> If the stuff you work on is fairly static, we could potentially
> allow env variables to be set by the user in their fork, which
> the CI jobs use to filter jobs.
> 
>> I think we should define a minimum set of lightweight smoke tests that
>> get the most bang for buck for catching sillies. I think checkpatch and
>> dco checking should probably be in there - and maybe one of the bog
>> standard build everything builds (maybe a random ../configure; make;
>> make check on one of the supported LTS targets).
> 
> Could we have allow an env var  "QEMU_CI_SMOKE_TEST=1" which can be
> set when pushing:
> 
>    git push  -o ci.variable="QEMU_CI_SMOKE_TEST=1"
> 
> 
> which causes it to only do the minimum neccessary.
> 
> Alternatively, invert this, so do minimum smoke test by default
> and require an env to run the full test. QEMU_CI_MAX=1
> 
> Potentially allow also  "QEMU_CI_EXTRA_JOBS=foo,bar,wizz"
> to match against job jnames ?

Is that what you mean?
https://www.mail-archive.com/qemu-devel@nongnu.org/msg758340.html

(cover https://www.mail-archive.com/qemu-devel@nongnu.org/msg758331.html)

>> Then there is the question of defaults. Should we default to a minimised
>> set unless asked or should the default be the full fat run everything?
> 
> With the direction gitlab is taking towards limiting CI minuts, it is
> probably a safer bet to do a minimal smoke test by default and only
> do the full test when definitely needed.

Yes please.
Daniel P. Berrangé April 19, 2021, 4:58 p.m. UTC | #16
On Mon, Apr 19, 2021 at 06:46:49PM +0200, Philippe Mathieu-Daudé wrote:
> On 4/19/21 6:22 PM, Daniel P. Berrangé wrote:
> > On Mon, Apr 19, 2021 at 04:57:55PM +0100, Alex Bennée wrote:
> >>
> >> Philippe Mathieu-Daudé <f4bug@amsat.org> writes:
> >>
> >>> Forks run the same jobs than mainstream, which might be overkill.
> >>> Allow them to easily rebase their custom set, while keeping using
> >>> the mainstream templates, and ability to pick specific jobs from
> >>> the mainstream set.
> >>>
> >>> To switch to your set, simply add your .gitlab-ci.yml as
> >>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where CI_PROJECT_NAMESPACE
> >>> is your gitlab 'namespace', usually username). This file will be
> >>> used instead of the default mainstream set.
> >>>
> >>> Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
> >>> ---
> >>>  .gitlab-ci.yml | 7 ++++++-
> >>>  1 file changed, 6 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
> >>> index 718c8e004be..35fd35075db 100644
> >>> --- a/.gitlab-ci.yml
> >>> +++ b/.gitlab-ci.yml
> >>> @@ -9,7 +9,12 @@ generate-config:
> >>>      paths:
> >>>        - generated-config.yml
> >>>    script:
> >>> -    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
> >>> +    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
> >>> +      then
> >>> +        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
> >>> +      else
> >>> +        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
> >>> +      fi
> >>
> >> This is going to be a little clunky. I can see a use for the static
> >> forks that Danial proposes but I guess what is needed is a little
> >> expressiveness. So how to express things like:
> >>
> >>  - I've only touched stuff in linux-user, so run only linux-user tests
> > 
> > This can be done with "rules" matching on files, but *only* if the
> > pipeline trigger is a merge request - specifically not a git branch
> > push, as the latter doesn't have the semantics you expect wrt
> > determining the "ancestor" to compare against. It only looks at commits
> > in the push, not those which may already have previously been pushed
> > on the branch.
> > 
> >>  - I'm working on KVM, run all KVM enabled builds and tests
> >>
> >>  - I've changed the core TCG code, run everything that exercises that
> >>
> >>  - I'm working on ARM, only build and run jobs that have ARM targets
> > 
> > If the stuff you work on is fairly static, we could potentially
> > allow env variables to be set by the user in their fork, which
> > the CI jobs use to filter jobs.
> > 
> >> I think we should define a minimum set of lightweight smoke tests that
> >> get the most bang for buck for catching sillies. I think checkpatch and
> >> dco checking should probably be in there - and maybe one of the bog
> >> standard build everything builds (maybe a random ../configure; make;
> >> make check on one of the supported LTS targets).
> > 
> > Could we have allow an env var  "QEMU_CI_SMOKE_TEST=1" which can be
> > set when pushing:
> > 
> >    git push  -o ci.variable="QEMU_CI_SMOKE_TEST=1"
> > 
> > 
> > which causes it to only do the minimum neccessary.
> > 
> > Alternatively, invert this, so do minimum smoke test by default
> > and require an env to run the full test. QEMU_CI_MAX=1
> > 
> > Potentially allow also  "QEMU_CI_EXTRA_JOBS=foo,bar,wizz"
> > to match against job jnames ?
> 
> Is that what you mean?
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg758340.html

Sort of - this is more implementing high level tags - I was actually
suggesting the explicit job names here.

eg if I see that my pull request to peter has failed on  job "foo",
then when testing fixes it is easier if I can just say run job "foo",
instead of trying to figure out which high level tag happens to pull
in job "foo".

The two approaches probably aren't mutually exclusive though.
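
A rough sketch of the explicit job name idea, assuming a hypothetical
job called "foo" and the QEMU_CI_EXTRA_JOBS variable suggested
earlier. The job name has to be spelled out in each job's own pattern:

```yaml
# Hypothetical job "foo"; the script is a placeholder.
foo:
  stage: test
  script:
    - echo "placeholder for the real job body"
  rules:
    # Run automatically when "foo" is one entry of the
    # comma-separated list, e.g. QEMU_CI_EXTRA_JOBS=foo,bar,wizz
    - if: '$QEMU_CI_EXTRA_JOBS =~ /(^|,)foo(,|$)/'
    # Otherwise keep the job available as a manual action.
    - when: manual
      allow_failure: true
```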

Regards,
Daniel
Philippe Mathieu-Daudé May 11, 2021, 6:48 a.m. UTC | #17
+Stefan/Peter

On 4/19/21 12:59 PM, Thomas Huth wrote:
> On 19/04/2021 12.51, Daniel P. Berrangé wrote:
>> On Mon, Apr 19, 2021 at 12:48:25PM +0200, Thomas Huth wrote:
>>> On 19/04/2021 12.36, Daniel P. Berrangé wrote:
>>>> On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
>>>>> On 19/04/2021 12.10, Erik Skultety wrote:
>>>>>> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
>>>>>>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé
>>>>>>> wrote:
>>>>>>>> Forks run the same jobs than mainstream, which might be overkill.
>>>>>>>> Allow them to easily rebase their custom set, while keeping using
>>>>>>>> the mainstream templates, and ability to pick specific jobs from
>>>>>>>> the mainstream set.
>>>>>>>>
>>>>>>>> To switch to your set, simply add your .gitlab-ci.yml as
>>>>>>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where
>>>>>>>> CI_PROJECT_NAMESPACE
>>>>>>>> is your gitlab 'namespace', usually username). This file will be
>>>>>>>> used instead of the default mainstream set.
>>>>>>>
>>>>>>> I find this approach undesirable, because AFAICT, it means you have
>>>>>>> to commit this extra file to any of your downstream branches that
>>>>>>> you want this to be used for.  Then you have to be either delete it
>>>>>>> again before sending patches upstream, or tell git-publish to
>>>>>>> exclude the commit that adds this.
>>>>>>>
>>>>>>> IMHO any per-contributor overhead needs to not involve committing
>>>>>>> stuff to their git branches, that isn't intended to go upstream.
>>>>>>
>>>>>> Not just that, ideally, they should also run all the upstream
>>>>>> workloads before
>>>>>> submitting a PR or posting patches because they'd have to respin
>>>>>> because of a
>>>>>> potential failure in upstream pipelines anyway.
>>>>>
>>>>> It's pretty clear that you want to run the full QEMU CI before
>>>>> submitting
>>>>> patches to the QEMU project, but I think we are rather talking
>>>>> about forks
>>>>> here that are meant not meant for immediately contributing to upstream
>>>>> again, like RHEL where we only build the KVM-related targets and
>>>>> certainly
>>>>> do not want to test other things like CPUs that are not capable of
>>>>> KVM, or a
>>>>> branch where Philippe only wants to check his MIPS-related work during
>>>>> development.
>>>>> For contributing patches to upstream, you certainly have to run the
>>>>> full CI,
>>>>> but for other things, it's sometimes really useful to cut down the CI
>>>>> machinery (I'm also doing this in my development branches manually
>>>>> some
>>>>> times to speed up the CI), so I think this series make sense, indeed.
>>>>
>>>> For a downstream like RHEL, I'd just expect them to replace the main
>>>> .gitlab-ci.yml entirely to suit their downstream needs.
>>>
>>> But that still means that we should clean up the main .gitlab-ci.yml
>>> file
>>> anyway, like it is done in this series, to avoid that you always get
>>> conflicts in this big file with your downstream-only modifications.
>>> So at
>>> least up to patch 11 or 12, I think this is a very valuable work that
>>> Philippe is doing here.
>>
>> I don't see a real issue with downstream conflicts. They'll just
>> periodically pick a release to base themselves off and change once
>> every 6 months or more.
> 
> It's not only downstream distros that rebase every 6 month. Like
> Philippe, I'm sometimes hacking my .gitlab-ci.yml of my development
> branch to speed up the CI during my development cycles (i.e. I'm
> removing the jobs that I do not need). And I'm regularly rebasing my
> development branchs. Conflicts in .gitlab-ci.yml are then always
> painful, so a leaner main .gitlab-ci.yml file would be helpful for me,
> too, indeed.

Not sure whether to follow up on this thread or start a new one, but
I got blocked again by Gitlab, tagged as a crypto miner for running
QEMU CI...
[1]
https://about.gitlab.com/handbook/support/workflows/investigate_blocked_pipeline.html#trends--high-priority-cases

I pushed 5 different branches to my repository in less than 1h,
kicking 580 jobs [*].

I didn't try to stress Gitlab, it was a simple "once a month, rebase
unmerged branches and push them before respinning on the mailing
list".

I'm considering changing my workflow:
- not push more than 2 branches per hour (I know 3/h works, so choose
  a lower number, as we want to add more tests).
- merge multiple branches locally and push the merged result and
  bisect / re-push on failure
- run less testing
- do not run testing

This sounds counterproductive and doesn't scale to a community of
contributors asked to use Gitlab.

So far I don't have a better idea than this series.

Who is interested in sending patches to improve our workflow?

Thanks,

Phil.

[*] NB I have 3 extra runners added to my namespace, but it didn't
help, as per [1] I got blocked by reaching an API rate limit.
Stefan Hajnoczi May 11, 2021, 1:55 p.m. UTC | #18
On Tue, May 11, 2021 at 08:48:44AM +0200, Philippe Mathieu-Daudé wrote:
> +Stefan/Peter
> 
> On 4/19/21 12:59 PM, Thomas Huth wrote:
> > On 19/04/2021 12.51, Daniel P. Berrangé wrote:
> >> On Mon, Apr 19, 2021 at 12:48:25PM +0200, Thomas Huth wrote:
> >>> On 19/04/2021 12.36, Daniel P. Berrangé wrote:
> >>>> On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
> >>>>> On 19/04/2021 12.10, Erik Skultety wrote:
> >>>>>> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
> >>>>>>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé
> >>>>>>> wrote:
> >>>>>>>> Forks run the same jobs than mainstream, which might be overkill.
> >>>>>>>> Allow them to easily rebase their custom set, while keeping using
> >>>>>>>> the mainstream templates, and ability to pick specific jobs from
> >>>>>>>> the mainstream set.
> >>>>>>>>
> >>>>>>>> To switch to your set, simply add your .gitlab-ci.yml as
> >>>>>>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where
> >>>>>>>> CI_PROJECT_NAMESPACE
> >>>>>>>> is your gitlab 'namespace', usually username). This file will be
> >>>>>>>> used instead of the default mainstream set.
> >>>>>>>
> >>>>>>> I find this approach undesirable, because AFAICT, it means you have
> >>>>>>> to commit this extra file to any of your downstream branches that
> >>>>>>> you want this to be used for.  Then you have to be either delete it
> >>>>>>> again before sending patches upstream, or tell git-publish to
> >>>>>>> exclude the commit that adds this.
> >>>>>>>
> >>>>>>> IMHO any per-contributor overhead needs to not involve committing
> >>>>>>> stuff to their git branches, that isn't intended to go upstream.
> >>>>>>
> >>>>>> Not just that, ideally, they should also run all the upstream
> >>>>>> workloads before
> >>>>>> submitting a PR or posting patches because they'd have to respin
> >>>>>> because of a
> >>>>>> potential failure in upstream pipelines anyway.
> >>>>>
> >>>>> It's pretty clear that you want to run the full QEMU CI before
> >>>>> submitting
> >>>>> patches to the QEMU project, but I think we are rather talking
> >>>>> about forks
> >>>>> here that are meant not meant for immediately contributing to upstream
> >>>>> again, like RHEL where we only build the KVM-related targets and
> >>>>> certainly
> >>>>> do not want to test other things like CPUs that are not capable of
> >>>>> KVM, or a
> >>>>> branch where Philippe only wants to check his MIPS-related work during
> >>>>> development.
> >>>>> For contributing patches to upstream, you certainly have to run the
> >>>>> full CI,
> >>>>> but for other things, it's sometimes really useful to cut down the CI
> >>>>> machinery (I'm also doing this in my development branches manually
> >>>>> some
> >>>>> times to speed up the CI), so I think this series make sense, indeed.
> >>>>
> >>>> For a downstream like RHEL, I'd just expect them to replace the main
> >>>> .gitlab-ci.yml entirely to suit their downstream needs.
> >>>
> >>> But that still means that we should clean up the main .gitlab-ci.yml
> >>> file
> >>> anyway, like it is done in this series, to avoid that you always get
> >>> conflicts in this big file with your downstream-only modifications.
> >>> So at
> >>> least up to patch 11 or 12, I think this is a very valuable work that
> >>> Philippe is doing here.
> >>
> >> I don't see a real issue with downstream conflicts. They'll just
> >> periodically pick a release to base themselves off and change once
> >> every 6 months or more.
> > 
> > It's not only downstream distros that rebase every 6 month. Like
> > Philippe, I'm sometimes hacking my .gitlab-ci.yml of my development
> > branch to speed up the CI during my development cycles (i.e. I'm
> > removing the jobs that I do not need). And I'm regularly rebasing my
> > development branchs. Conflicts in .gitlab-ci.yml are then always
> > painful, so a leaner main .gitlab-ci.yml file would be helpful for me,
> > too, indeed.
> 
> Not sure if following up this thread or start a new one, but I got
> blocked again from Gitlab, tagged as a crypto miner for running QEMU
> CI...
> [1]
> https://about.gitlab.com/handbook/support/workflows/investigate_blocked_pipeline.html#trends--high-priority-cases
> 
> I pushed 5 different branches to my repository in less than 1h,
> kicking 580 jobs [*].
> 
> I didn't try to stress Gitlab, it was a simple "one time in the month
> rebase unmerged branches, push them before respining on the mailing
> list".
> 
> I'm considering changing my workflow:
> - not push more than 2 branches per hour (I know 3/h works, so choose
>   a lower number, as we want to add more tests).
> - merge multiple branches locally and push the merged result and
>   bisect / re-push on failure
> - run less testing
> - do not run testing
> 
> This sounds counter productive and doesn't scale to a community of
> contributors asked to use Gitlab.
> 
> So far I don't have better idea than this series.
> 
> Who is interested in sending patches to improve our workflow?
> 
> Thanks,
> 
> Phil.
> 
> [*] NB I have 3 extra runners added to my namespace, but it didn't
> help, as per [1] I got blocked by reaching an API rate limit.

The easiest short-term workaround seems to be disabling testing when you
push certain branches.

In the long term I think GitLab CI should allow unlimited jobs on
dedicated runners. It may be necessary to get in touch with GitLab
support and figure out how to stop it blocking dedicated runner jobs.
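
As a sketch of the short-term workaround, assuming a hypothetical
naming convention where branches starting with "backup-" are never
tested, a top-level workflow rule could suppress their pipelines:

```yaml
# Hypothetical convention: "backup-*" branches never start a pipeline.
# For a one-off push, the existing GitLab push option
# "git push -o ci.skip" has the same effect without any config change.
workflow:
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^backup-/'
      when: never
    - when: always
```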

Stefan
Alex Bennée May 11, 2021, 2 p.m. UTC | #19
Philippe Mathieu-Daudé <f4bug@amsat.org> writes:

> +Stefan/Peter
>
> On 4/19/21 12:59 PM, Thomas Huth wrote:
>> On 19/04/2021 12.51, Daniel P. Berrangé wrote:
>>> On Mon, Apr 19, 2021 at 12:48:25PM +0200, Thomas Huth wrote:
>>>> On 19/04/2021 12.36, Daniel P. Berrangé wrote:
>>>>> On Mon, Apr 19, 2021 at 12:20:55PM +0200, Thomas Huth wrote:
>>>>>> On 19/04/2021 12.10, Erik Skultety wrote:
>>>>>>> On Mon, Apr 19, 2021 at 10:40:53AM +0100, Daniel P. Berrangé wrote:
>>>>>>>> On Mon, Apr 19, 2021 at 01:34:47AM +0200, Philippe Mathieu-Daudé
>>>>>>>> wrote:
>>>>>>>>> Forks run the same jobs than mainstream, which might be overkill.
>>>>>>>>> Allow them to easily rebase their custom set, while keeping using
>>>>>>>>> the mainstream templates, and ability to pick specific jobs from
>>>>>>>>> the mainstream set.
>>>>>>>>>
>>>>>>>>> To switch to your set, simply add your .gitlab-ci.yml as
>>>>>>>>> .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml (where
>>>>>>>>> CI_PROJECT_NAMESPACE
>>>>>>>>> is your gitlab 'namespace', usually username). This file will be
>>>>>>>>> used instead of the default mainstream set.
>>>>>>>>
>>>>>>>> I find this approach undesirable, because AFAICT, it means you have
>>>>>>>> to commit this extra file to any of your downstream branches that
>>>>>>>> you want this to be used for.  Then you have to be either delete it
>>>>>>>> again before sending patches upstream, or tell git-publish to
>>>>>>>> exclude the commit that adds this.
>>>>>>>>
>>>>>>>> IMHO any per-contributor overhead needs to not involve committing
>>>>>>>> stuff to their git branches, that isn't intended to go upstream.
>>>>>>>
>>>>>>> Not just that, ideally, they should also run all the upstream
>>>>>>> workloads before
>>>>>>> submitting a PR or posting patches because they'd have to respin
>>>>>>> because of a
>>>>>>> potential failure in upstream pipelines anyway.
>>>>>>
>>>>>> It's pretty clear that you want to run the full QEMU CI before
>>>>>> submitting
>>>>>> patches to the QEMU project, but I think we are rather talking
>>>>>> about forks
>>>>>> here that are meant not meant for immediately contributing to upstream
>>>>>> again, like RHEL where we only build the KVM-related targets and
>>>>>> certainly
>>>>>> do not want to test other things like CPUs that are not capable of
>>>>>> KVM, or a
>>>>>> branch where Philippe only wants to check his MIPS-related work during
>>>>>> development.
>>>>>> For contributing patches to upstream, you certainly have to run the
>>>>>> full CI,
>>>>>> but for other things, it's sometimes really useful to cut down the CI
>>>>>> machinery (I'm also doing this in my development branches manually
>>>>>> some
>>>>>> times to speed up the CI), so I think this series make sense, indeed.
>>>>>
>>>>> For a downstream like RHEL, I'd just expect them to replace the main
>>>>> .gitlab-ci.yml entirely to suit their downstream needs.
>>>>
>>>> But that still means that we should clean up the main .gitlab-ci.yml
>>>> file
>>>> anyway, like it is done in this series, to avoid that you always get
>>>> conflicts in this big file with your downstream-only modifications.
>>>> So at
>>>> least up to patch 11 or 12, I think this is a very valuable work that
>>>> Philippe is doing here.
>>>
>>> I don't see a real issue with downstream conflicts. They'll just
>>> periodically pick a release to base themselves off and change once
>>> every 6 months or more.
>> 
>> It's not only downstream distros that rebase every 6 month. Like
>> Philippe, I'm sometimes hacking my .gitlab-ci.yml of my development
>> branch to speed up the CI during my development cycles (i.e. I'm
>> removing the jobs that I do not need). And I'm regularly rebasing my
>> development branchs. Conflicts in .gitlab-ci.yml are then always
>> painful, so a leaner main .gitlab-ci.yml file would be helpful for me,
>> too, indeed.
>
> Not sure if following up this thread or start a new one, but I got
> blocked again from Gitlab, tagged as a crypto miner for running QEMU
> CI...
> [1]
> https://about.gitlab.com/handbook/support/workflows/investigate_blocked_pipeline.html#trends--high-priority-cases
>
> I pushed 5 different branches to my repository in less than 1h,
> kicking 580 jobs [*].
>
> I didn't try to stress Gitlab, it was a simple "one time in the month
> rebase unmerged branches, push them before respining on the mailing
> list".
>
> I'm considering changing my workflow:
> - not push more than 2 branches per hour (I know 3/h works, so choose
>   a lower number, as we want to add more tests).
> - merge multiple branches locally and push the merged result and
>   bisect / re-push on failure

I stack my branches - so usually I have a:

 testing/next
 gdb/next
 whatever my current hack is

Every week I re-base the branches and re-build my current hacking tree.
If an actual problem shows up in CI I'll bisect on one of my beefy boxes
to fix it and then fix and re-push testing/next and whatever my tip is.

> - run less testing
> - do not run testing

I run a lot of testing locally (or rather on a beefy server) so I'm
really only using GitLab for final validation of trees rather than
day-to-day.

>
> This sounds counter productive and doesn't scale to a community of
> contributors asked to use Gitlab.
>
> So far I don't have better idea than this series.
>
> Who is interested in sending patches to improve our workflow?
>
> Thanks,
>
> Phil.
>
> [*] NB I have 3 extra runners added to my namespace, but it didn't
> help, as per [1] I got blocked by reaching an API rate limit.
Daniel P. Berrangé May 11, 2021, 2:21 p.m. UTC | #20
On Tue, May 11, 2021 at 08:48:44AM +0200, Philippe Mathieu-Daudé wrote:
> +Stefan/Peter
> 
> Not sure if following up this thread or start a new one, but I got
> blocked again from Gitlab, tagged as a crypto miner for running QEMU
> CI...
> [1]
> https://about.gitlab.com/handbook/support/workflows/investigate_blocked_pipeline.html#trends--high-priority-cases
> 
> I pushed 5 different branches to my repository in less than 1h,
> kicking 580 jobs [*].
> 
> I didn't try to stress Gitlab, it was a simple "one time in the month
> rebase unmerged branches, push them before respining on the mailing
> list".
> 
> I'm considering changing my workflow:
> - not push more than 2 branches per hour (I know 3/h works, so choose
>   a lower number, as we want to add more tests).
> - merge multiple branches locally and push the merged result and
>   bisect / re-push on failure
> - run less testing
> - do not run testing
> 
> This sounds counter productive and doesn't scale to a community of
> contributors asked to use Gitlab.
> 
> So far I don't have better idea than this series.
> 
> Who is interested in sending patches to improve our workflow?

So we have a few scenarios for using the CI

 1. Running gating CI before merging to master
 2. Subsystem maintainers running CI before sending a PULL req
 3. Contributors running CI before sending a patch series

Right now we have the same jobs running in all three scenarios.

Given the increasing restrictions on usage, we clearly need to cut
down in general and also make it so that it is harder to accidentally
burn all your available CI allowance.

Currently we always run CI whenever pushing to gitlab. This is
convenient but in retrospect it is overkill. People often push
to gitlab simply as their backup strategy and thus don't need
CI run every time.

Not all changes require all possible jobs to be run, but it is hard
to filter jobs when we're triggering them based on pushes, as the
baseline against which file changes are identified is ill-defined.


For scenario (1) we need all the jobs run to maximise quality.
This is also a case where we're most likely to have custom runners
available, so CI allowance is less of a concern. The job count still
needs to be reasonable to avoid hitting issues at times when the
merges are frequent (just before freeze).


For scenario (2) subsys maintainers, we want them to minimize
the likelihood that a pull request will fail scenario (1) and
require a respin.  Running all jobs achieves this but it is
likely overkill.

eg we have 24 cross compiler builds. If we expect most maintainers
will have either x86-64 or aarch64 hardware for their primary dev
platform, then the key benefit of cross compilers is getting coverage
of

 - 32-bit
 - big endian
 - windows

We don't need 24 jobs to do that. We could simply pick armel as the
most relevant 32-bit arch and s390x as the most relevant big endian
arch, and then the win32/64 platforms. IOW we could potentially only
run 4-6 jobs instead of 24, and still get excellent arch coverage.

Similarly for native builds we test quite a lot of different distros.
I think we probably can rationalize that down to just 2 distros,
one covering oldest packages (Debian Stretch) and one covering newest
(Fedora 34), and a "build everything" config.

We have many other jobs that are testing various obscure combinations
of configure args. I'd suggest these rarely fail for most pull requests
so are overkill.

For subsystem maintainers we could potentially get down to just 10-15
jobs if we're ambitious. Leave everything else as manual trigger only.

Perhaps set all the jobs to only run on certain branch name patterns.
eg perhaps "*-next" filter is common for subsystem maintainer's pending
branches ?

For general contributors a similarly short set of jobs to subsystem
maintainers is viable. Perhaps again just let them use a "-next"
branch.

If we can enable manual triggers on any other branches that's good.
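
A minimal sketch of the branch-name filter, assuming a hypothetical
hidden template and job name. Jobs extending the template run
automatically on branches ending in "-next" and stay available as
manual actions everywhere else:

```yaml
# Hypothetical hidden template that the regular jobs could extend.
.run_on_next_branches:
  rules:
    # Branches such as "testing-next" or "virtio-next" run the job.
    - if: '$CI_COMMIT_BRANCH =~ /-next$/'
      when: on_success
    # Any other branch only gets a manual trigger.
    - when: manual
      allow_failure: true

# Hypothetical job using the template; the script is a placeholder.
build-system-example:
  extends: .run_on_next_branches
  stage: build
  script:
    - echo "placeholder for the real build steps"
```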

Regards,
Daniel
Philippe Mathieu-Daudé May 13, 2021, 7:01 p.m. UTC | #21
On 5/11/21 8:48 AM, Philippe Mathieu-Daudé wrote:
> Not sure if following up this thread or start a new one, but I got
> blocked again from Gitlab, tagged as a crypto miner for running QEMU
> CI...
> [1]
> https://about.gitlab.com/handbook/support/workflows/investigate_blocked_pipeline.html#trends--high-priority-cases
> 
> I pushed 5 different branches to my repository in less than 1h,
> kicking 580 jobs [*].
> 
> I didn't try to stress Gitlab, it was a simple "one time in the month
> rebase unmerged branches, push them before respining on the mailing
> list".

FYI I got my account unlocked (without any notification update).
36h passed, maybe it is something automatic (block the user for
36h if suspected of crypto mining?).

Patch

diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 718c8e004be..35fd35075db 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -9,7 +9,12 @@  generate-config:
     paths:
       - generated-config.yml
   script:
-    - cp .gitlab-ci.d/qemu-project.yml generated-config.yml
+    - if test -e .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml ;
+      then
+        cp .gitlab-ci.d/${CI_PROJECT_NAMESPACE}.yml generated-config.yml ;
+      else
+        cp .gitlab-ci.d/qemu-project.yml generated-config.yml ;
+      fi
 
 generate-pipeline:
   stage: test