[06/13] tests/avocado: use more distinct names for assets

Message ID 20240726134438.14720-7-crosa@redhat.com
State New
Series Bump Avocado to 103.0 LTS and update tests for compatibility and new features

Commit Message

Cleber Rosa July 26, 2024, 1:44 p.m. UTC
Avocado's asset system will deposit files in a cache organized either
by their original location (the URI) or by their names.  Because the
cache (and the "by_name" sub directory) is common across tests, it's a
good idea to make these names as distinct as possible.

This avoids name clashes, which make future Avocado runs attempt to
re-download assets that share a name but actually come from different
locations.  Such clashes cause cache misses, extra downloads, and
possibly canceled tests.

Signed-off-by: Cleber Rosa <crosa@redhat.com>
---
 tests/avocado/kvm_xen_guest.py  | 3 ++-
 tests/avocado/netdev-ethtool.py | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)
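
To illustrate the name clash described above, here is a toy model of a
cache keyed only by asset name.  It is a sketch under stated
assumptions, not Avocado's implementation: the NameKeyedCache class,
its fetch() method, the run_twice() helper, and the byte strings
standing in for the two kernel images are all hypothetical.

```python
import hashlib


class NameKeyedCache:
    """Toy cache keyed only by asset name (hypothetical, not Avocado code)."""

    def __init__(self):
        self._entries = {}  # name -> (sha1 hex digest, content)

    def fetch(self, name, expected_sha1, download):
        """Return (content, cache_hit); an entry only counts if its hash matches."""
        entry = self._entries.get(name)
        if entry is not None and entry[0] == expected_sha1:
            return entry[1], True
        content = download()  # cache miss: fetch again from the real location
        actual_sha1 = hashlib.sha1(content).hexdigest()
        if actual_sha1 != expected_sha1:
            raise ValueError(f"hash mismatch for {name}")
        self._entries[name] = (actual_sha1, content)
        return content, False


def run_twice(assets):
    """Fetch every (name, content) pair in two consecutive runs; record hits."""
    cache = NameKeyedCache()
    hits = []
    for _ in range(2):
        for name, content in assets:
            sha1 = hashlib.sha1(content).hexdigest()
            _, hit = cache.fetch(name, sha1, lambda c=content: c)
            hits.append(hit)
    return hits


# Placeholder bytes standing in for the two different kernel images.
KVM_XEN = b"kvm-xen-guest kernel"
ETHTOOL = b"netdev-ethtool kernel"

clashing = run_twice([("bzImage", KVM_XEN), ("bzImage", ETHTOOL)])
distinct = run_twice([("qemu-kvm-xen-guest-bzImage", KVM_XEN),
                      ("qemu-netdev-ethtool-bzImage", ETHTOOL)])
print(clashing)
print(distinct)
```

In this model, with the shared name every fetch in the second run is a
miss, because each test overwrites the other's entry.  With the
prefixed names the second run is served entirely from the cache.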

Comments

Daniel P. Berrangé July 29, 2024, 10:49 a.m. UTC | #1
On Fri, Jul 26, 2024 at 09:44:31AM -0400, Cleber Rosa wrote:
> Avocado's asset system will deposit files in a cache organized either
> by their original location (the URI) or by their names.  Because the
> cache (and the "by_name" sub directory) is common across tests, it's a
> good idea to make these names as distinct as possible.
> 
> This avoid name clashes, which makes future Avocado runs to attempt to
> redownload the assets with the same name, but from the different
> locations they actually are from.  This causes cache misses, extra
> downloads, and possibly canceled tests.
> 
> Signed-off-by: Cleber Rosa <crosa@redhat.com>
> ---
>  tests/avocado/kvm_xen_guest.py  | 3 ++-
>  tests/avocado/netdev-ethtool.py | 3 ++-
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/tests/avocado/kvm_xen_guest.py b/tests/avocado/kvm_xen_guest.py
> index f8cb458d5d..318fadebc3 100644
> --- a/tests/avocado/kvm_xen_guest.py
> +++ b/tests/avocado/kvm_xen_guest.py
> @@ -40,7 +40,8 @@ def get_asset(self, name, sha1):
>          url = base_url + name
>          # use explicit name rather than failing to neatly parse the
>          # URL into a unique one
> -        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
> +        return self.fetch_asset(name=f"qemu-kvm-xen-guest-{name}",
> +                                locations=(url), asset_hash=sha1)

Why do we need to pass a name here at all? I see the comment here,
but it isn't very clear about what the problem is. It just feels
wrong to create naming uniqueness problems for ourselves when we
have a nicely unique URL. That cached URL can be shared across
tests, whereas the custom names added by this patch prevent the
same URL from being cached once and reused between tests.

>  
>      def common_vm_setup(self):
>          # We also catch lack of KVM_XEN support if we fail to launch
> diff --git a/tests/avocado/netdev-ethtool.py b/tests/avocado/netdev-ethtool.py
> index 5f33288f81..462cf8de7d 100644
> --- a/tests/avocado/netdev-ethtool.py
> +++ b/tests/avocado/netdev-ethtool.py
> @@ -27,7 +27,8 @@ def get_asset(self, name, sha1):
>          url = base_url + name
>          # use explicit name rather than failing to neatly parse the
>          # URL into a unique one
> -        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
> +        return self.fetch_asset(name=f"qemu-netdev-ethtool-{name}",
> +                                locations=(url), asset_hash=sha1)
>  
>      def common_test_code(self, netdev, extra_args=None):
>  
> -- 
> 2.45.2
> 
> 

With regards,
Daniel
Philippe Mathieu-Daudé July 29, 2024, 11:54 a.m. UTC | #2
On 29/7/24 12:49, Daniel P. Berrangé wrote:
> On Fri, Jul 26, 2024 at 09:44:31AM -0400, Cleber Rosa wrote:
>> Avocado's asset system will deposit files in a cache organized either
>> by their original location (the URI) or by their names.  Because the
>> cache (and the "by_name" sub directory) is common across tests, it's a
>> good idea to make these names as distinct as possible.
>>
>> This avoid name clashes, which makes future Avocado runs to attempt to
>> redownload the assets with the same name, but from the different
>> locations they actually are from.  This causes cache misses, extra
>> downloads, and possibly canceled tests.
>>
>> Signed-off-by: Cleber Rosa <crosa@redhat.com>
>> ---
>>   tests/avocado/kvm_xen_guest.py  | 3 ++-
>>   tests/avocado/netdev-ethtool.py | 3 ++-
>>   2 files changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/tests/avocado/kvm_xen_guest.py b/tests/avocado/kvm_xen_guest.py
>> index f8cb458d5d..318fadebc3 100644
>> --- a/tests/avocado/kvm_xen_guest.py
>> +++ b/tests/avocado/kvm_xen_guest.py
>> @@ -40,7 +40,8 @@ def get_asset(self, name, sha1):
>>           url = base_url + name
>>           # use explicit name rather than failing to neatly parse the
>>           # URL into a unique one
>> -        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
>> +        return self.fetch_asset(name=f"qemu-kvm-xen-guest-{name}",
>> +                                locations=(url), asset_hash=sha1)
> 
> Why do we need to pass a name here at all ? I see the comment here
> but it isn't very clear about what the problem is. It just feels
> wrong to be creating ourselves uniqueness naming problems, when we
> have a nicely unique URL, and that cached URL can be shared across
> tests, where as the custom names added by this patch are forcing
> no-caching of the same URL between tests.

I thought $name was purely for debugging: the file is downloaded
to a temporary location and, if the hash matches, it is renamed
in the cache to $asset_hash, which is unique. This was suggested
in order to avoid dealing with URL updates for the same asset.
Isn't that the case?
Cleber Rosa Aug. 1, 2024, 3:12 a.m. UTC | #3
On Mon, Jul 29, 2024 at 6:49 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> On Fri, Jul 26, 2024 at 09:44:31AM -0400, Cleber Rosa wrote:
> > Avocado's asset system will deposit files in a cache organized either
> > by their original location (the URI) or by their names.  Because the
> > cache (and the "by_name" sub directory) is common across tests, it's a
> > good idea to make these names as distinct as possible.
> >
> > This avoid name clashes, which makes future Avocado runs to attempt to
> > redownload the assets with the same name, but from the different
> > locations they actually are from.  This causes cache misses, extra
> > downloads, and possibly canceled tests.
> >
> > Signed-off-by: Cleber Rosa <crosa@redhat.com>
> > ---
> >  tests/avocado/kvm_xen_guest.py  | 3 ++-
> >  tests/avocado/netdev-ethtool.py | 3 ++-
> >  2 files changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/tests/avocado/kvm_xen_guest.py b/tests/avocado/kvm_xen_guest.py
> > index f8cb458d5d..318fadebc3 100644
> > --- a/tests/avocado/kvm_xen_guest.py
> > +++ b/tests/avocado/kvm_xen_guest.py
> > @@ -40,7 +40,8 @@ def get_asset(self, name, sha1):
> >          url = base_url + name
> >          # use explicit name rather than failing to neatly parse the
> >          # URL into a unique one
> > -        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
> > +        return self.fetch_asset(name=f"qemu-kvm-xen-guest-{name}",
> > +                                locations=(url), asset_hash=sha1)
>
> Why do we need to pass a name here at all ? I see the comment here
> but it isn't very clear about what the problem is. It just feels
> wrong to be creating ourselves uniqueness naming problems, when we
> have a nicely unique URL, and that cached URL can be shared across
> tests, where as the custom names added by this patch are forcing
> no-caching of the same URL between tests.
>

In light of your comment, I agree that this adds some unneeded
maintenance burden.  Also, this was one of my pre-Avocado-bump
patches, meant to work around issues present in versions < 103.0.
But let me give the complete answer.

Under 88.1 the "uniqueness" of the URL did not consider the query
parameters in the URL.  So, under 88.1:

   avocado.utils.asset.Asset(name='bzImage',
       locations=['https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?path=%2Fkvm-xen-guest&files=bzImage',
       ...)
   avocado.utils.asset.Asset(name='bzImage',
       locations=['https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?path=%2Fnetdev-ethtool&files=bzImage',
       ...)

would both save their content to the same location:
/tmp/cache_old/by_location/2a8ecd750eb952504ad96b89576207afe1be6a8f/download.

This is no longer the case on 103.0 (actually since 92.0): the
contents of those exact assets would be saved to
'/by_location/415c998a0061347e5115da53d57ea92c908a2e7f/path=%2Fkvm-xen-guest&files=bzImage'
and '/by_location/415c998a0061347e5115da53d57ea92c908a2e7f/path=%2Fnetdev-ethtool&files=bzImage'.

I personally don't like having the files named, although uniquely,
after the query parameters.  But if this doesn't bother others more
than the maintenance burden, and the Avocado version bump is applied,
this patch can be dropped.
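
The difference between the two behaviors can be modeled with a small
sketch.  This is an assumption-laden illustration, not Avocado's
actual code: the helper names key_ignoring_query() and
key_including_query() are hypothetical, and the directory hashes they
produce are not expected to match the ones quoted above.  Only the
shape of the resulting paths is meant to mirror them.

```python
import hashlib
from urllib.parse import urlsplit

BASE = "https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download"
KVM_URL = BASE + "?path=%2Fkvm-xen-guest&files=bzImage"
ETH_URL = BASE + "?path=%2Fnetdev-ethtool&files=bzImage"


def _directory(parts):
    # Hash only scheme, host and path; the query string is left out.
    base = f"{parts.scheme}://{parts.netloc}{parts.path}"
    return hashlib.sha1(base.encode()).hexdigest()


def key_ignoring_query(url):
    """Model of the older behavior: file named after the last path component."""
    parts = urlsplit(url)
    filename = parts.path.rsplit("/", 1)[-1]
    return f"by_location/{_directory(parts)}/{filename}"


def key_including_query(url):
    """Model of the newer behavior: file named after the query string."""
    parts = urlsplit(url)
    return f"by_location/{_directory(parts)}/{parts.query}"


# Both URLs map to one path when the query is ignored ...
print(key_ignoring_query(KVM_URL) == key_ignoring_query(ETH_URL))
# ... and to two distinct paths when the query names the file.
print(key_including_query(KVM_URL))
print(key_including_query(ETH_URL))
```

In this model the two assets collide on ".../download" in the first
case and are kept apart by their query strings in the second, which is
the behavior change described above.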
Cleber Rosa Aug. 1, 2024, 3:20 a.m. UTC | #4
On Mon, Jul 29, 2024 at 7:54 AM Philippe Mathieu-Daudé
<philmd@linaro.org> wrote:
>
> On 29/7/24 12:49, Daniel P. Berrangé wrote:
> > On Fri, Jul 26, 2024 at 09:44:31AM -0400, Cleber Rosa wrote:
> >> Avocado's asset system will deposit files in a cache organized either
> >> by their original location (the URI) or by their names.  Because the
> >> cache (and the "by_name" sub directory) is common across tests, it's a
> >> good idea to make these names as distinct as possible.
> >>
> >> This avoid name clashes, which makes future Avocado runs to attempt to
> >> redownload the assets with the same name, but from the different
> >> locations they actually are from.  This causes cache misses, extra
> >> downloads, and possibly canceled tests.
> >>
> >> Signed-off-by: Cleber Rosa <crosa@redhat.com>
> >> ---
> >>   tests/avocado/kvm_xen_guest.py  | 3 ++-
> >>   tests/avocado/netdev-ethtool.py | 3 ++-
> >>   2 files changed, 4 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/tests/avocado/kvm_xen_guest.py b/tests/avocado/kvm_xen_guest.py
> >> index f8cb458d5d..318fadebc3 100644
> >> --- a/tests/avocado/kvm_xen_guest.py
> >> +++ b/tests/avocado/kvm_xen_guest.py
> >> @@ -40,7 +40,8 @@ def get_asset(self, name, sha1):
> >>           url = base_url + name
> >>           # use explicit name rather than failing to neatly parse the
> >>           # URL into a unique one
> >> -        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
> >> +        return self.fetch_asset(name=f"qemu-kvm-xen-guest-{name}",
> >> +                                locations=(url), asset_hash=sha1)
> >
> > Why do we need to pass a name here at all ? I see the comment here
> > but it isn't very clear about what the problem is. It just feels
> > wrong to be creating ourselves uniqueness naming problems, when we
> > have a nicely unique URL, and that cached URL can be shared across
> > tests, where as the custom names added by this patch are forcing
> > no-caching of the same URL between tests.
>
> I thought $name was purely for debugging; the file was downloaded
> in a temporary location, and if the hash matched, it was renamed
> in the cache as $asset_hash which is unique. This was suggested
> in order to avoid dealing with URL updates for the same asset.
> Isn't it the case?
>

Hi Philippe,

I've replied to Daniel's question, but let me repeat the relevant
parts of the $name behavior here.

---

Under 88.1 the "uniqueness" of the URL did not consider the query
parameters in the URL.  So, under 88.1:

   avocado.utils.asset.Asset(name='bzImage',
       locations=['https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?path=%2Fkvm-xen-guest&files=bzImage',
       ...)
   avocado.utils.asset.Asset(name='bzImage',
       locations=['https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?path=%2Fnetdev-ethtool&files=bzImage',
       ...)

would both save their content to the same location:
/tmp/cache_old/by_location/2a8ecd750eb952504ad96b89576207afe1be6a8f/download.

This is no longer the case on 103.0 (actually since 92.0): the
contents of those exact assets would be saved to
'/by_location/415c998a0061347e5115da53d57ea92c908a2e7f/path=%2Fkvm-xen-guest&files=bzImage'
and '/by_location/415c998a0061347e5115da53d57ea92c908a2e7f/path=%2Fnetdev-ethtool&files=bzImage'.

I personally don't like having the files named, although uniquely,
after the query parameters.  But if this doesn't bother others more
than the maintenance burden, and the Avocado version bump is applied,
this patch can be dropped.

---

Now, an update: instead of dropping the patch, it could be simplified
by keeping just the "name" parameter (with the URL) and dropping the
"locations" parameter (that has a single location).

Let me know if this makes sense and what your take on it is.

Regards,
- Cleber.
Alex Bennée Aug. 1, 2024, 4:05 p.m. UTC | #5
Cleber Rosa <crosa@redhat.com> writes:

> On Mon, Jul 29, 2024 at 6:49 AM Daniel P. Berrangé <berrange@redhat.com> wrote:
>>
>> On Fri, Jul 26, 2024 at 09:44:31AM -0400, Cleber Rosa wrote:
>> > Avocado's asset system will deposit files in a cache organized either
>> > by their original location (the URI) or by their names.  Because the
>> > cache (and the "by_name" sub directory) is common across tests, it's a
>> > good idea to make these names as distinct as possible.
>> >
>> > This avoid name clashes, which makes future Avocado runs to attempt to
>> > redownload the assets with the same name, but from the different
>> > locations they actually are from.  This causes cache misses, extra
>> > downloads, and possibly canceled tests.
>> >
>> > Signed-off-by: Cleber Rosa <crosa@redhat.com>
>> > ---
>> >  tests/avocado/kvm_xen_guest.py  | 3 ++-
>> >  tests/avocado/netdev-ethtool.py | 3 ++-
>> >  2 files changed, 4 insertions(+), 2 deletions(-)
>> >
>> > diff --git a/tests/avocado/kvm_xen_guest.py b/tests/avocado/kvm_xen_guest.py
>> > index f8cb458d5d..318fadebc3 100644
>> > --- a/tests/avocado/kvm_xen_guest.py
>> > +++ b/tests/avocado/kvm_xen_guest.py
>> > @@ -40,7 +40,8 @@ def get_asset(self, name, sha1):
>> >          url = base_url + name
>> >          # use explicit name rather than failing to neatly parse the
>> >          # URL into a unique one
>> > -        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
>> > +        return self.fetch_asset(name=f"qemu-kvm-xen-guest-{name}",
>> > +                                locations=(url), asset_hash=sha1)
>>
>> Why do we need to pass a name here at all ? I see the comment here
>> but it isn't very clear about what the problem is. It just feels
>> wrong to be creating ourselves uniqueness naming problems, when we
>> have a nicely unique URL, and that cached URL can be shared across
>> tests, where as the custom names added by this patch are forcing
>> no-caching of the same URL between tests.
>>
>
> Now with your comment, I do agree that this adds some unneeded
> maintenance burden indeed.  Also, this was part of my pre-avocado bump
> patches that would work around issues present in < 103.0.  But let me
> give the complete answer.
>
> Under 88.1 the "uniqueness" of the URL did not consider the query
> parameters in the URL.  So, under 88.1:
>
>    avocado.utils.asset.Asset(name='bzImage',
> locations=['https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?path=%2Fkvm-xen-guest&files=bzImage',
> ...)
>    avocado.utils.asset.Asset(name='bzImage',
> locations=['https://fileserver.linaro.org/s/kE4nCFLdQcoBF9t/download?path=%2Fnetdev-ethtool&files=bzImage',
> ...)

This is mostly a hack to avoid having to tell NextCloud to generate a
unique sharing URL for every file.

>
> Would save content to the same location:
> /tmp/cache_old/by_location/2a8ecd750eb952504ad96b89576207afe1be6a8f/download.
>
> This is no longer the case on 103.0 (actually since 92.0), the
> contents of those exact assets would be saved to
> '/by_location/415c998a0061347e5115da53d57ea92c908a2e7f/path=%2Fkvm-xen-guest&files=bzImage'
> and /by_location/415c998a0061347e5115da53d57ea92c908a2e7f/path=%2Fnetdev-ethtool&files=bzImage'.
>
> I personally don't like having the files named, although uniquely,
> after the query parameters.  But, If this doesn't bother others more
> than the maintenance burden, and Avocado version bump is applied, this
> patch can be dropped.

Patch

diff --git a/tests/avocado/kvm_xen_guest.py b/tests/avocado/kvm_xen_guest.py
index f8cb458d5d..318fadebc3 100644
--- a/tests/avocado/kvm_xen_guest.py
+++ b/tests/avocado/kvm_xen_guest.py
@@ -40,7 +40,8 @@  def get_asset(self, name, sha1):
         url = base_url + name
         # use explicit name rather than failing to neatly parse the
         # URL into a unique one
-        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
+        return self.fetch_asset(name=f"qemu-kvm-xen-guest-{name}",
+                                locations=(url), asset_hash=sha1)
 
     def common_vm_setup(self):
         # We also catch lack of KVM_XEN support if we fail to launch
diff --git a/tests/avocado/netdev-ethtool.py b/tests/avocado/netdev-ethtool.py
index 5f33288f81..462cf8de7d 100644
--- a/tests/avocado/netdev-ethtool.py
+++ b/tests/avocado/netdev-ethtool.py
@@ -27,7 +27,8 @@  def get_asset(self, name, sha1):
         url = base_url + name
         # use explicit name rather than failing to neatly parse the
         # URL into a unique one
-        return self.fetch_asset(name=name, locations=(url), asset_hash=sha1)
+        return self.fetch_asset(name=f"qemu-netdev-ethtool-{name}",
+                                locations=(url), asset_hash=sha1)
 
     def common_test_code(self, netdev, extra_args=None):