diff mbox series

[v2,1/3] compare_versions: make is_oldstyle_version fail when number > 65535

Message ID 20211015041851.2122952-1-dominique.martinet@atmark-techno.com
State Accepted
Headers show
Series [v2,1/3] compare_versions: make is_oldstyle_version fail when number > 65535 | expand

Commit Message

Dominique MARTINET Oct. 15, 2021, 4:18 a.m. UTC
is_oldstyle_version would accept any string comprised of digits and dots.
Parse the actual version in is_oldstyle_version to detect large numbers and
fall back to semver if a large number was given, as semver can handle bigger
values.
Note that if this happens on the 4th level, this result in that number
actually being ignored as semver only handles up to major.minor.patch
levels.

Also print trace messages in case of parse failure (too many digits or number
too big) to facilitate debugging.

Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---

So here is a version that checks directly that each field does not
overflow 16 bits and falls back to semver.
Honestly this looks more complicated than the first patch I had sent, so
I would rather not merge this -- the fact that the 4th field just starts
disappearing if any of the field goes > 65535 is also hard to understand
for users so I will just enforce the limit in our image creation tool
and have users relying on date try something else (adding dots or
setting it in prerelease part of semver)

Let's drop this one.


The later 2 patches are valid regardless. One is pure style, feel free
to ignore it, I was just annoyed by it.
Then documentation as promised.


 core/artifacts_versions.c | 61 +++++++++++++++++++++++----------------
 1 file changed, 36 insertions(+), 25 deletions(-)

Comments

Stefano Babic Oct. 15, 2021, 9:17 a.m. UTC | #1
Hi Dominique,

On 15.10.21 06:18, Dominique Martinet wrote:
> is_oldstyle_version would accept any string comprised of digits and dots.
> Parse the actual version in is_oldstyle_version to detect large numbers and
> fall back to semver if a large number was given, as semver can handle bigger
> values.
> Note that if this happens on the 4th level, this result in that number
> actually being ignored as semver only handles up to major.minor.patch
> levels.
> 
> Also print trace messages in case of parse failure (too many digits or number
> too big) to facilitate debugging.
> 
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> ---
> 
> So here is a version that checks directly that each field does not
> overflow 16 bits and falls back to semver.
> Honestly this looks more complicated than the first patch

The most important think IMHO is to make very clear which schemas 
SWUpdate supports for versioning. I am aware that we cannot support any 
versioning schema a customer or a company are brought, and we have to 
expose what SWUpdate supports when it starts to make version comparisons.

I will take over the documentation part, and I will add a chapter on 
this. At the end, a user should know that they can use:

	- a x.y.z.t schema, with just plain numbers in range 0..65535
	- a semantic version (with link to documentation), alphanumeric.

Any other versioning schema can be used in "version" attribute, but 
leads to unpredictable results. If rules are well defined, it is easier 
for users to follow them.

> I had sent, so
> I would rather not merge this -- the fact that the 4th field just starts
> disappearing if any of the field goes > 65535 is also hard to understand

Becuase it is today poor documented. If this is documented, and they 
know they can use, it should be not a problem. If they add "-" with a 
descriptive text, version will be compared automatically as semantic 
version and they can use more fields.

	x.y.z.t.w-fancy_name_of_release

This should work. I consider this issue more a documentation bug as a 
bug in code.

> for users so I will just enforce the limit in our image creation tool
> and have users relying on date try something else (adding dots or
> setting it in prerelease part of semver)

Right.

> 
> Let's drop this one.
> 
> 
> The later 2 patches are valid regardless. One is pure style, feel free
> to ignore it, I was just annoyed by it.

I see, the else branch is annoying and make the function less readable. 
Thanks to simplify it.

> Then documentation as promised.

Still poor, but I promise I overtake this duty myself.

> 
> 
>   core/artifacts_versions.c | 61 +++++++++++++++++++++++----------------
>   1 file changed, 36 insertions(+), 25 deletions(-)
> 
> diff --git a/core/artifacts_versions.c b/core/artifacts_versions.c
> index 433125862014..14abdb420498 100644
> --- a/core/artifacts_versions.c
> +++ b/core/artifacts_versions.c
> @@ -147,19 +147,6 @@ void get_sw_versions(swupdate_cfg_handle  __attribute__ ((__unused__))*handle,
>   }
>   #endif
>   
> -static const char ACCEPTED_CHARS[] = "0123456789.";
> -
> -static bool is_oldstyle_version(const char* version_string)
> -{
> -	while (*version_string)
> -	{
> -		if (strchr(ACCEPTED_CHARS, *version_string) == NULL)
> -			return false;
> -		++version_string;
> -	}
> -	return true;
> -}
> -
>   /*
>    * convert a version string into a number
>    * version string is in the format:
> @@ -170,7 +157,7 @@ static bool is_oldstyle_version(const char* version_string)
>    * Also major.minor or major.minor.revision are allowed
>    * The conversion generates a 64 bit value that can be compared
>    */
> -static __u64 version_to_number(const char *version_string)
> +static bool version_to_number(const char *version_string, __u64 *version_number)
>   {
>   	char **versions = NULL;
>   	char **ver;
> @@ -178,22 +165,43 @@ static __u64 version_to_number(const char *version_string)
>   	__u64 version = 0;
>   
>   	versions = string_split(version_string, '.');
> -	for (ver = versions; *ver != NULL; ver++, count ++) {
> +	for (ver = versions; *ver != NULL; ver++, count++) {
>   		if (count < 4) {
>   			unsigned long int fld = strtoul(*ver, NULL, 10);
>   			/* check for return of strtoul, mandatory */
> -			if (fld != ULONG_MAX) {
> -				fld &= 0xffff;
> -				version = (version << 16) | fld;
> +			if (fld > 0xffff) {
> +				TRACE("Version %s had an element > 65535, falling back to semver",
> +				      version_string);

Understand this, but it becomes very annoying when semver is used and 
these are always printed. I will change this to DEBUG instead of TRACE.

> +				return false;
>   			}
> +			version = (version << 16) | fld;
>   		}
>   		free(*ver);
>   	}
> -	if ((count < 4) && (count > 0))
> +	if (count >= 4) {
> +		TRACE("Version %s had more than 4 numbers, trailing numbers will be ignored",
> +		      version_string);

Ditto.

> +	} else if (count > 0) {
>   		version <<= 16 * (4 - count);
> +	}
>   	free(versions);
>   
> -	return version;
> +	*version_number = version;
> +	return true;
> +}
> +
> +static const char ACCEPTED_CHARS[] = "0123456789.";
> +
> +static bool is_oldstyle_version(const char *version_string, __u64 *version_number)
> +{
> +	const char *ver = version_string;
> +	while (*ver)
> +	{
> +		if (strchr(ACCEPTED_CHARS, *ver) == NULL)
> +			return false;
> +		++ver;
> +	}
> +	return version_to_number(version_string, version_number);;
>   }
>   
>   /*
> @@ -209,12 +217,15 @@ static __u64 version_to_number(const char *version_string)
>    */
>   int compare_versions(const char* left_version, const char* right_version)
>   {
> -	if (is_oldstyle_version(left_version) && is_oldstyle_version(right_version))
> -	{
> -		__u64 left_u64 = version_to_number(left_version);
> -		__u64 right_u64 = version_to_number(right_version);
>   
> -		DEBUG("Comparing old-style versions '%s' <-> '%s'", left_version, right_version);
> +	__u64 left_u64;
> +	__u64 right_u64;
> +
> +	if (is_oldstyle_version(left_version, &left_u64)
> +	    && is_oldstyle_version(right_version, &right_u64))
> +	{
> +		DEBUG("Comparing old-style versions '%s' <-> '%s'",
> +		      left_version, right_version);
>   		TRACE("Parsed: '%llu' <-> '%llu'", left_u64, right_u64);
>   
>   		if (left_u64 < right_u64)
> 

Well, why do you think this should not be merged ? This guarantees that 
is fully backward compatible with the past. I have no idea how many 
projects are out relying on the current behavior, and a breakage can 
cause a lot of headaches.

Best regards,
Stefano Babic
Dominique MARTINET Oct. 17, 2021, 11:28 p.m. UTC | #2
Stefano Babic wrote on Fri, Oct 15, 2021 at 11:17:54AM +0200:
> > So here is a version that checks directly that each field does not
> > overflow 16 bits and falls back to semver.
> > Honestly this looks more complicated than the first patch
> 
> The most important think IMHO is to make very clear which schemas SWUpdate
> supports for versioning. I am aware that we cannot support any versioning
> schema a customer or a company are brought, and we have to expose what
> SWUpdate supports when it starts to make version comparisons.

Agreed.

> I will take over the documentation part, and I will add a chapter on this.
> At the end, a user should know that they can use:
> 
> 	- a x.y.z.t schema, with just plain numbers in range 0..65535
> 	- a semantic version (with link to documentation), alphanumeric.
> 
> Any other versioning schema can be used in "version" attribute, but leads to
> unpredictable results. If rules are well defined, it is easier for users to
> follow them.

Thanks!

> > I had sent, so
> > I would rather not merge this -- the fact that the 4th field just starts
> > disappearing if any of the field goes > 65535 is also hard to understand
> 
> Becuase it is today poor documented. If this is documented, and they know
> they can use, it should be not a problem. If they add "-" with a descriptive
> text, version will be compared automatically as semantic version and they
> can use more fields.
> 
> 	x.y.z.t.w-fancy_name_of_release
> 
> This should work. I consider this issue more a documentation bug as a bug in
> code.

This will fall back to semver, which will silently drop the t and w
components

As far as swupdate code goes (with and without my second patch),
we have for example 1.1.1.1.1-fancy = 1.1.1.2-fancy.
These two are simply not valid semver version strings, so this is
probably fine, but the semver_is_valid() check does not catch it

 
> > +	if (is_oldstyle_version(left_version, &left_u64)
> > +	    && is_oldstyle_version(right_version, &right_u64))
> > +	{
> > +		DEBUG("Comparing old-style versions '%s' <-> '%s'",
> > +		      left_version, right_version);
> >   		TRACE("Parsed: '%llu' <-> '%llu'", left_u64, right_u64);
> >   		if (left_u64 < right_u64)
> > 
> 
> Well, why do you think this should not be merged ? This guarantees that is
> fully backward compatible with the past. I have no idea how many projects
> are out relying on the current behavior, and a breakage can cause a lot of
> headaches.

This is not 100% compatible because of the dropped 4th field in semver.

comparing 70000.1.1.1 and 70000.1.1.2 used to work but will no longer
work with this patch because of the overflow check.
The patch would need to also check if there are four fields and carry on
with the conversion even in case of overflow in this case; I didn't do
it because that adds even more complexity, but I guess it can be done if
we care about it.

Anything with 3 fields should be 99% compatible (last 1% is if someone
relied on the overflow behaviour, but I'm happy to ignore these :-D)


Ultimately I don't think there is an easy way that preserves
compatibility, but as long as things are clearly documented I can define
safeguards for my customers and am happy with anything, so leaving the
decision to you.
Stefano Babic Oct. 18, 2021, 9:05 a.m. UTC | #3
Hi Dominique,

On 18.10.21 01:28, Dominique Martinet wrote:
> Stefano Babic wrote on Fri, Oct 15, 2021 at 11:17:54AM +0200:
>>> So here is a version that checks directly that each field does not
>>> overflow 16 bits and falls back to semver.
>>> Honestly this looks more complicated than the first patch
>>
>> The most important think IMHO is to make very clear which schemas SWUpdate
>> supports for versioning. I am aware that we cannot support any versioning
>> schema a customer or a company are brought, and we have to expose what
>> SWUpdate supports when it starts to make version comparisons.
> 
> Agreed.
> 
>> I will take over the documentation part, and I will add a chapter on this.
>> At the end, a user should know that they can use:
>>
>> 	- a x.y.z.t schema, with just plain numbers in range 0..65535
>> 	- a semantic version (with link to documentation), alphanumeric.
>>
>> Any other versioning schema can be used in "version" attribute, but leads to
>> unpredictable results. If rules are well defined, it is easier for users to
>> follow them.
> 
> Thanks!
> 
>>> I had sent, so
>>> I would rather not merge this -- the fact that the 4th field just starts
>>> disappearing if any of the field goes > 65535 is also hard to understand
>>
>> Becuase it is today poor documented. If this is documented, and they know
>> they can use, it should be not a problem. If they add "-" with a descriptive
>> text, version will be compared automatically as semantic version and they
>> can use more fields.
>>
>> 	x.y.z.t.w-fancy_name_of_release
>>
>> This should work. I consider this issue more a documentation bug as a bug in
>> code.
> 
> This will fall back to semver, which will silently drop the t and w
> components
> 
> As far as swupdate code goes (with and without my second patch),
> we have for example 1.1.1.1.1-fancy = 1.1.1.2-fancy.
> These two are simply not valid semver version strings, so this is
> probably fine, but the semver_is_valid() check does not catch it
> 
>   
>>> +	if (is_oldstyle_version(left_version, &left_u64)
>>> +	    && is_oldstyle_version(right_version, &right_u64))
>>> +	{
>>> +		DEBUG("Comparing old-style versions '%s' <-> '%s'",
>>> +		      left_version, right_version);
>>>    		TRACE("Parsed: '%llu' <-> '%llu'", left_u64, right_u64);
>>>    		if (left_u64 < right_u64)
>>>
>>
>> Well, why do you think this should not be merged ? This guarantees that is
>> fully backward compatible with the past. I have no idea how many projects
>> are out relying on the current behavior, and a breakage can cause a lot of
>> headaches.
> 
> This is not 100% compatible because of the dropped 4th field in semver.
> 
> comparing 70000.1.1.1 and 70000.1.1.2 used to work but will no longer
> work with this patch because of the overflow check.

This is correct, but having the check is an improvement and better 
states that each field is 16 bit wide.

> The patch would need to also check if there are four fields and carry on
> with the conversion even in case of overflow in this case; I didn't do
> it because that adds even more complexity,

Agree, let's things simple until not explicitely requzired.

> but I guess it can be done if
> we care about it.
> 
> Anything with 3 fields should be 99% compatible (last 1% is if someone
> relied on the overflow behaviour, but I'm happy to ignore these :-D)

I will ignore them, too. 3 fields is enough for quite all applications, 
and as I said, we cannottake care of any different versioning schema 
that customer / user can invent.

> 
> 
> Ultimately I don't think there is an easy way that preserves
> compatibility, but as long as things are clearly documented I can define
> safeguards for my customers and am happy with anything,

Agree. The behavior was very poor documented, and this can lead to 
misunderstanding. I think it is not a problem for most customers to 
follow the rules if these are clear and well explained.

> so leaving the
> decision to you.
> 

I will merge your patch.

Best regards,
Stefano
diff mbox series

Patch

diff --git a/core/artifacts_versions.c b/core/artifacts_versions.c
index 433125862014..14abdb420498 100644
--- a/core/artifacts_versions.c
+++ b/core/artifacts_versions.c
@@ -147,19 +147,6 @@  void get_sw_versions(swupdate_cfg_handle  __attribute__ ((__unused__))*handle,
 }
 #endif
 
-static const char ACCEPTED_CHARS[] = "0123456789.";
-
-static bool is_oldstyle_version(const char* version_string)
-{
-	while (*version_string)
-	{
-		if (strchr(ACCEPTED_CHARS, *version_string) == NULL)
-			return false;
-		++version_string;
-	}
-	return true;
-}
-
 /*
  * convert a version string into a number
  * version string is in the format:
@@ -170,7 +157,7 @@  static bool is_oldstyle_version(const char* version_string)
  * Also major.minor or major.minor.revision are allowed
  * The conversion generates a 64 bit value that can be compared
  */
-static __u64 version_to_number(const char *version_string)
+static bool version_to_number(const char *version_string, __u64 *version_number)
 {
 	char **versions = NULL;
 	char **ver;
@@ -178,22 +165,43 @@  static __u64 version_to_number(const char *version_string)
 	__u64 version = 0;
 
 	versions = string_split(version_string, '.');
-	for (ver = versions; *ver != NULL; ver++, count ++) {
+	for (ver = versions; *ver != NULL; ver++, count++) {
 		if (count < 4) {
 			unsigned long int fld = strtoul(*ver, NULL, 10);
 			/* check for return of strtoul, mandatory */
-			if (fld != ULONG_MAX) {
-				fld &= 0xffff;
-				version = (version << 16) | fld;
+			if (fld > 0xffff) {
+				TRACE("Version %s had an element > 65535, falling back to semver",
+				      version_string);
+				return false;
 			}
+			version = (version << 16) | fld;
 		}
 		free(*ver);
 	}
-	if ((count < 4) && (count > 0))
+	if (count >= 4) {
+		TRACE("Version %s had more than 4 numbers, trailing numbers will be ignored",
+		      version_string);
+	} else if (count > 0) {
 		version <<= 16 * (4 - count);
+	}
 	free(versions);
 
-	return version;
+	*version_number = version;
+	return true;
+}
+
+static const char ACCEPTED_CHARS[] = "0123456789.";
+
+static bool is_oldstyle_version(const char *version_string, __u64 *version_number)
+{
+	const char *ver = version_string;
+	while (*ver)
+	{
+		if (strchr(ACCEPTED_CHARS, *ver) == NULL)
+			return false;
+		++ver;
+	}
+	return version_to_number(version_string, version_number);;
 }
 
 /*
@@ -209,12 +217,15 @@  static __u64 version_to_number(const char *version_string)
  */
 int compare_versions(const char* left_version, const char* right_version)
 {
-	if (is_oldstyle_version(left_version) && is_oldstyle_version(right_version))
-	{
-		__u64 left_u64 = version_to_number(left_version);
-		__u64 right_u64 = version_to_number(right_version);
 
-		DEBUG("Comparing old-style versions '%s' <-> '%s'", left_version, right_version);
+	__u64 left_u64;
+	__u64 right_u64;
+
+	if (is_oldstyle_version(left_version, &left_u64)
+	    && is_oldstyle_version(right_version, &right_u64))
+	{
+		DEBUG("Comparing old-style versions '%s' <-> '%s'",
+		      left_version, right_version);
 		TRACE("Parsed: '%llu' <-> '%llu'", left_u64, right_u64);
 
 		if (left_u64 < right_u64)