
[v2,1/1] support/scripts/pkg-stats: fix/improve git hash sorting

Message ID 20240305093334.2233237-1-sen@hastings.org
State Accepted

Commit Message

Sen Hastings March 5, 2024, 9:33 a.m. UTC
sortGrid()'s handling of git hashes and other large hex numbers
has been inconsistent: they can be detected as strings or numbers
depending on what type of character they start with.
This patch fixes the behaviour by using a regex to capture everything
that looks like a big hex number and treating it as a string.
This means that when you sort by current version ascending, all the
version strings with big hex numbers should show up first, sorted 0-9,a-f.
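
To make the inconsistency concrete, here is a minimal sketch of the
pre-patch test in isolation. The helper name classifyOld() and the second
hash are made up for illustration; only the isNaN(parseInt(..., 10)) check
comes from sortGrid() itself.

```javascript
// Mirrors the pre-patch test: a value counts as a "number" whenever
// parseInt() can read at least one leading decimal digit.
// classifyOld() is a hypothetical helper, not part of pkg-stats.
function classifyOld(version) {
    return isNaN(parseInt(version, 10)) ? "string" : "number";
}

// Same 40 hex characters, only the starting character differs.
// The second value is an invented example, not a real commit.
const startsWithLetter = "ed3039cdbeeb28fc0011c3585d8f7dfb91038292";
const startsWithDigit = "3039cdbeeb28fc0011c3585d8f7dfb91038292ed";

console.log(classifyOld(startsWithLetter)); // "string"
console.log(classifyOld(startsWithDigit));  // "number" (parseInt() stops at "c" and yields 3039)
```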

First we check for a string length >= 39, and then apply a regex
to return an array with every char from that string that matched
the regex. If the length of this array is still >= 39 we can assume
we are looking at something containing a git hash.
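
The two-step check described above can be sketched as a standalone
function. The name looksLikeGitHash() is hypothetical (the patch inlines
the test in sortGrid()), and the "|| []" fallback is an addition of this
sketch: match() returns null when nothing matches, which the inlined
version does not guard against.

```javascript
// Same character class as the patch. Note that the "," inside the
// brackets is matched literally, so commas count as "hex" characters too.
const git_hash_regex = /[a-f,0-9]/gi;

// Hypothetical helper wrapping the check from the patch.
function looksLikeGitHash(version) {
    if (version.length < 39) {
        return false; // too short to contain a 39-character hash
    }
    // With the g flag, match() returns an array of every matching
    // character, or null if there are none; fall back to an empty array.
    const hexChars = version.match(git_hash_regex) || [];
    return hexChars.length >= 39;
}

console.log(looksLikeGitHash("1.4.2-168-ged3039cdbeeb28fc0011c3585d8f7dfb91038292")); // true
console.log(looksLikeGitHash("2024.02.1")); // false
```

Counting matched characters rather than requiring one contiguous run is
what lets a prefixed string like the first example through.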

The reason why the length is defined as ">= 39" and not "40" or
"39 or 40" is twofold:

Firstly, 39 was chosen as a minimum to match stuff with 39 char git
hashes, like the rockchip-mali package.

Secondly, there is no max because we actually want to catch not
just explicit git hashes, but any version string with big gnarly
hex numbers in it.
Stuff like: "1.4.2-168-ged3039cdbeeb28fc0011c3585d8f7dfb91038292"

Why? Well, the idea is less about git hashes and sorting
and more about grouping similarly formatted version strings.

It would be impossible (or at least annoyingly complicated) and of
dubious utility to get a real sequential sort out of the
current version column, so the attempt here is to at the very
least collect all the similarly formatted things together.

This isn't perfect, but it's an (arguably) more useful sorted
output than before.

A demo is available here:
https://sen-h.codeberg.page/pkg-stats-demos/@pages/fix-improve-git-hash-sorting.html

Signed-off-by: Sen Hastings <sen@hastings.org>
---
Changes v1 -> v2:
  - more detailed commit log (requested by Yann E. MORIN)
---
 support/scripts/pkg-stats | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Arnout Vandecappelle April 7, 2024, 4:03 p.m. UTC | #1
On 05/03/2024 10:33, Sen Hastings wrote:
> sortGrid()'s handling of git hashes and other large hex numbers
> has been inconsistent: they can be detected as strings or numbers
> depending on what type of character they start with.
> This patch fixes the behaviour by using a regex to capture everything
> that looks like a big hex number and treating it as a string.
> This means that when you sort by current version ascending, all the
> version strings with big hex numbers should show up first, sorted 0-9,a-f.
> 
> First we check for a string length >= 39, and then apply a regex
> to return an array with every char from that string that matched
> the regex. If the length of this array is still >= 39 we can assume
> we are looking at something containing a git hash.
> 
> The reason why the length is defined as ">= 39" and not "40" or
> "39 or 40" is twofold:
> 
> Firstly, 39 was chosen as a minimum to match stuff with 39 char git
> hashes, like the rockchip-mali package.
> 
> Secondly, there is no max because we actually want to catch not
> just explicit git hashes, but any version string with big gnarly
> hex numbers in it.
> Stuff like: "1.4.2-168-ged3039cdbeeb28fc0011c3585d8f7dfb91038292"
> 
> Why? Well, the idea is less about git hashes and sorting
> and more about grouping similarly formatted version strings.
> 
> It would be impossible (or at least annoyingly complicated) and of
> dubious utility to get a real sequential sort out of the
> current version column, so the attempt here is to at the very
> least collect all the similarly formatted things together.
> 
> This isn't perfect, but it's an (arguably) more useful sorted
> output than before.
> 
> A demo is available here:
> https://sen-h.codeberg.page/pkg-stats-demos/@pages/fix-improve-git-hash-sorting.html
> 
> Signed-off-by: Sen Hastings <sen@hastings.org>

  Applied to master, thanks.

  That said, sorting by version is of no practical value, because the versions
of different packages bear absolutely no relation to each other. So maybe it's
more useful to disable the sorting entirely in the Current version, Latest
version, and CPE ID columns.

  Regards,
  Arnout

> ---
> Changes v1 -> v2:
>    - more detailed commit log (requested by Yann E. MORIN)
> ---
>   support/scripts/pkg-stats | 5 ++++-
>   1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
> index 3295eb7a48..4dc1857a9e 100755
> --- a/support/scripts/pkg-stats
> +++ b/support/scripts/pkg-stats
> @@ -741,6 +741,7 @@ addedCSSRules.forEach(rule => styleSheet.insertRule(rule));
>   function sortGrid(sortLabel){
>   	let i = 0;
>   	let pkgSortArray = [], sortedPkgArray = [], pkgStringSortArray = [], pkgNumSortArray = [];
> +	const git_hash_regex = /[a-f,0-9]/gi;
>   	const columnValues = Array.from(document.getElementsByClassName(sortLabel));
>   	const columnName = document.getElementById(sortLabel);
>   	let lastStyle = document.getElementById("sort-css");
> @@ -765,7 +766,9 @@ function sortGrid(sortLabel){
>                   pkgSortArray.push(sortArr);
>           });
>           pkgSortArray.forEach((listing) => {
> -                if ( isNaN(parseInt(listing[1], 10)) ){
> +                if ( listing[1].length >= 39 && listing[1].match(git_hash_regex).length >= 39){
> +                        pkgStringSortArray.push(listing);
> +		} else if ( isNaN(parseInt(listing[1], 10)) ){
>                           pkgStringSortArray.push(listing);
>                   } else {
>                           listing[1] = parseFloat(listing[1]);

Patch

diff --git a/support/scripts/pkg-stats b/support/scripts/pkg-stats
index 3295eb7a48..4dc1857a9e 100755
--- a/support/scripts/pkg-stats
+++ b/support/scripts/pkg-stats
@@ -741,6 +741,7 @@  addedCSSRules.forEach(rule => styleSheet.insertRule(rule));
 function sortGrid(sortLabel){
 	let i = 0;
 	let pkgSortArray = [], sortedPkgArray = [], pkgStringSortArray = [], pkgNumSortArray = [];
+	const git_hash_regex = /[a-f,0-9]/gi;
 	const columnValues = Array.from(document.getElementsByClassName(sortLabel));
 	const columnName = document.getElementById(sortLabel);
 	let lastStyle = document.getElementById("sort-css");
@@ -765,7 +766,9 @@  function sortGrid(sortLabel){
                 pkgSortArray.push(sortArr);
         });
         pkgSortArray.forEach((listing) => {
-                if ( isNaN(parseInt(listing[1], 10)) ){
+                if ( listing[1].length >= 39 && listing[1].match(git_hash_regex).length >= 39){
+                        pkgStringSortArray.push(listing);
+		} else if ( isNaN(parseInt(listing[1], 10)) ){
                         pkgStringSortArray.push(listing);
                 } else {
                         listing[1] = parseFloat(listing[1]);