
[v2,0/5] Improve performances and feedback of different

Message ID 20190719143556.14907-1-victor.huesca@bootlin.com

Message

Victor Huesca July 19, 2019, 2:35 p.m. UTC
This patch series extends the existing approach of issuing HTTP requests
in parallel via the multiprocessing module. This yields much better
performance (5 to 10 times faster) at no cost.

This series also implements a helper function to ease providing feedback
about the current state of these time-consuming functions.
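
As a rough illustration of the two points above (parallel URL checks plus
progress feedback), here is a minimal sketch. It is not the code from the
patches: it uses the standard 'multiprocessing' module and urllib rather
than whatever pkg-stats actually calls, and the names check_url,
format_progress and check_urls are hypothetical.

```python
import http.client
import multiprocessing
import sys
import urllib.error
import urllib.request


def check_url(url, timeout=10):
    """Return (url, ok). Hypothetical worker, runs in a child process."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return url, response.status < 400
    except (urllib.error.URLError, http.client.HTTPException,
            OSError, ValueError):
        # Unreachable host, HTTP error status, or malformed URL.
        return url, False


def format_progress(done, total):
    """Build the one-line progress message (hypothetical helper)."""
    return "[%d/%d] URLs checked" % (done, total)


def check_urls(urls, workers=8, out=sys.stderr):
    """Check all URLs in a process pool, reporting progress as results arrive."""
    results = {}
    with multiprocessing.Pool(workers) as pool:
        # imap_unordered yields each result as soon as a worker finishes,
        # which is what makes incremental progress reporting possible.
        for done, (url, ok) in enumerate(pool.imap_unordered(check_url, urls), 1):
            results[url] = ok
            out.write("\r" + format_progress(done, len(urls)))
            out.flush()
    out.write("\n")
    return results
```

Calling check_urls(list_of_urls) from the main script returns a dict
mapping each URL to True or False. The worker count of 8 is an arbitrary
assumption; since the work is network-bound, a larger pool may help.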

The 'package_init_make_info' function is slightly modified to call make
only once (as suggested by Thomas).
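
The idea of a single make invocation could look roughly like the sketch
below. It assumes Buildroot's 'printvars' target accepts several patterns
in VARS and prints one NAME=value line per variable; the function names
and the exact command line are assumptions, not the patch's actual code.

```python
import subprocess


def parse_printvars(output):
    """Parse 'NAME=value' lines into a dict (hypothetical helper)."""
    variables = {}
    for line in output.splitlines():
        # Split on the first '=' only, so values may themselves contain '='.
        name, sep, value = line.partition("=")
        if sep:
            variables[name.strip()] = value.strip()
    return variables


def read_make_variables(patterns, make="make"):
    """Run make once for all patterns instead of once per pattern.

    Assumed invocation: make -s --no-print-directory printvars VARS="..."
    """
    cmd = [make, "-s", "--no-print-directory", "printvars",
           "VARS=" + " ".join(patterns)]
    output = subprocess.check_output(cmd, universal_newlines=True)
    return parse_printvars(output)
```

For example, read_make_variables(["%_LICENSE", "%_VERSION"]) would fetch
both groups of variables with one make process, and the caller can then
split the resulting dict by suffix.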

Finally, an option to reduce verbosity is introduced, since the
per-package notification floods the standard output.

Changes v1 --> v2:
  - Replace multiprocessing with multiprocess (a fork)
  - Add progress feedback when checking URLs
  - Replace the multiple calls to make with a single one
  - Add an option to reduce verbosity (hide per-package notification)


Victor Huesca (5):
  support/scripts/pkg-stats: Use the 'multiprocess' fork instead of
    'multiprocessing'
  support/scripts/pkg-stats: retrieve packages latest version using
    processes
  support/scripts/pkg-stats: add current progress in 'check_url_status'
  support/scripts/pkg-stats: improve 'package_init_make_info'
  support/scripts/pkg-stats: add option to reduce verbosity

 support/scripts/pkg-stats | 193 +++++++++++++++++++++-----------------
 1 file changed, 108 insertions(+), 85 deletions(-)

Comments

Thomas Petazzoni Aug. 1, 2019, 12:26 p.m. UTC | #1
Hello,

Thanks for this work on making pkg-stats faster, very nice!

On Fri, 19 Jul 2019 16:35:51 +0200
Victor Huesca <victor.huesca@bootlin.com> wrote:

> Victor Huesca (5):
>   support/scripts/pkg-stats: Use the 'multiprocess' fork instead of
>     'multiprocessing'

As said during our live discussion, I am not a big fan of using
multiprocess instead of multiprocessing. multiprocessing is in the
Python standard library, while multiprocess is an external module that
isn't even packaged in Fedora. Considering that the main (only?) reason
for using multiprocess over multiprocessing is simply to improve the
script logging, I don't think it's worth it.

>   support/scripts/pkg-stats: retrieve packages latest version using
>     processes
>   support/scripts/pkg-stats: add current progress in 'check_url_status'
>   support/scripts/pkg-stats: improve 'package_init_make_info'

I've however applied this particular patch, that is independent.

The other 4 patches are marked as Changes Requested.

Thanks!

Thomas