diff mbox

graph-depends: optimize remove_transitive_deps()

Message ID 1450174901-30667-1-git-send-email-thomas.petazzoni@free-electrons.com
State Accepted
Headers show

Commit Message

Thomas Petazzoni Dec. 15, 2015, 10:21 a.m. UTC
For large configurations, the execution time of
remove_transitive_deps() becomes really high, due to the number of
nested loops + the is_dep() function being recursive.

For an allyespackageconfig, the remove_extra_deps() function takes 334
seconds to execute, and the overall time to generate the .dot file is
6 minutes and 39 seconds. Here is a timing of the different
graph-depends steps and the overall execution time:

  Getting dependencies:   42.5735 seconds
  Turn deps into a dict:   0.0023 seconds
  Remove extra deps:     334.1542 seconds
  Get version:            22.4919 seconds
  Generate .dot:           0.0197 seconds

  real	6m39.289s
  user	6m16.644s
  sys	0m8.792s

By adding a very simple cache for the results of is_dep(), we bring
down the execution time of the "Remove extra deps" step from 334
seconds to just 4 seconds, reducing the overall execution time to 1
minutes and 10 seconds:

  Getting dependencies:  42.9546 seconds
  Turn deps into a dict:  0.0025 seconds
  Remove extra deps:      4.9643 seconds
  Get version:           22.1865 seconds
  Generate .dot:          0.0207 seconds

  real	1m10.201s
  user	0m47.716s
  sys	0m7.948s

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
---
 support/scripts/graph-depends | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

Comments

gustavo.zacarias@free-electrons.com Dec. 15, 2015, 11:20 a.m. UTC | #1
On 15/12/15 07:21, Thomas Petazzoni wrote:

> For large configurations, the execution time of
> remove_transitive_deps() becomes really high, due to the number of
> nested loops + the is_dep() function being recursive.
>
> For an allyespackageconfig, the remove_extra_deps() function takes 334
> seconds to execute, and the overall time to generate the .dot file is
> 6 minutes and 39 seconds. Here is a timing of the different
> graph-depends steps and the overall execution time:
>
>    Getting dependencies:   42.5735 seconds
>    Turn deps into a dict:   0.0023 seconds
>    Remove extra deps:     334.1542 seconds
>    Get version:            22.4919 seconds
>    Generate .dot:           0.0197 seconds
>
>    real	6m39.289s
>    user	6m16.644s
>    sys	0m8.792s
>
> By adding a very simple cache for the results of is_dep(), we bring
> down the execution time of the "Remove extra deps" step from 334
> seconds to just 4 seconds, reducing the overall execution time to 1
> minutes and 10 seconds:
>
>    Getting dependencies:  42.9546 seconds
>    Turn deps into a dict:  0.0025 seconds
>    Remove extra deps:      4.9643 seconds
>    Get version:           22.1865 seconds
>    Generate .dot:          0.0207 seconds
>
>    real	1m10.201s
>    user	0m47.716s
>    sys	0m7.948s
>
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>

Tested-by: Gustavo Zacarias <gustavo.zacarias@free-electrons.com>

For a lot of transitive deps (custom package set) the difference is even 
bigger, akin to hours (previous version) vs 2 minutes (patch applied).

Regards.
Yann E. MORIN Dec. 15, 2015, 8:37 p.m. UTC | #2
Thomas, All,

On 2015-12-15 11:21 +0100, Thomas Petazzoni spake thusly:
> For large configurations, the execution time of
> remove_transitive_deps() becomes really high, due to the number of
> nested loops + the is_dep() function being recursive.
> 
> For an allyespackageconfig, the remove_extra_deps() function takes 334
> seconds to execute, and the overall time to generate the .dot file is
> 6 minutes and 39 seconds. Here is a timing of the different
> graph-depends steps and the overall execution time:
> 
>   Getting dependencies:   42.5735 seconds
>   Turn deps into a dict:   0.0023 seconds
>   Remove extra deps:     334.1542 seconds
>   Get version:            22.4919 seconds
>   Generate .dot:           0.0197 seconds
> 
>   real	6m39.289s
>   user	6m16.644s
>   sys	0m8.792s
> 
> By adding a very simple cache for the results of is_dep(), we bring
> down the execution time of the "Remove extra deps" step from 334
> seconds to just 4 seconds, reducing the overall execution time to 1
> minutes and 10 seconds:
> 
>   Getting dependencies:  42.9546 seconds
>   Turn deps into a dict:  0.0025 seconds
>   Remove extra deps:      4.9643 seconds
>   Get version:           22.1865 seconds
>   Generate .dot:          0.0207 seconds
> 
>   real	1m10.201s
>   user	0m47.716s
>   sys	0m7.948s

Wee! :-)

Still, somme comments, see below...

I did use the Python profiler to investigate:
    python -m cProfile -s cumulative support/scripts/graph-depends >foo.list

Unfortunately, it is not possible to dump a text version to a file...  :-(
Or I'm too dumb for that. :-]

> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/graph-depends | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
> 
> diff --git a/support/scripts/graph-depends b/support/scripts/graph-depends
> index 5f77038..d39306e 100755
> --- a/support/scripts/graph-depends
> +++ b/support/scripts/graph-depends
> @@ -237,16 +237,43 @@ for dep in dependencies:
>          dict_deps[dep[0]] = []
>      dict_deps[dep[0]].append(dep[1])
>  
> +# Basic cache for the results of the is_dep() function, in order to
> +# optimize the execution time. The cache is a dict of dict of boolean
> +# values. The key to the primary dict is "pkg", and the key of the
> +# sub-dicts is "pkg2".
> +is_dep_cache = {}
> +
> +def is_dep_cache_insert(pkg, pkg2, val):
> +    if is_dep_cache.has_key(pkg):

    if pkg in is_dep_cache

> +        is_dep_cache[pkg].update({pkg2: val})
> +    else:
> +        is_dep_cache[pkg] = {pkg2: val}
> +
> +# Returns a tuple (r, val), where r is a boolean that indicates if a
> +# value was found in the cache or not, and val being the value found
> +# (only when r is True).
> +def is_dep_cache_lookup(pkg, pkg2):
> +    if is_dep_cache.has_key(pkg):
> +        if is_dep_cache[pkg].has_key(pkg2):
> +            return (True, is_dep_cache[pkg][pkg2])
> +    return (False, False)

I think using exceptions is better:

    return is_dep_cache[pkg][pkg2]

Don't catch any exception, it'll be propagated up as usual, see below...

>  # This function return True if pkg is a dependency (direct or
>  # transitive) of pkg2, dependencies being listed in the deps
>  # dictionary. Returns False otherwise.
>  def is_dep(pkg,pkg2,deps):
> +    (r, val) = is_dep_cache_lookup(pkg, pkg2)
> +    if r:
> +        return val
>      if pkg2 in deps:
>          for p in deps[pkg2]:
>              if pkg == p:
> +                is_dep_cache_insert(pkg, pkg2, True)
>                  return True
>              if is_dep(pkg,p,deps):
> +                is_dep_cache_insert(pkg, pkg2, True)
>                  return True
> +    is_dep_cache_insert(pkg, pkg2, False)
>      return False

Here, I would have dome a bit differently:

  - keep is_dep() untouched
  - rename is_dep() to is_dep_uncached()
  - add is_dep() as such:

    def is_dep(pkg,pkg2,deps):
        try:
            return is_dep_cache_lookup(pkg,pkg2)
        except KeyError:
            val = is_dep_uncached(pkg, pkg2, deps)
            is_dep_cache_insert(pkg, pkg2, val)
            return val

Not really tested, but should work I hope! ;-)

Regards,
Yann E. MORIN.

>  # This function eliminates transitive dependencies; for example, given
> -- 
> 2.6.4
>
Samuel Martin Dec. 15, 2015, 9:23 p.m. UTC | #3
Hi Thomas, Yann, all,

On Tue, Dec 15, 2015 at 9:37 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
> Thomas, All,
>
> On 2015-12-15 11:21 +0100, Thomas Petazzoni spake thusly:
>> For large configurations, the execution time of
>> remove_transitive_deps() becomes really high, due to the number of
>> nested loops + the is_dep() function being recursive.
>>
>> For an allyespackageconfig, the remove_extra_deps() function takes 334
>> seconds to execute, and the overall time to generate the .dot file is
>> 6 minutes and 39 seconds. Here is a timing of the different
>> graph-depends steps and the overall execution time:
>>
>>   Getting dependencies:   42.5735 seconds
>>   Turn deps into a dict:   0.0023 seconds
>>   Remove extra deps:     334.1542 seconds
>>   Get version:            22.4919 seconds
>>   Generate .dot:           0.0197 seconds
>>
>>   real        6m39.289s
>>   user        6m16.644s
>>   sys 0m8.792s
>>
>> By adding a very simple cache for the results of is_dep(), we bring
>> down the execution time of the "Remove extra deps" step from 334
>> seconds to just 4 seconds, reducing the overall execution time to 1
>> minutes and 10 seconds:
>>
>>   Getting dependencies:  42.9546 seconds
>>   Turn deps into a dict:  0.0025 seconds
>>   Remove extra deps:      4.9643 seconds
>>   Get version:           22.1865 seconds
>>   Generate .dot:          0.0207 seconds
>>
>>   real        1m10.201s
>>   user        0m47.716s
>>   sys 0m7.948s

Impressive! :-)

>
> Wee! :-)
>
> Still, somme comments, see below...
>
> I did use the Python profiler to investigate:
>     python -m cProfile -s cumulative support/scripts/graph-depends >foo.list
>
> Unfortunately, it is not possible to dump a text version to a file...  :-(
> Or I'm too dumb for that. :-]
>
>> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>> ---
>>  support/scripts/graph-depends | 27 +++++++++++++++++++++++++++
>>  1 file changed, 27 insertions(+)
>>
>> diff --git a/support/scripts/graph-depends b/support/scripts/graph-depends
>> index 5f77038..d39306e 100755
>> --- a/support/scripts/graph-depends
>> +++ b/support/scripts/graph-depends
>> @@ -237,16 +237,43 @@ for dep in dependencies:
>>          dict_deps[dep[0]] = []
>>      dict_deps[dep[0]].append(dep[1])
>>
>> +# Basic cache for the results of the is_dep() function, in order to
>> +# optimize the execution time. The cache is a dict of dict of boolean
>> +# values. The key to the primary dict is "pkg", and the key of the
>> +# sub-dicts is "pkg2".
>> +is_dep_cache = {}
>> +
>> +def is_dep_cache_insert(pkg, pkg2, val):
>> +    if is_dep_cache.has_key(pkg):
>
>     if pkg in is_dep_cache
>
>> +        is_dep_cache[pkg].update({pkg2: val})
>> +    else:
>> +        is_dep_cache[pkg] = {pkg2: val}
>> +
>> +# Returns a tuple (r, val), where r is a boolean that indicates if a
>> +# value was found in the cache or not, and val being the value found
>> +# (only when r is True).
>> +def is_dep_cache_lookup(pkg, pkg2):
>> +    if is_dep_cache.has_key(pkg):
>> +        if is_dep_cache[pkg].has_key(pkg2):
>> +            return (True, is_dep_cache[pkg][pkg2])
>> +    return (False, False)
>
> I think using exceptions is better:
>
>     return is_dep_cache[pkg][pkg2]
>
> Don't catch any exception, it'll be propagated up as usual, see below...

Well, since this patch is about performance optimization, exception
should be avoided because they are expensive [1].

But in the end, the choice will also depends on the maintainability of
one or the other option.

>
>>  # This function return True if pkg is a dependency (direct or
>>  # transitive) of pkg2, dependencies being listed in the deps
>>  # dictionary. Returns False otherwise.
>>  def is_dep(pkg,pkg2,deps):
>> +    (r, val) = is_dep_cache_lookup(pkg, pkg2)
>> +    if r:
>> +        return val
>>      if pkg2 in deps:
>>          for p in deps[pkg2]:
>>              if pkg == p:
>> +                is_dep_cache_insert(pkg, pkg2, True)
>>                  return True
>>              if is_dep(pkg,p,deps):
>> +                is_dep_cache_insert(pkg, pkg2, True)
>>                  return True
>> +    is_dep_cache_insert(pkg, pkg2, False)
>>      return False
>
> Here, I would have dome a bit differently:
>
>   - keep is_dep() untouched
>   - rename is_dep() to is_dep_uncached()
>   - add is_dep() as such:
>
>     def is_dep(pkg,pkg2,deps):
>         try:
>             return is_dep_cache_lookup(pkg,pkg2)
>         except KeyError:
>             val = is_dep_uncached(pkg, pkg2, deps)
>             is_dep_cache_insert(pkg, pkg2, val)
>             return val
>
> Not really tested, but should work I hope! ;-)
>
> Regards,
> Yann E. MORIN.
>
>>  # This function eliminates transitive dependencies; for example, given
>> --
>> 2.6.4
>>
>
> --
> .-----------------.--------------------.------------------.--------------------.
> |  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
> | +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
> | +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
> | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
> '------------------------------^-------^------------------^--------------------'
> _______________________________________________
> buildroot mailing list
> buildroot@busybox.net
> http://lists.busybox.net/mailman/listinfo/buildroot

Regards,
Samuel Martin Dec. 15, 2015, 9:27 p.m. UTC | #4
On Tue, Dec 15, 2015 at 10:23 PM, Samuel Martin <s.martin49@gmail.com> wrote:
> Hi Thomas, Yann, all,
>
> On Tue, Dec 15, 2015 at 9:37 PM, Yann E. MORIN <yann.morin.1998@free.fr> wrote:
>> Thomas, All,
>>
>> On 2015-12-15 11:21 +0100, Thomas Petazzoni spake thusly:
>>> For large configurations, the execution time of
>>> remove_transitive_deps() becomes really high, due to the number of
>>> nested loops + the is_dep() function being recursive.
>>>
>>> For an allyespackageconfig, the remove_extra_deps() function takes 334
>>> seconds to execute, and the overall time to generate the .dot file is
>>> 6 minutes and 39 seconds. Here is a timing of the different
>>> graph-depends steps and the overall execution time:
>>>
>>>   Getting dependencies:   42.5735 seconds
>>>   Turn deps into a dict:   0.0023 seconds
>>>   Remove extra deps:     334.1542 seconds
>>>   Get version:            22.4919 seconds
>>>   Generate .dot:           0.0197 seconds
>>>
>>>   real        6m39.289s
>>>   user        6m16.644s
>>>   sys 0m8.792s
>>>
>>> By adding a very simple cache for the results of is_dep(), we bring
>>> down the execution time of the "Remove extra deps" step from 334
>>> seconds to just 4 seconds, reducing the overall execution time to 1
>>> minutes and 10 seconds:
>>>
>>>   Getting dependencies:  42.9546 seconds
>>>   Turn deps into a dict:  0.0025 seconds
>>>   Remove extra deps:      4.9643 seconds
>>>   Get version:           22.1865 seconds
>>>   Generate .dot:          0.0207 seconds
>>>
>>>   real        1m10.201s
>>>   user        0m47.716s
>>>   sys 0m7.948s
>
> Impressive! :-)
>
>>
>> Wee! :-)
>>
>> Still, somme comments, see below...
>>
>> I did use the Python profiler to investigate:
>>     python -m cProfile -s cumulative support/scripts/graph-depends >foo.list
>>
>> Unfortunately, it is not possible to dump a text version to a file...  :-(
>> Or I'm too dumb for that. :-]
>>
>>> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
>>> ---
>>>  support/scripts/graph-depends | 27 +++++++++++++++++++++++++++
>>>  1 file changed, 27 insertions(+)
>>>
>>> diff --git a/support/scripts/graph-depends b/support/scripts/graph-depends
>>> index 5f77038..d39306e 100755
>>> --- a/support/scripts/graph-depends
>>> +++ b/support/scripts/graph-depends
>>> @@ -237,16 +237,43 @@ for dep in dependencies:
>>>          dict_deps[dep[0]] = []
>>>      dict_deps[dep[0]].append(dep[1])
>>>
>>> +# Basic cache for the results of the is_dep() function, in order to
>>> +# optimize the execution time. The cache is a dict of dict of boolean
>>> +# values. The key to the primary dict is "pkg", and the key of the
>>> +# sub-dicts is "pkg2".
>>> +is_dep_cache = {}
>>> +
>>> +def is_dep_cache_insert(pkg, pkg2, val):
>>> +    if is_dep_cache.has_key(pkg):
>>
>>     if pkg in is_dep_cache
>>
>>> +        is_dep_cache[pkg].update({pkg2: val})
>>> +    else:
>>> +        is_dep_cache[pkg] = {pkg2: val}
>>> +
>>> +# Returns a tuple (r, val), where r is a boolean that indicates if a
>>> +# value was found in the cache or not, and val being the value found
>>> +# (only when r is True).
>>> +def is_dep_cache_lookup(pkg, pkg2):
>>> +    if is_dep_cache.has_key(pkg):
>>> +        if is_dep_cache[pkg].has_key(pkg2):
>>> +            return (True, is_dep_cache[pkg][pkg2])
>>> +    return (False, False)
>>
>> I think using exceptions is better:
>>
>>     return is_dep_cache[pkg][pkg2]
>>
>> Don't catch any exception, it'll be propagated up as usual, see below...
>
> Well, since this patch is about performance optimization, exception
> should be avoided because they are expensive [1].
>
> But in the end, the choice will also depends on the maintainability of
> one or the other option.
>

forgot the ref.:
[1] https://mail.python.org/pipermail/tutor/2003-January/019812.html

>>
>>>  # This function return True if pkg is a dependency (direct or
>>>  # transitive) of pkg2, dependencies being listed in the deps
>>>  # dictionary. Returns False otherwise.
>>>  def is_dep(pkg,pkg2,deps):
>>> +    (r, val) = is_dep_cache_lookup(pkg, pkg2)
>>> +    if r:
>>> +        return val
>>>      if pkg2 in deps:
>>>          for p in deps[pkg2]:
>>>              if pkg == p:
>>> +                is_dep_cache_insert(pkg, pkg2, True)
>>>                  return True
>>>              if is_dep(pkg,p,deps):
>>> +                is_dep_cache_insert(pkg, pkg2, True)
>>>                  return True
>>> +    is_dep_cache_insert(pkg, pkg2, False)
>>>      return False
>>
>> Here, I would have dome a bit differently:
>>
>>   - keep is_dep() untouched
>>   - rename is_dep() to is_dep_uncached()
>>   - add is_dep() as such:
>>
>>     def is_dep(pkg,pkg2,deps):
>>         try:
>>             return is_dep_cache_lookup(pkg,pkg2)
>>         except KeyError:
>>             val = is_dep_uncached(pkg, pkg2, deps)
>>             is_dep_cache_insert(pkg, pkg2, val)
>>             return val
>>
>> Not really tested, but should work I hope! ;-)
>>
>> Regards,
>> Yann E. MORIN.
>>
>>>  # This function eliminates transitive dependencies; for example, given
>>> --
>>> 2.6.4
>>>
>>
>> --
>> .-----------------.--------------------.------------------.--------------------.
>> |  Yann E. MORIN  | Real-Time Embedded | /"\ ASCII RIBBON | Erics' conspiracy: |
>> | +33 662 376 056 | Software  Designer | \ / CAMPAIGN     |  ___               |
>> | +33 223 225 172 `------------.-------:  X  AGAINST      |  \e/  There is no  |
>> | http://ymorin.is-a-geek.org/ | _/*\_ | / \ HTML MAIL    |   v   conspiracy.  |
>> '------------------------------^-------^------------------^--------------------'
>> _______________________________________________
>> buildroot mailing list
>> buildroot@busybox.net
>> http://lists.busybox.net/mailman/listinfo/buildroot
>
> Regards,
>
> --
> Samuel
Thomas Petazzoni Dec. 29, 2015, 10:28 p.m. UTC | #5
Hello,

On Tue, 15 Dec 2015 11:21:41 +0100, Thomas Petazzoni wrote:
> For large configurations, the execution time of
> remove_transitive_deps() becomes really high, due to the number of
> nested loops + the is_dep() function being recursive.
> 
> For an allyespackageconfig, the remove_extra_deps() function takes 334
> seconds to execute, and the overall time to generate the .dot file is
> 6 minutes and 39 seconds. Here is a timing of the different
> graph-depends steps and the overall execution time:
> 
>   Getting dependencies:   42.5735 seconds
>   Turn deps into a dict:   0.0023 seconds
>   Remove extra deps:     334.1542 seconds
>   Get version:            22.4919 seconds
>   Generate .dot:           0.0197 seconds
> 
>   real	6m39.289s
>   user	6m16.644s
>   sys	0m8.792s
> 
> By adding a very simple cache for the results of is_dep(), we bring
> down the execution time of the "Remove extra deps" step from 334
> seconds to just 4 seconds, reducing the overall execution time to 1
> minutes and 10 seconds:
> 
>   Getting dependencies:  42.9546 seconds
>   Turn deps into a dict:  0.0025 seconds
>   Remove extra deps:      4.9643 seconds
>   Get version:           22.1865 seconds
>   Generate .dot:          0.0207 seconds
> 
>   real	1m10.201s
>   user	0m47.716s
>   sys	0m7.948s
> 
> Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
> ---
>  support/scripts/graph-depends | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)

I've applied an improved version provided by Yann E. Morin,
implementing the various suggestions he has made (mainly using
exceptions, and making the code more pythonic).

Thanks!

Thomas
diff mbox

Patch

diff --git a/support/scripts/graph-depends b/support/scripts/graph-depends
index 5f77038..d39306e 100755
--- a/support/scripts/graph-depends
+++ b/support/scripts/graph-depends
@@ -237,16 +237,43 @@  for dep in dependencies:
         dict_deps[dep[0]] = []
     dict_deps[dep[0]].append(dep[1])
 
+# Basic cache for the results of the is_dep() function, in order to
+# optimize the execution time. The cache is a dict of dict of boolean
+# values. The key to the primary dict is "pkg", and the key of the
+# sub-dicts is "pkg2".
+is_dep_cache = {}
+
+def is_dep_cache_insert(pkg, pkg2, val):
+    if is_dep_cache.has_key(pkg):
+        is_dep_cache[pkg].update({pkg2: val})
+    else:
+        is_dep_cache[pkg] = {pkg2: val}
+
+# Returns a tuple (r, val), where r is a boolean that indicates if a
+# value was found in the cache or not, and val being the value found
+# (only when r is True).
+def is_dep_cache_lookup(pkg, pkg2):
+    if is_dep_cache.has_key(pkg):
+        if is_dep_cache[pkg].has_key(pkg2):
+            return (True, is_dep_cache[pkg][pkg2])
+    return (False, False)
+
 # This function return True if pkg is a dependency (direct or
 # transitive) of pkg2, dependencies being listed in the deps
 # dictionary. Returns False otherwise.
 def is_dep(pkg,pkg2,deps):
+    (r, val) = is_dep_cache_lookup(pkg, pkg2)
+    if r:
+        return val
     if pkg2 in deps:
         for p in deps[pkg2]:
             if pkg == p:
+                is_dep_cache_insert(pkg, pkg2, True)
                 return True
             if is_dep(pkg,p,deps):
+                is_dep_cache_insert(pkg, pkg2, True)
                 return True
+    is_dep_cache_insert(pkg, pkg2, False)
     return False
 
 # This function eliminates transitive dependencies; for example, given