ASoC: core: Don't schedule DAPM work if already in target state

Message ID 1534520143-29266-1-git-send-email-jonathanh@nvidia.com
State New

Commit Message

Jon Hunter Aug. 17, 2018, 3:35 p.m. UTC
When dapm_power_widgets() is called, the dapm_pre_sequence_async() and
dapm_post_sequence_async() functions are scheduled for all DAPM contexts
(apart from the card DAPM context) regardless of whether the DAPM
context is already in the desired state. The overhead of this is
significant, and it grows with the number of DAPM contexts.

For example, on the Tegra124 Jetson TK1, profiling the time taken to
execute dapm_power_widgets() gave the following results.

  Times for function dapm_power_widgets() are (us):
     Min 23, Ave 190, Max 434, Count 39

Here 'Count' is the number of times that dapm_power_widgets() was
called. Note that the above times were measured using ktime_get() to
log the time on entry to and exit from dapm_power_widgets(), so if the
function is preempted they may include more than its pure execution
time. However, after applying this patch and measuring the time taken
to execute dapm_power_widgets() again, a significant improvement is
seen as shown below.

  Times for function dapm_power_widgets() are (us):
     Min 4, Ave 16, Max 82, Count 39

Therefore, optimise the dapm_power_widgets() function by only scheduling
the dapm_pre/post_sequence_async() work if the DAPM context is not in
the desired state.

Signed-off-by: Jon Hunter <jonathanh@nvidia.com>
---
 sound/soc/soc-dapm.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
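
The timing numbers above came from ad-hoc instrumentation; a minimal
sketch of that approach (hypothetical wrapper, not part of the patch)
might look like:

```c
/*
 * Hypothetical sketch of the instrumentation described in the commit
 * message: record entry/exit timestamps with ktime_get() and print
 * the elapsed time in microseconds. Kernel-context code, shown for
 * illustration only.
 */
static int dapm_power_widgets_timed(struct snd_soc_card *card, int event)
{
	ktime_t start = ktime_get();
	int ret = dapm_power_widgets(card, event);

	pr_info("dapm_power_widgets() took %lld us\n",
		ktime_us_delta(ktime_get(), start));
	return ret;
}
```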

Comments

Charles Keepax Aug. 28, 2018, 10:39 a.m. UTC | #1
On Fri, Aug 17, 2018 at 04:35:43PM +0100, Jon Hunter wrote:
> When dapm_power_widgets() is called, the dapm_pre_sequence_async() and
> dapm_post_sequence_async() functions are scheduled for all DAPM contexts
> [...]

Looks ok to me:

Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>

Although that said the performance increase is pretty hard to
measure on my systems.

Thanks,
Charles
Jon Hunter Sept. 3, 2018, 2:45 p.m. UTC | #2
On 28/08/18 11:39, Charles Keepax wrote:
> On Fri, Aug 17, 2018 at 04:35:43PM +0100, Jon Hunter wrote:
>> [...]
> 
> Looks ok to me:
> 
> Reviewed-by: Charles Keepax <ckeepax@opensource.cirrus.com>
> 
> Although that said the performance increase is pretty hard to
> measure on my systems.

If you can enable the function graph tracer, then you should be able to
profile the dapm_power_widgets() function with it, as it will report
how long the function took [0].

Cheers
Jon

[0] https://lwn.net/Articles/370423/
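
Concretely, the function_graph profiling suggested above can be driven
through tracefs; a sketch (assumes tracefs is mounted at
/sys/kernel/debug/tracing and you are root):

```sh
# Restrict the function graph tracer to dapm_power_widgets()
cd /sys/kernel/debug/tracing
echo dapm_power_widgets > set_graph_function
echo function_graph > current_tracer
echo 1 > tracing_on
# ... trigger some audio activity, then stop and inspect:
echo 0 > tracing_on
cat trace    # each entry shows the function's duration in us
```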

Patch

diff --git a/sound/soc/soc-dapm.c b/sound/soc/soc-dapm.c
index 461d951917c0..2e9231aeabb9 100644
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -1953,7 +1953,7 @@  static int dapm_power_widgets(struct snd_soc_card *card, int event)
 	dapm_pre_sequence_async(&card->dapm, 0);
 	/* Run other bias changes in parallel */
 	list_for_each_entry(d, &card->dapm_list, list) {
-		if (d != &card->dapm)
+		if (d != &card->dapm && d->bias_level != d->target_bias_level)
 			async_schedule_domain(dapm_pre_sequence_async, d,
 						&async_domain);
 	}
@@ -1977,7 +1977,7 @@  static int dapm_power_widgets(struct snd_soc_card *card, int event)
 
 	/* Run all the bias changes in parallel */
 	list_for_each_entry(d, &card->dapm_list, list) {
-		if (d != &card->dapm)
+		if (d != &card->dapm && d->bias_level != d->target_bias_level)
 			async_schedule_domain(dapm_post_sequence_async, d,
 						&async_domain);
 	}