diff mbox series

[01/13] ASoC: soc-pcm: Don't reconnect an already active BE

Message ID 1630056839-6562-2-git-send-email-spujar@nvidia.com
State Changes Requested
Headers show
Series Extend AHUB audio support for Tegra210 and later | expand

Commit Message

Sameer Pujar Aug. 27, 2021, 9:33 a.m. UTC
In some cases, multiple FE components have the same BE component in their
respective DPCM paths. One such example would be a mixer component, which
can receive two or more inputs and sends a mixed output. In such cases,
to avoid reconfiguration of already active DAI (mixer output DAI in this
case), check the BE stream state to filter out the redundancy.

In summary, allow connection of BE if the respective current stream state
is either NEW or CLOSED.

Signed-off-by: Sameer Pujar <spujar@nvidia.com>
---
 sound/soc/soc-pcm.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Pierre-Louis Bossart Sept. 28, 2021, 9:25 p.m. UTC | #1
On 8/27/21 4:33 AM, Sameer Pujar wrote:
> In some cases, multiple FE components have the same BE component in their
> respective DPCM paths. One such example would be a mixer component, which
> can receive two or more inputs and sends a mixed output. In such cases,
> to avoid reconfiguration of already active DAI (mixer output DAI in this
> case), check the BE stream state to filter out the redundancy.
> 
> In summary, allow connection of BE if the respective current stream state
> is either NEW or CLOSED.

This patch breaks our SOF CI tests, ironically in all cases where we
have a mixer with a 'Deep buffer' port! The tests with multiple streams
all fail with this error:

[  124.366400]  Port0 Deep Buffer: ASoC: no backend DAIs enabled for
Port0 Deep Buffer
[  124.366406]  Port0 Deep Buffer: ASoC: dpcm_fe_dai_prepare() failed (-22)

Reverting this patch restore the mixer functionality.

I see multiple problems with this patch:

At a high-level, there's at least a race condition where this BE state
is checked without any protection. That was already a problem that I
highlighted in a recent RFC and still working on, when we have multiple
FEs we can have START/STOP triggers happening concurrently and it's
necessary to serialize these triggers when checking the state, and this
patch increases the 'surface' for this race condition.

But in addition we'd need to agree on what an 'active BE' is. Why can't
we connect a second stream while the first one is already in HW_PARAMS
or PAUSED or STOP? It's perfectly legal in ALSA/ASoC to have multiple
HW_PARAMS calls, and when we reach STOP we have to do a prepare again.

And more fundamentally, the ability to add a second FE on a 'active' BE
in START state is a basic requirement for a mixer, e.g. to play a
notification on one FE while listening to music on another. What needs
to happen is only to make sure that the FE and BE are compatible in
terms of HW_PARAMS and not sending a second TRIGGER_STOP, only checking
the BE NEW or CLOSE state is way too restrictive.

I will agree this sort of mixer use cases is not well supported in
soc-pcm.c, but let's not make it worse than it was before this patch,
shall we?

I can send a revert with the explanations in the commit message if there
is a consensus that this patch needs to be revisited.

[1] https://github.com/thesofproject/linux/pull/3177
[2] https://sof-ci.01.org/linuxpr/PR3177/build6440/devicetest/

> Signed-off-by: Sameer Pujar <spujar@nvidia.com>
> ---
>  sound/soc/soc-pcm.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
> index 48f71bb..e30cb5a 100644
> --- a/sound/soc/soc-pcm.c
> +++ b/sound/soc/soc-pcm.c
> @@ -1395,6 +1395,10 @@ static int dpcm_add_paths(struct snd_soc_pcm_runtime *fe, int stream,
>  		if (!fe->dpcm[stream].runtime && !fe->fe_compr)
>  			continue;
>  
> +		if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
> +		    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_CLOSE))
> +			continue;
> +
>  		/* newly connected FE and BE */
>  		err = dpcm_be_connect(fe, be, stream);
>  		if (err < 0) {
>
Sameer Pujar Sept. 29, 2021, 7:43 a.m. UTC | #2
On 9/29/2021 2:55 AM, Pierre-Louis Bossart wrote:
> On 8/27/21 4:33 AM, Sameer Pujar wrote:

[...]

> But in addition we'd need to agree on what an 'active BE' is. Why can't
> we connect a second stream while the first one is already in HW_PARAMS
> or PAUSED or STOP? It's perfectly legal in ALSA/ASoC to have multiple
> HW_PARAMS calls, and when we reach STOP we have to do a prepare again.
>
> And more fundamentally, the ability to add a second FE on a 'active' BE
> in START state is a basic requirement for a mixer, e.g. to play a
> notification on one FE while listening to music on another. What needs
> to happen is only to make sure that the FE and BE are compatible in
> terms of HW_PARAMS and not sending a second TRIGGER_STOP, only checking
> the BE NEW or CLOSE state is way too restrictive.

Sorry for the trouble to your system.

Idea was to avoid reconfiguration of the same BE DAI again, but not to 
stop the provision to add a subsequent FE. In fact I had tested mixing 
of streams coming from 10 different FEs.

In your case, because of this patch, looks like the subsequent FE is not 
finding a BE DAI since it is already active due to a prior FE. The 
reason it works at my end is because the mixer input and output DAIs are 
separated. Any new FE would just configure the mixer input DAI to which 
it is attached and skip already running/configured output DAI. I am just 
curious to know, if originally you were reconfiguring the BE DAI again 
with same parameters (for a second FE) or some additional configuration 
is done?


> I can send a revert with the explanations in the commit message if there
> is a consensus that this patch needs to be revisited.

May be this can be revisited since it appears to be a critical problem 
for your system. But I hope this discussion can be alive on following 
points for a better fix.

1. The original issue at my end was not just a configuration redundancy. 
I realize now that with more stream addition following error print is seen.
    "ASoC: too many users playback at open 4"

    This is because the max DPCM users is capped at 8. Increasing this 
may help (need to see what number is better), but does not address the 
redundancy problem.

2. If reconfiguration of the same BE is not necessary for a subsequent 
FE run, shouldn't we avoid the reconfig itself and somehow avoid FE failure?
Pierre-Louis Bossart Sept. 29, 2021, 2:52 p.m. UTC | #3
>>> But in addition we'd need to agree on what an 'active BE' is. Why can't
>>> we connect a second stream while the first one is already in HW_PARAMS
>>> or PAUSED or STOP? It's perfectly legal in ALSA/ASoC to have multiple
>>> HW_PARAMS calls, and when we reach STOP we have to do a prepare again.
>>>
>>> And more fundamentally, the ability to add a second FE on a 'active' BE
>>> in START state is a basic requirement for a mixer, e.g. to play a
>>> notification on one FE while listening to music on another. What needs
>>> to happen is only to make sure that the FE and BE are compatible in
>>> terms of HW_PARAMS and not sending a second TRIGGER_STOP, only checking
>>> the BE NEW or CLOSE state is way too restrictive.
>>
>> Sorry for the trouble to your system.
>>
>> Idea was to avoid reconfiguration of the same BE DAI again, but not to
>> stop the provision to add a subsequent FE. In fact I had tested mixing
>> of streams coming from 10 different FEs.

Can you describe the sequence that you used to start them? That may be
useful to understand the criteria you used?

>> In your case, because of this patch, looks like the subsequent FE is not
>> finding a BE DAI since it is already active due to a prior FE. The
>> reason it works at my end is because the mixer input and output DAIs are
>> separated. Any new FE would just configure the mixer input DAI to which
>> it is attached and skip already running/configured output DAI. 

If you want to visualize the topology we're using in our 'nocodec'
tests, see

https://sof-ci.01.org/linuxpr/PR3177/build6440/devicetest/?model=ADLP_RVP_NOCODEC&testcase=verify-kernel-boot-log

and use the sof-adl-nocodec.

it's a bog-standard topology with processing entities connected by buffers.

I don't fully understand the notion of mixer input DAI, in our case we
have a bunch of PCM devices connected to a mixer. The mixer is not
directly connected to a DAI.

> The problem as I see is that with this patch one can not connect a new
> FE to a BE which is _not_ in NEW or CLOSE state.
> 
> The FE and BE needs to be connected to have DPCM working and this patch
> makes the code to skip the dpcm_be_connect().
> 
> Consider this simple setup:
> 
> FE1 -->|
>        | --> BE -->
> FE2- ->|
> 
> First we start FE1, dpcm_be_connect(FE1, BE, stream) is made.
> 
> Later FE2 is started but dpcm_be_connect(FE2, BE, stream) would be not
> made because BE is no longer in NEW/CLOSE state.

I share Peter's analysis, there cannot be any restrictions on
connections - at any time. A mixer input might become active and be
added to the mix. We might have a temporary lock to delay new
connections but cannot not reject them outright based on BE state.

>> I am just
>> curious to know, if originally you were reconfiguring the BE DAI again
>> with same parameters (for a second FE) or some additional configuration
>> is done?

That's a different question - and a good one.

In the case of a mixer, the propagation of hw_params is a broken
concept. It leads to the first FE configuring the BE to define its
preferred parameters, e.g. S16_LE format. If later on a second FE is
started which could play S24_LE, the mixer and BE are already configured
to a lower resolution. A mixer should really have its own parameters and
be the start of a new 'domain' - as Lars described it several years ago
at the audio miniconference.

For now, the only restriction that we could enforce is that the BE
cannot be reconfigured after the prepare step.

Note that our DAIs tolerate multiple calls to hw_params. If you have a
case where the hw_params allocates resources, maybe you should consider
moving that allocation to the prepare step, or free them if you detect a
reconfiguration. That would be something needed even outside of the DPCM
scope. Similarly you need to support the case where the DAI hw_free is
called without hw_params ever being called, it's a known sequence that
can happen if the FE hw-params fails.

>>> I can send a revert with the explanations in the commit message if there
>>> is a consensus that this patch needs to be revisited.
>>
>> May be this can be revisited since it appears to be a critical problem
>> for your system. But I hope this discussion can be alive on following
>> points for a better fix.
>>
>> 1. The original issue at my end was not just a configuration redundancy.
>> I realize now that with more stream addition following error print is seen.
>>    "ASoC: too many users playback at open 4"
>>
>>    This is because the max DPCM users is capped at 8. Increasing this
>> may help (need to see what number is better), but does not address the
>> redundancy problem.

we haven't used more than 2 users, but it's already broken at 2 with
race conditions left and right. I am really surprised you managed to
have more than 2 without hitting inconsistent states - our automated
play/stop/pause monkey tests reliably break DPCM in less than 20s.

>> 2. If reconfiguration of the same BE is not necessary for a subsequent
>> FE run, shouldn't we avoid the reconfig itself and somehow avoid FE
>> failure?
> 
> I'm not sure, but it might be possible to just skip the
> dpcm_set_be_update_state(be, stream, SND_SOC_DPCM_UPDATE_BE);
> call at the end of the loop, but the question is under which condition?
> Can a BE asked to be reconfigured when STOP/OPEN/HW_PARAMS?
> 
> Skipping the connect does not sound right for a new FE-BE connection.

The reconfiguration is one problem, but what also happens is that the BE
dailink will see multiple triggers. I've been playing with refcounts to
force consistency and make sure there is only one TRIGGER_START send to
the dailink, and conversely there are cases where the TRIGGER_STOP is
never sent...
Sameer Pujar Sept. 30, 2021, 7:57 a.m. UTC | #4
On 9/29/2021 8:22 PM, Pierre-Louis Bossart wrote:
>>>> But in addition we'd need to agree on what an 'active BE' is. Why can't
>>>> we connect a second stream while the first one is already in HW_PARAMS
>>>> or PAUSED or STOP? It's perfectly legal in ALSA/ASoC to have multiple
>>>> HW_PARAMS calls, and when we reach STOP we have to do a prepare again.
>>>>
>>>> And more fundamentally, the ability to add a second FE on a 'active' BE
>>>> in START state is a basic requirement for a mixer, e.g. to play a
>>>> notification on one FE while listening to music on another. What needs
>>>> to happen is only to make sure that the FE and BE are compatible in
>>>> terms of HW_PARAMS and not sending a second TRIGGER_STOP, only checking
>>>> the BE NEW or CLOSE state is way too restrictive.
>>> Sorry for the trouble to your system.
>>>
>>> Idea was to avoid reconfiguration of the same BE DAI again, but not to
>>> stop the provision to add a subsequent FE. In fact I had tested mixing
>>> of streams coming from 10 different FEs.
> Can you describe the sequence that you used to start them? That may be
> useful to understand the criteria you used?

I have something like this:

FE1  --> Crossbar -> Mixer Input1    |
FE2  --> Crossbar -> Mixer Input2    |
...                                  | --> Mixer Output -->
... |
FE10 --> Crossbar -> Mixer Input10   |

All these FEs are started one after the other. This is an example of 
10x1. Similarly we can have 2x1, 3x1 etc.,
In our system, the crossbar [0] and mixer [1] are separate ASoC 
components. Basically audio paths consist of a group of ASoC components 
which are connected back to back.

[0] 
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git/tree/sound/soc/tegra/tegra210_ahub.c
[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git/tree/sound/soc/tegra/tegra210_mixer.c

>

[...]

> I don't fully understand the notion of mixer input DAI, in our case we
> have a bunch of PCM devices connected to a mixer. The mixer is not
> directly connected to a DAI.

Please see above mixer example. Since mixer is a separate ASoC 
component, it exposes 10 inputs (or DAIs) and outputs. Originally what I 
wanted to do was, for subsequent FE runs (FE2, FE3 ...) mixer output 
need not be configured again and again.

>> The problem as I see is that with this patch one can not connect a new
>> FE to a BE which is _not_ in NEW or CLOSE state.
>>
>> The FE and BE needs to be connected to have DPCM working and this patch
>> makes the code to skip the dpcm_be_connect().
>>
>> Consider this simple setup:
>>
>> FE1 -->|
>>         | --> BE -->
>> FE2- ->|
>>
>> First we start FE1, dpcm_be_connect(FE1, BE, stream) is made.
>>
>> Later FE2 is started but dpcm_be_connect(FE2, BE, stream) would be not
>> made because BE is no longer in NEW/CLOSE state.
> I share Peter's analysis, there cannot be any restrictions on
> connections - at any time. A mixer input might become active and be
> added to the mix. We might have a temporary lock to delay new
> connections but cannot not reject them outright based on BE state.

Yes, I understand how this affects a system like yours. As per mixer 
example above, in our case subsequent FEs always find BE from Crossbar. 
That is why I don't see similar error.

>>> I am just
>>> curious to know, if originally you were reconfiguring the BE DAI again
>>> with same parameters (for a second FE) or some additional configuration
>>> is done?
> That's a different question - and a good one.
>
> In the case of a mixer, the propagation of hw_params is a broken
> concept. It leads to the first FE configuring the BE to define its
> preferred parameters, e.g. S16_LE format. If later on a second FE is
> started which could play S24_LE, the mixer and BE are already configured
> to a lower resolution. A mixer should really have its own parameters and
> be the start of a new 'domain' - as Lars described it several years ago
> at the audio miniconference.

Propagation is one of the problems we want to address and require help 
from DPCM experts. But the scenario you mentioned is a special case 
which need not be supported, because mixer can operate in one 
configuration at a given time and subsequent FEs should agree to the 
already running configuration. However mixer should support both S16_LE 
and S24_LE (whenever possible), but not simultaneously. At least this is 
the expecation from our systems. Yes mixer may require fixup of a 
specific config (we earlier had proposed mixer controls to configure 
mixer parameters, but the idea was disliked), but propagation may help 
avoid fixing up everywhere in the audio path where it is not really 
required. But I don't know how this can be done at the moment.

>
> For now, the only restriction that we could enforce is that the BE
> cannot be reconfigured after the prepare step.
>
> Note that our DAIs tolerate multiple calls to hw_params. If you have a
> case where the hw_params allocates resources, maybe you should consider
> moving that allocation to the prepare step, or free them if you detect a
> reconfiguration. That would be something needed even outside of the DPCM
> scope. Similarly you need to support the case where the DAI hw_free is
> called without hw_params ever being called, it's a known sequence that
> can happen if the FE hw-params fails.

Currently this does not seem to be a problem for us. Patch was to avoid 
reconfiguration which was felt to be redundant for our system.

>>>> I can send a revert with the explanations in the commit message if there
>>>> is a consensus that this patch needs to be revisited.
>>> May be this can be revisited since it appears to be a critical problem
>>> for your system. But I hope this discussion can be alive on following
>>> points for a better fix.
>>>
>>> 1. The original issue at my end was not just a configuration redundancy.
>>> I realize now that with more stream addition following error print is seen.
>>>     "ASoC: too many users playback at open 4"
>>>
>>>     This is because the max DPCM users is capped at 8. Increasing this
>>> may help (need to see what number is better), but does not address the
>>> redundancy problem.
> we haven't used more than 2 users, but it's already broken at 2 with
> race conditions left and right. I am really surprised you managed to
> have more than 2 without hitting inconsistent states - our automated
> play/stop/pause monkey tests reliably break DPCM in less than 20s.

I am not sure what is the exact difference, may be DPCM usage in our 
case is different from what you have. I have mixer tests for different 
combinations (2x1, 3x1 ...), which seem to pass. In general, we want to 
have path like this.

FE -> BE1 -> BE2 -> ... -> BEx

Each BEx could be a mixer, resampler etc., Currently DPCM core sees this 
as multiple BEs for a given FE and that is why multiple "users" are 
reported.

In the interim, may be we can have following patch to keep both systems 
working and keep the discussion going to address the oustanding 
requirements/issues?

diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index ab25f99..0fbab50 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1395,7 +1395,13 @@ static int dpcm_add_paths(struct 
snd_soc_pcm_runtime *fe, int stream,
                 if (!fe->dpcm[stream].runtime && !fe->fe_compr)
                         continue;

-               if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
+               /*
+                * Filter for systems with 'component_chaining' enabled.
+                * This helps to avoid unnecessary re-configuration of an
+                * already active BE on such systems.
+                */
+               if (fe->card->component_chaining &&
+                   (be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
                     (be->dpcm[stream].state != SND_SOC_DPCM_STATE_CLOSE))
                         continue;

>
>>> 2. If reconfiguration of the same BE is not necessary for a subsequent
>>> FE run, shouldn't we avoid the reconfig itself and somehow avoid FE
>>> failure?
>> I'm not sure, but it might be possible to just skip the
>> dpcm_set_be_update_state(be, stream, SND_SOC_DPCM_UPDATE_BE);
>> call at the end of the loop, but the question is under which condition?
>> Can a BE asked to be reconfigured when STOP/OPEN/HW_PARAMS?
>>
>> Skipping the connect does not sound right for a new FE-BE connection.
> The reconfiguration is one problem, but what also happens is that the BE
> dailink will see multiple triggers. I've been playing with refcounts to
> force consistency and make sure there is only one TRIGGER_START send to
> the dailink, and conversely there are cases where the TRIGGER_STOP is
> never sent...
Just a thought. FE links have dummy codec DAI and core wants to find a 
real BE when FE is started. May be don't fail a FE when no back end DAI 
is found (and/or find if the same BE is already connected to some FE) 
and the above problem becomes simpler?
Pierre-Louis Bossart Sept. 30, 2021, 2:34 p.m. UTC | #5
>>>>> But in addition we'd need to agree on what an 'active BE' is. Why
>>>>> can't
>>>>> we connect a second stream while the first one is already in HW_PARAMS
>>>>> or PAUSED or STOP? It's perfectly legal in ALSA/ASoC to have multiple
>>>>> HW_PARAMS calls, and when we reach STOP we have to do a prepare again.
>>>>>
>>>>> And more fundamentally, the ability to add a second FE on a
>>>>> 'active' BE
>>>>> in START state is a basic requirement for a mixer, e.g. to play a
>>>>> notification on one FE while listening to music on another. What needs
>>>>> to happen is only to make sure that the FE and BE are compatible in
>>>>> terms of HW_PARAMS and not sending a second TRIGGER_STOP, only
>>>>> checking
>>>>> the BE NEW or CLOSE state is way too restrictive.
>>>> Sorry for the trouble to your system.
>>>>
>>>> Idea was to avoid reconfiguration of the same BE DAI again, but not to
>>>> stop the provision to add a subsequent FE. In fact I had tested mixing
>>>> of streams coming from 10 different FEs.
>> Can you describe the sequence that you used to start them? That may be
>> useful to understand the criteria you used?
> 
> I have something like this:
> 
> FE1  --> Crossbar -> Mixer Input1    |
> FE2  --> Crossbar -> Mixer Input2    |
> ...                                  | --> Mixer Output -->
> ... |
> FE10 --> Crossbar -> Mixer Input10   |
> 
> All these FEs are started one after the other. This is an example of
> 10x1. Similarly we can have 2x1, 3x1 etc.,
> In our system, the crossbar [0] and mixer [1] are separate ASoC
> components. Basically audio paths consist of a group of ASoC components
> which are connected back to back.

Not following. Can you explain how starting FE1 does not change the
state of the mixer output then?

Or is each 'Crossbar' instance a full-blown BE? In that case you have a
1:1 mapping between FE and BE, a *really* simple topology...

>> I don't fully understand the notion of mixer input DAI, in our case we
>> have a bunch of PCM devices connected to a mixer. The mixer is not
>> directly connected to a DAI.
> 
> Please see above mixer example. Since mixer is a separate ASoC
> component, it exposes 10 inputs (or DAIs) and outputs. Originally what I
> wanted to do was, for subsequent FE runs (FE2, FE3 ...) mixer output
> need not be configured again and again.
> 
>>> The problem as I see is that with this patch one can not connect a new
>>> FE to a BE which is _not_ in NEW or CLOSE state.
>>>
>>> The FE and BE needs to be connected to have DPCM working and this patch
>>> makes the code to skip the dpcm_be_connect().
>>>
>>> Consider this simple setup:
>>>
>>> FE1 -->|
>>>         | --> BE -->
>>> FE2- ->|
>>>
>>> First we start FE1, dpcm_be_connect(FE1, BE, stream) is made.
>>>
>>> Later FE2 is started but dpcm_be_connect(FE2, BE, stream) would be not
>>> made because BE is no longer in NEW/CLOSE state.
>> I share Peter's analysis, there cannot be any restrictions on
>> connections - at any time. A mixer input might become active and be
>> added to the mix. We might have a temporary lock to delay new
>> connections but cannot not reject them outright based on BE state.
> 
> Yes, I understand how this affects a system like yours. As per mixer
> example above, in our case subsequent FEs always find BE from Crossbar.
> That is why I don't see similar error.

Not following either.

>>>> I am just
>>>> curious to know, if originally you were reconfiguring the BE DAI again
>>>> with same parameters (for a second FE) or some additional configuration
>>>> is done?
>> That's a different question - and a good one.
>>
>> In the case of a mixer, the propagation of hw_params is a broken
>> concept. It leads to the first FE configuring the BE to define its
>> preferred parameters, e.g. S16_LE format. If later on a second FE is
>> started which could play S24_LE, the mixer and BE are already configured
>> to a lower resolution. A mixer should really have its own parameters and
>> be the start of a new 'domain' - as Lars described it several years ago
>> at the audio miniconference.
> 
> Propagation is one of the problems we want to address and require help
> from DPCM experts. But the scenario you mentioned is a special case
> which need not be supported, because mixer can operate in one
> configuration at a given time and subsequent FEs should agree to the
> already running configuration. However mixer should support both S16_LE
> and S24_LE (whenever possible), but not simultaneously. At least this is
> the expecation from our systems. Yes mixer may require fixup of a
> specific config (we earlier had proposed mixer controls to configure
> mixer parameters, but the idea was disliked), but propagation may help
> avoid fixing up everywhere in the audio path where it is not really
> required. But I don't know how this can be done at the moment.

What I am saying is that the mixer should be pre-configured with the
desired resolution/sample rate, and some adaptation needs to happen if
the FE provides data in a different format.

This is similar to what sound servers typically do on their sinks, they
define ONE configuration. Dynamic changes are annoying and result in
corner cases where the quality can vary depending on which FE is started
first.

>>>> For now, the only restriction that we could enforce is that the BE
>> cannot be reconfigured after the prepare step.
>>
>> Note that our DAIs tolerate multiple calls to hw_params. If you have a
>> case where the hw_params allocates resources, maybe you should consider
>> moving that allocation to the prepare step, or free them if you detect a
>> reconfiguration. That would be something needed even outside of the DPCM
>> scope. Similarly you need to support the case where the DAI hw_free is
>> called without hw_params ever being called, it's a known sequence that
>> can happen if the FE hw-params fails.
> 
> Currently this does not seem to be a problem for us. Patch was to avoid
> reconfiguration which was felt to be redundant for our system.
> 
>>>>> I can send a revert with the explanations in the commit message if
>>>>> there
>>>>> is a consensus that this patch needs to be revisited.
>>>> May be this can be revisited since it appears to be a critical problem
>>>> for your system. But I hope this discussion can be alive on following
>>>> points for a better fix.
>>>>
>>>> 1. The original issue at my end was not just a configuration
>>>> redundancy.
>>>> I realize now that with more stream addition following error print
>>>> is seen.
>>>>     "ASoC: too many users playback at open 4"
>>>>
>>>>     This is because the max DPCM users is capped at 8. Increasing this
>>>> may help (need to see what number is better), but does not address the
>>>> redundancy problem.
>> we haven't used more than 2 users, but it's already broken at 2 with
>> race conditions left and right. I am really surprised you managed to
>> have more than 2 without hitting inconsistent states - our automated
>> play/stop/pause monkey tests reliably break DPCM in less than 20s.
> 
> I am not sure what is the exact difference, may be DPCM usage in our
> case is different from what you have. I have mixer tests for different
> combinations (2x1, 3x1 ...), which seem to pass. In general, we want to
> have path like this.
> 
> FE -> BE1 -> BE2 -> ... -> BEx
> 
> Each BEx could be a mixer, resampler etc., Currently DPCM core sees this
> as multiple BEs for a given FE and that is why multiple "users" are
> reported.

This sort of flow vastly exceeds the capabilities of DPCM, which is
already badly broken with one BE and 2 FEs... I think what you want is
what Lars described at the audio miniconf with 'domains'.

> In the interim, may be we can have following patch to keep both systems
> working and keep the discussion going to address the oustanding
> requirements/issues?
> 
> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
> index ab25f99..0fbab50 100644
> --- a/sound/soc/soc-pcm.c
> +++ b/sound/soc/soc-pcm.c
> @@ -1395,7 +1395,13 @@ static int dpcm_add_paths(struct
> snd_soc_pcm_runtime *fe, int stream,
>                 if (!fe->dpcm[stream].runtime && !fe->fe_compr)
>                         continue;
> 
> -               if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
> +               /*
> +                * Filter for systems with 'component_chaining' enabled.
> +                * This helps to avoid unnecessary re-configuration of an
> +                * already active BE on such systems.
> +                */
> +               if (fe->card->component_chaining &&
> +                   (be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
>                     (be->dpcm[stream].state != SND_SOC_DPCM_STATE_CLOSE))
>                         continue;

that wouldn't work. We need to support the STOP and START cases as well.


>>>> 2. If reconfiguration of the same BE is not necessary for a subsequent
>>>> FE run, shouldn't we avoid the reconfig itself and somehow avoid FE
>>>> failure?
>>> I'm not sure, but it might be possible to just skip the
>>> dpcm_set_be_update_state(be, stream, SND_SOC_DPCM_UPDATE_BE);
>>> call at the end of the loop, but the question is under which condition?
>>> Can a BE asked to be reconfigured when STOP/OPEN/HW_PARAMS?
>>>
>>> Skipping the connect does not sound right for a new FE-BE connection.
>> The reconfiguration is one problem, but what also happens is that the BE
>> dailink will see multiple triggers. I've been playing with refcounts to
>> force consistency and make sure there is only one TRIGGER_START send to
>> the dailink, and conversely there are cases where the TRIGGER_STOP is
>> never sent...
> Just a thought. FE links have dummy codec DAI and core wants to find a
> real BE when FE is started. May be don't fail a FE when no back end DAI
> is found (and/or find if the same BE is already connected to some FE)
> and the above problem becomes simpler?

That would be just moving the problem. In our case we would be silently
playing on a dummy output just because the correct output was not found
due to state handling issues.
Sameer Pujar Sept. 30, 2021, 3:35 p.m. UTC | #6
On 9/30/2021 8:04 PM, Pierre-Louis Bossart wrote:
>>>>>> But in addition we'd need to agree on what an 'active BE' is. Why
>>>>>> can't
>>>>>> we connect a second stream while the first one is already in HW_PARAMS
>>>>>> or PAUSED or STOP? It's perfectly legal in ALSA/ASoC to have multiple
>>>>>> HW_PARAMS calls, and when we reach STOP we have to do a prepare again.
>>>>>>
>>>>>> And more fundamentally, the ability to add a second FE on a
>>>>>> 'active' BE
>>>>>> in START state is a basic requirement for a mixer, e.g. to play a
>>>>>> notification on one FE while listening to music on another. What needs
>>>>>> to happen is only to make sure that the FE and BE are compatible in
>>>>>> terms of HW_PARAMS and not sending a second TRIGGER_STOP, only
>>>>>> checking
>>>>>> the BE NEW or CLOSE state is way too restrictive.
>>>>> Sorry for the trouble to your system.
>>>>>
>>>>> Idea was to avoid reconfiguration of the same BE DAI again, but not to
>>>>> stop the provision to add a subsequent FE. In fact I had tested mixing
>>>>> of streams coming from 10 different FEs.
>>> Can you describe the sequence that you used to start them? That may be
>>> useful to understand the criteria you used?
>> I have something like this:
>>
>> FE1  --> Crossbar -> Mixer Input1    |
>> FE2  --> Crossbar -> Mixer Input2    |
>> ...                                  | --> Mixer Output -->
>> ... |
>> FE10 --> Crossbar -> Mixer Input10   |
>>
>> All these FEs are started one after the other. This is an example of
>> 10x1. Similarly we can have 2x1, 3x1 etc.,
>> In our system, the crossbar [0] and mixer [1] are separate ASoC
>> components. Basically audio paths consist of a group of ASoC components
>> which are connected back to back.
> Not following. Can you explain how starting FE1 does not change the
> state of the mixer output then?
>
> Or is each 'Crossbar' instance a full-blown BE? In that case you have a
> 1:1 mapping between FE and BE, a *really* simple topology...

Yes 'Crossbar' exposes multiple ports and it is 1:1 mapping with FE. 
Starting FE1 does configure mixer output.

>>> I don't fully understand the notion of mixer input DAI, in our case we
>>> have a bunch of PCM devices connected to a mixer. The mixer is not
>>> directly connected to a DAI.
>> Please see above mixer example. Since mixer is a separate ASoC
>> component, it exposes 10 inputs (or DAIs) and outputs. Originally what I
>> wanted to do was, for subsequent FE runs (FE2, FE3 ...) mixer output
>> need not be configured again and again.
>>
>>>> The problem as I see is that with this patch one can not connect a new
>>>> FE to a BE which is _not_ in NEW or CLOSE state.
>>>>
>>>> The FE and BE needs to be connected to have DPCM working and this patch
>>>> makes the code to skip the dpcm_be_connect().
>>>>
>>>> Consider this simple setup:
>>>>
>>>> FE1 -->|
>>>>          | --> BE -->
>>>> FE2- ->|
>>>>
>>>> First we start FE1, dpcm_be_connect(FE1, BE, stream) is made.
>>>>
>>>> Later FE2 is started but dpcm_be_connect(FE2, BE, stream) would be not
>>>> made because BE is no longer in NEW/CLOSE state.
>>> I share Peter's analysis, there cannot be any restrictions on
>>> connections - at any time. A mixer input might become active and be
>>> added to the mix. We might have a temporary lock to delay new
>>> connections but cannot not reject them outright based on BE state.
>> Yes, I understand how this affects a system like yours. As per mixer
>> example above, in our case subsequent FEs always find BE from Crossbar.
>> That is why I don't see similar error.
> Not following either.

May be it is understandable now with above crossbar point?

>>>>> I am just
>>>>> curious to know, if originally you were reconfiguring the BE DAI again
>>>>> with same parameters (for a second FE) or some additional configuration
>>>>> is done?
>>> That's a different question - and a good one.
>>>
>>> In the case of a mixer, the propagation of hw_params is a broken
>>> concept. It leads to the first FE configuring the BE to define its
>>> preferred parameters, e.g. S16_LE format. If later on a second FE is
>>> started which could play S24_LE, the mixer and BE are already configured
>>> to a lower resolution. A mixer should really have its own parameters and
>>> be the start of a new 'domain' - as Lars described it several years ago
>>> at the audio miniconference.
>> Propagation is one of the problems we want to address and require help
>> from DPCM experts. But the scenario you mentioned is a special case
>> which need not be supported, because mixer can operate in one
>> configuration at a given time and subsequent FEs should agree to the
>> already running configuration. However mixer should support both S16_LE
>> and S24_LE (whenever possible), but not simultaneously. At least this is
>> the expecation from our systems. Yes mixer may require fixup of a
>> specific config (we earlier had proposed mixer controls to configure
>> mixer parameters, but the idea was disliked), but propagation may help
>> avoid fixing up everywhere in the audio path where it is not really
>> required. But I don't know how this can be done at the moment.
> What I am saying is that the mixer should be pre-configured with the
> desired resolution/sample rate, and some adaptation needs to happen if
> the FE provides data in a different format.
>
> This is similar to what sound servers typically do on their sinks, they
> define ONE configuration. Dynamic changes are annoying and result in
> corner cases where the quality can vary depending on which FE is started
> first.

When there are multiple FEs running, yes it is better to run on a 
pre-agreed configuration to minimize the side effects of race between 
FEs. Also there should also be a provision where mixer params directly 
depend on FEs. For example, a 2x1 mixer can mix two 16-bit streams at 
one time and the other time it can mix two 32-bit streams.


[...]

>>>>>> I can send a revert with the explanations in the commit message if
>>>>>> there
>>>>>> is a consensus that this patch needs to be revisited.
>>>>> May be this can be revisited since it appears to be a critical problem
>>>>> for your system. But I hope this discussion can be alive on following
>>>>> points for a better fix.
>>>>>
>>>>> 1. The original issue at my end was not just a configuration
>>>>> redundancy.
>>>>> I realize now that with more stream addition following error print
>>>>> is seen.
>>>>>      "ASoC: too many users playback at open 4"
>>>>>
>>>>>      This is because the max DPCM users is capped at 8. Increasing this
>>>>> may help (need to see what number is better), but does not address the
>>>>> redundancy problem.
>>> we haven't used more than 2 users, but it's already broken at 2 with
>>> race conditions left and right. I am really surprised you managed to
>>> have more than 2 without hitting inconsistent states - our automated
>>> play/stop/pause monkey tests reliably break DPCM in less than 20s.
>> I am not sure what is the exact difference, may be DPCM usage in our
>> case is different from what you have. I have mixer tests for different
>> combinations (2x1, 3x1 ...), which seem to pass. In general, we want to
>> have path like this.
>>
>> FE -> BE1 -> BE2 -> ... -> BEx
>>
>> Each BEx could be a mixer, resampler etc., Currently DPCM core sees this
>> as multiple BEs for a given FE and that is why multiple "users" are
>> reported.
> This sort of flow vastly exceeds the capabilities of DPCM, which is
> already badly broken with one BE and 2 FEs... I think what you want is
> what Lars described at the audio miniconf with 'domains'.

May be the core would require enhancements to fully support such scheme. 
But so far the system is running well for below path:
FE -> BE1 (crossbar) -> BE2 (I2S) -> BE3 (external codec)

I could introduce more BEs like resampler or mixer in the path and 
results seem to be good.

BTW, I don't know what 'domains' mean. I will be curious to know what 
this exactly is. If someone is already using it, a usage reference can help.

>> In the interim, may be we can have following patch to keep both systems
>> working and keep the discussion going to address the oustanding
>> requirements/issues?
>>
>> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
>> index ab25f99..0fbab50 100644
>> --- a/sound/soc/soc-pcm.c
>> +++ b/sound/soc/soc-pcm.c
>> @@ -1395,7 +1395,13 @@ static int dpcm_add_paths(struct
>> snd_soc_pcm_runtime *fe, int stream,
>>                  if (!fe->dpcm[stream].runtime && !fe->fe_compr)
>>                          continue;
>>
>> -               if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
>> +               /*
>> +                * Filter for systems with 'component_chaining' enabled.
>> +                * This helps to avoid unnecessary re-configuration of an
>> +                * already active BE on such systems.
>> +                */
>> +               if (fe->card->component_chaining &&
>> +                   (be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
>>                      (be->dpcm[stream].state != SND_SOC_DPCM_STATE_CLOSE))
>>                          continue;
> that wouldn't work. We need to support the STOP and START cases as well.
>

I meant with flag 'fe->card->component_chaining', which is currently 
used by Tegra audio only.

>>>>> 2. If reconfiguration of the same BE is not necessary for a subsequent
>>>>> FE run, shouldn't we avoid the reconfig itself and somehow avoid FE
>>>>> failure?
>>>> I'm not sure, but it might be possible to just skip the
>>>> dpcm_set_be_update_state(be, stream, SND_SOC_DPCM_UPDATE_BE);
>>>> call at the end of the loop, but the question is under which condition?
>>>> Can a BE asked to be reconfigured when STOP/OPEN/HW_PARAMS?
>>>>
>>>> Skipping the connect does not sound right for a new FE-BE connection.
>>> The reconfiguration is one problem, but what also happens is that the BE
>>> dailink will see multiple triggers. I've been playing with refcounts to
>>> force consistency and make sure there is only one TRIGGER_START send to
>>> the dailink, and conversely there are cases where the TRIGGER_STOP is
>>> never sent...
>> Just a thought. FE links have dummy codec DAI and core wants to find a
>> real BE when FE is started. May be don't fail a FE when no back end DAI
>> is found (and/or find if the same BE is already connected to some FE)
>> and the above problem becomes simpler?
> That would be just moving the problem. In our case we would be silently
> playing on a dummy output just because the correct output was not found
> due to state handling issues.

OK. In our case, application would report error since the frames would 
never get consumed for given FE due to unavailable BE.
Pierre-Louis Bossart Sept. 30, 2021, 4:13 p.m. UTC | #7
>>>> Can you describe the sequence that you used to start them? That may be
>>>> useful to understand the criteria you used?
>>> I have something like this:
>>>
>>> FE1  --> Crossbar -> Mixer Input1    |
>>> FE2  --> Crossbar -> Mixer Input2    |
>>> ...                                  | --> Mixer Output -->
>>> ... |
>>> FE10 --> Crossbar -> Mixer Input10   |
>>>
>>> All these FEs are started one after the other. This is an example of
>>> 10x1. Similarly we can have 2x1, 3x1 etc.,
>>> In our system, the crossbar [0] and mixer [1] are separate ASoC
>>> components. Basically audio paths consist of a group of ASoC components
>>> which are connected back to back.
>> Not following. Can you explain how starting FE1 does not change the
>> state of the mixer output then?
>>
>> Or is each 'Crossbar' instance a full-blown BE? In that case you have a
>> 1:1 mapping between FE and BE, a *really* simple topology...
> 
> Yes 'Crossbar' exposes multiple ports and it is 1:1 mapping with FE.
> Starting FE1 does configure mixer output.

Ah ok, now I get the difference with the N:1 topology we used.
Thanks for explaining this.

>>> In the interim, may be we can have following patch to keep both systems
>>> working and keep the discussion going to address the oustanding
>>> requirements/issues?
>>>
>>> diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
>>> index ab25f99..0fbab50 100644
>>> --- a/sound/soc/soc-pcm.c
>>> +++ b/sound/soc/soc-pcm.c
>>> @@ -1395,7 +1395,13 @@ static int dpcm_add_paths(struct
>>> snd_soc_pcm_runtime *fe, int stream,
>>>                  if (!fe->dpcm[stream].runtime && !fe->fe_compr)
>>>                          continue;
>>>
>>> -               if ((be->dpcm[stream].state !=
>>> SND_SOC_DPCM_STATE_NEW) &&
>>> +               /*
>>> +                * Filter for systems with 'component_chaining' enabled.
>>> +                * This helps to avoid unnecessary re-configuration
>>> of an
>>> +                * already active BE on such systems.
>>> +                */
>>> +               if (fe->card->component_chaining &&
>>> +                   (be->dpcm[stream].state !=
>>> SND_SOC_DPCM_STATE_NEW) &&
>>>                      (be->dpcm[stream].state !=
>>> SND_SOC_DPCM_STATE_CLOSE))
>>>                          continue;
>> that wouldn't work. We need to support the STOP and START cases as well.
>>
> 
> I meant with flag 'fe->card->component_chaining', which is currently
> used by Tegra audio only.

Ah yes, this may be a temporary solution that gets us both back to a
'working solution'. Let me give it a try.
Good discussion, thanks!
-Pierre
Pierre-Louis Bossart Sept. 30, 2021, 7 p.m. UTC | #8
> 1. The original issue at my end was not just a configuration redundancy.
> I realize now that with more stream addition following error print is seen.
>    "ASoC: too many users playback at open 4"
> 
>    This is because the max DPCM users is capped at 8. Increasing this
> may help (need to see what number is better), but does not address the
> redundancy problem.
Going back to this DPCM_MAX_BE_USERS definition, it seems rather
arbitrary and not so useful indeed.

	/* first time the dpcm is open ? */
	if (be->dpcm[stream].users == DPCM_MAX_BE_USERS) {
		dev_err(be->dev, "ASoC: too many users %s at open %d\n",
			stream ? "capture" : "playback",
			be->dpcm[stream].state);
		continue;
	}

The comment is no longer aligned with the code, wondering if this is a
feature or a bug. There's no reason to arbitrarily restrict the number
of users of a BE, or the check would need to use platform-specific
information such as the number of inputs/outputs supported by a mixer/demux.

Maybe Morimoto-san can comment since this was added in:

1db19c151819 ('ASoC: soc-pcm: fixup dpcm_be_dai_startup() user count')

We're not done with soc-pcm.c cleanups :-)
Sameer Pujar Oct. 4, 2021, 4:38 a.m. UTC | #9
On 10/1/2021 12:30 AM, Pierre-Louis Bossart wrote:
>> 1. The original issue at my end was not just a configuration redundancy.
>> I realize now that with more stream addition following error print is seen.
>>     "ASoC: too many users playback at open 4"
>>
>>     This is because the max DPCM users is capped at 8. Increasing this
>> may help (need to see what number is better), but does not address the
>> redundancy problem.
> Going back to this DPCM_MAX_BE_USERS definition, it seems rather
> arbitrary and not so useful indeed.

>          /* first time the dpcm is open ? */
>          if (be->dpcm[stream].users == DPCM_MAX_BE_USERS) {
>                  dev_err(be->dev, "ASoC: too many users %s at open %d\n",
>                          stream ? "capture" : "playback",
>                          be->dpcm[stream].state);
>                  continue;
>          }
>
> The comment is no longer aligned with the code, wondering if this is a
> feature or a bug.

Looks like the comment is misplaced and the intention might have been to 
place it like below?

diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index e30cb5a..5cb5019 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1508,7 +1508,6 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime 
*fe, int stream)
                 if (!snd_soc_dpcm_be_can_update(fe, be, stream))
                         continue;

-               /* first time the dpcm is open ? */
                 if (be->dpcm[stream].users == DPCM_MAX_BE_USERS) {
                         dev_err(be->dev, "ASoC: too many users %s at 
open %d\n",
                                 stream ? "capture" : "playback",
@@ -1516,6 +1515,7 @@ int dpcm_be_dai_startup(struct snd_soc_pcm_runtime 
*fe, int stream)
                         continue;
                 }

+               /* first time the dpcm is open ? */
                 if (be->dpcm[stream].users++ != 0)
                         continue;

>   There's no reason to arbitrarily restrict the number
> of users of a BE, or the check would need to use platform-specific
> information such as the number of inputs/outputs supported by a mixer/demux.
>
> Maybe Morimoto-san can comment since this was added in:
>
> 1db19c151819 ('ASoC: soc-pcm: fixup dpcm_be_dai_startup() user count')
>
> We're not done with soc-pcm.c cleanups :-)
diff mbox series

Patch

diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c
index 48f71bb..e30cb5a 100644
--- a/sound/soc/soc-pcm.c
+++ b/sound/soc/soc-pcm.c
@@ -1395,6 +1395,10 @@  static int dpcm_add_paths(struct snd_soc_pcm_runtime *fe, int stream,
 		if (!fe->dpcm[stream].runtime && !fe->fe_compr)
 			continue;
 
+		if ((be->dpcm[stream].state != SND_SOC_DPCM_STATE_NEW) &&
+		    (be->dpcm[stream].state != SND_SOC_DPCM_STATE_CLOSE))
+			continue;
+
 		/* newly connected FE and BE */
 		err = dpcm_be_connect(fe, be, stream);
 		if (err < 0) {