mbox series

[v4,0/3] Introduce Protection Domain Restart (PDR) Helpers

Message ID 20200226170001.24234-1-sibis@codeaurora.org
Headers show
Series Introduce Protection Domain Restart (PDR) Helpers | expand

Message

Sibi Sankar Feb. 26, 2020, 4:59 p.m. UTC
Qualcomm SoCs (starting with MSM8998) allow for multiple protection
domains (PDs) to run on the same Q6 sub-system. This allows for
services like AVS AUDIO to have their own separate address space and
crash/recover without disrupting the other PDs running on the same Q6
ADSP. This patch series introduces pdr helper library and adds PD
tracking functionality for "avs/audio" allowing apr services to register
themselves asynchronously once the dependent PDs are up.

V4:
 * Fixup dt bindings and examples. [Rob]
 * Dropping r-b from Bjorn/Srini for the dt bindings.
 * Privatize pdr_service/pdr_handle. [Srini/Bjorn]
 * Introduce notifier_init_complete to deal with cases where
   qmi_handle_net_reset is not enough to reset the port. This
   is to deal with cases where qrtr-ns starts after the adsp.
 * Introduce per addr per pds to deal with multiple service_paths
   per pdr_handle.
 * Uniformly rename servreg -> notifier, servloc -> locator. [Narendra]
 * Drop pdr_servreg_link_create tracking the service_path tracks
   all pds associated with it. [Bjorn]
 * Remove safe traversal for all cases where list is left
   unmodified. [Bjorn]
 * Address review comments in the apr driver and add comments. [Bjorn]
 * Other misc fixes. [Narendra/Bjorn]

V3:
 * patches 2 and 3 remain unchanged.
 * reset servloc_addr/servreg_addr.
 * fixup the helpers to handle servloc_work/servreg_work asynchronously.
 * fixup useage of list_lock across traversals, insertions and deletions.
 * fixup the helpers to use a single lookup list.
 * skip waiting for response on ind_ack send.
 * introduce pdr_servreg_link_create to re-use existing qmi connection to
   servreg instances. This helps tracking PDs running on the same remote
   processor.
 * have a per node curr_state in pdr_list_node to preserve all state
   updates during indack_cb.
 * introduce additional servreg_service_state values to help the client
   distinguish between a fatal and non-fatal pdr_lookup errors.
 * re-order pdr_handle_release sequence.
 * fixup "!ind_msg->service_path returns true always" warning.
 * fixup comments.

V2:
 * fixup pd_status callback to return void.
 * return 0 from pdr_get_domain_list on success.
 * introduce status_lock to linearize the pd_status reported back
   to the clients.
 * use the correct service name length across various string operations
   performed.
 * service locator will now schedule the pending lookups registered
   when it comes up.
 * other minor cleanups that Bjorn suggested.
 * use pr_warn to indicate that the wait for service locator timed
   out.
 * add Bjorn/Srini's "Reviewed-by" for the dt-bindings.

Sibi Sankar (3):
  soc: qcom: Introduce Protection Domain Restart helpers
  dt-bindings: soc: qcom: apr: Add protection domain bindings
  soc: qcom: apr: Add avs/audio tracking functionality

 .../devicetree/bindings/soc/qcom/qcom,apr.txt |  50 ++
 drivers/soc/qcom/Kconfig                      |   6 +
 drivers/soc/qcom/Makefile                     |   1 +
 drivers/soc/qcom/apr.c                        | 123 ++-
 drivers/soc/qcom/pdr_interface.c              | 768 ++++++++++++++++++
 drivers/soc/qcom/pdr_internal.h               | 379 +++++++++
 include/linux/soc/qcom/apr.h                  |   1 +
 include/linux/soc/qcom/pdr.h                  |  30 +
 include/linux/soc/qcom/qmi.h                  |   1 +
 9 files changed, 1350 insertions(+), 9 deletions(-)
 create mode 100644 drivers/soc/qcom/pdr_interface.c
 create mode 100644 drivers/soc/qcom/pdr_internal.h
 create mode 100644 include/linux/soc/qcom/pdr.h

Comments

Bjorn Andersson Feb. 28, 2020, 5:34 a.m. UTC | #1
On Wed 26 Feb 08:59 PST 2020, Sibi Sankar wrote:
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
[..]
> +config QCOM_PDR_HELPERS
> +	tristate
> +	depends on ARCH_QCOM || COMPILE_TEST

As discussed on one of you other patches, please omit the depends on for
Kconfig entries that are not user selectable. Presumably anyone
selecting this option will have ARCH_QCOM met already.

> +	select QCOM_QMI_HELPERS
[..]
> diff --git a/drivers/soc/qcom/pdr_interface.c b/drivers/soc/qcom/pdr_interface.c
[..]
> +static void pdr_locator_work(struct work_struct *work)
> +{
> +	struct pdr_handle *pdr = container_of(work, struct pdr_handle,
> +					      locator_work);
> +	struct pdr_service *pds;
> +	int ret = 0;
> +
> +	/* Bail out early if the SERVREG LOCATOR QMI service is not up */
> +	mutex_lock(&pdr->lock);
> +	if (!pdr->locator_init_complete) {
> +		mutex_unlock(&pdr->lock);
> +		pr_debug("PDR: SERVICE LOCATOR service not available\n");
> +		return;
> +	}
> +	mutex_unlock(&pdr->lock);
> +
> +	mutex_lock(&pdr->list_lock);
> +	list_for_each_entry(pds, &pdr->lookups, node) {
> +		if (!pds->need_locator_lookup)
> +			continue;
> +
> +		pds->need_locator_lookup = false;
> +		mutex_unlock(&pdr->list_lock);
> +
> +		ret = pdr_locate_service(pdr, pds);
> +		if (ret < 0)
> +			goto exit;
> +
> +		/* Initialize notifier QMI handle */
> +		mutex_lock(&pdr->lock);
> +		if (!pdr->notifier_init_complete) {
> +			ret = qmi_handle_init(&pdr->notifier_hdl,
> +					      SERVREG_STATE_UPDATED_IND_MAX_LEN,
> +					      &pdr_notifier_ops,
> +					      qmi_indication_handler);
> +			if (ret < 0) {
> +				mutex_unlock(&pdr->lock);
> +				goto exit;
> +			}
> +			pdr->notifier_init_complete = true;
> +		}
> +		mutex_unlock(&pdr->lock);
> +
> +		ret = qmi_add_lookup(&pdr->notifier_hdl, pds->service, 1,
> +				     pds->instance);
> +		if (ret < 0)
> +			goto exit;
> +
> +		return;

If the caller calls pdr_add_lookup() multiple times in quick succession
wouldn't it be possile to get the worker scheduled with multiple entries
in &pdr->lookups with need_locator_lookup set?

If so I think it makes sense to break the content of this loop, and the
error handling under exit out into a separate function.

And even if this would not be the case, breaking this out in a separate
function would allow you to change the loop to:

	list_for_each_entry() {
		if (pdr->need_locator_lookup) {
			do_the_lookup();
			break;
		}
	}

Which I think is easier to reason about than the loop with a return at
the end.

> +	}
> +	mutex_unlock(&pdr->list_lock);
> +exit:
> +	if (ret < 0) {
> +		/* Notify lookup failed */
> +		mutex_lock(&pdr->list_lock);
> +		list_del(&pds->node);
> +		mutex_unlock(&pdr->list_lock);
> +
> +		if (ret == -ENXIO)
> +			pds->state = SERVREG_LOCATOR_UNKNOWN_SERVICE;
> +		else
> +			pds->state = SERVREG_LOCATOR_ERR;
> +
> +		pr_err("PDR: service lookup for %s failed: %d\n",
> +		       pds->service_name, ret);
> +
> +		mutex_lock(&pdr->status_lock);
> +		pdr->status(pds->state, pds->service_path, pdr->priv);
> +		mutex_unlock(&pdr->status_lock);
> +		kfree(pds);
> +	}
> +}
[..]
> +struct pdr_handle *pdr_handle_alloc(void (*status)(int state,
> +						   char *service_path,
> +						   void *priv), void *priv)
> +{
> +	struct pdr_handle *pdr;
> +	int ret;
> +
> +	if (!status)
> +		return ERR_PTR(-EINVAL);
> +
> +	pdr = kzalloc(sizeof(*pdr), GFP_KERNEL);
> +	if (!pdr)
> +		return ERR_PTR(-ENOMEM);
> +
> +	pdr->status = *status;

Please omit the * here.

Regards,
Bjorn
Bjorn Andersson Feb. 28, 2020, 5:38 a.m. UTC | #2
On Wed 26 Feb 09:00 PST 2020, Sibi Sankar wrote:

> Use PDR helper functions to track the protection domains that the apr
> services are dependent upon on SDM845 SoC, specifically the "avs/audio"
> service running on ADSP Q6.
> 
> Signed-off-by: Sibi Sankar <sibis@codeaurora.org>

Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org>

> ---

Please do include a changelog as you respin your patches.

Regards,
Bjorn

>  drivers/soc/qcom/Kconfig     |   1 +
>  drivers/soc/qcom/apr.c       | 123 ++++++++++++++++++++++++++++++++---
>  include/linux/soc/qcom/apr.h |   1 +
>  3 files changed, 116 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/soc/qcom/Kconfig b/drivers/soc/qcom/Kconfig
> index cca6a43e771d9..57000f1615ada 100644
> --- a/drivers/soc/qcom/Kconfig
> +++ b/drivers/soc/qcom/Kconfig
> @@ -202,6 +202,7 @@ config QCOM_APR
>  	tristate "Qualcomm APR Bus (Asynchronous Packet Router)"
>  	depends on ARCH_QCOM || COMPILE_TEST
>  	depends on RPMSG
> +	select QCOM_PDR_HELPERS
>  	help
>  	  Enable APR IPC protocol support between
>  	  application processor and QDSP6. APR is
> diff --git a/drivers/soc/qcom/apr.c b/drivers/soc/qcom/apr.c
> index 4fcc32420c474..1f35b097c6356 100644
> --- a/drivers/soc/qcom/apr.c
> +++ b/drivers/soc/qcom/apr.c
> @@ -11,6 +11,7 @@
>  #include <linux/workqueue.h>
>  #include <linux/of_device.h>
>  #include <linux/soc/qcom/apr.h>
> +#include <linux/soc/qcom/pdr.h>
>  #include <linux/rpmsg.h>
>  #include <linux/of.h>
>  
> @@ -21,6 +22,7 @@ struct apr {
>  	spinlock_t rx_lock;
>  	struct idr svcs_idr;
>  	int dest_domain_id;
> +	struct pdr_handle *pdr;
>  	struct workqueue_struct *rxwq;
>  	struct work_struct rx_work;
>  	struct list_head rx_list;
> @@ -289,6 +291,9 @@ static int apr_add_device(struct device *dev, struct device_node *np,
>  		  id->svc_id + 1, GFP_ATOMIC);
>  	spin_unlock(&apr->svcs_lock);
>  
> +	of_property_read_string_index(np, "qcom,protection-domain",
> +				      1, &adev->service_path);
> +
>  	dev_info(dev, "Adding APR dev: %s\n", dev_name(&adev->dev));
>  
>  	ret = device_register(&adev->dev);
> @@ -300,14 +305,75 @@ static int apr_add_device(struct device *dev, struct device_node *np,
>  	return ret;
>  }
>  
> -static void of_register_apr_devices(struct device *dev)
> +static int of_apr_add_pd_lookups(struct device *dev)
> +{
> +	const char *service_name, *service_path;
> +	struct apr *apr = dev_get_drvdata(dev);
> +	struct device_node *node;
> +	struct pdr_service *pds;
> +	int ret;
> +
> +	for_each_child_of_node(dev->of_node, node) {
> +		ret = of_property_read_string_index(node, "qcom,protection-domain",
> +						    0, &service_name);
> +		if (ret < 0)
> +			continue;
> +
> +		ret = of_property_read_string_index(node, "qcom,protection-domain",
> +						    1, &service_path);
> +		if (ret < 0) {
> +			dev_err(dev, "pdr service path missing: %d\n", ret);
> +			return ret;
> +		}
> +
> +		pds = pdr_add_lookup(apr->pdr, service_name, service_path);
> +		if (IS_ERR(pds) && PTR_ERR(pds) != -EALREADY) {
> +			dev_err(dev, "pdr add lookup failed: %d\n", ret);
> +			return PTR_ERR(pds);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static void of_register_apr_devices(struct device *dev, const char *svc_path)
>  {
>  	struct apr *apr = dev_get_drvdata(dev);
>  	struct device_node *node;
> +	const char *service_path;
> +	int ret;
>  
>  	for_each_child_of_node(dev->of_node, node) {
>  		struct apr_device_id id = { {0} };
>  
> +		/*
> +		 * This function is called with svc_path NULL during
> +		 * apr_probe(), in which case we register any apr devices
> +		 * without a qcom,protection-domain specified.
> +		 *
> +		 * Then as the protection domains becomes available
> +		 * (if applicable) this function is again called, but with
> +		 * svc_path representing the service becoming available. In
> +		 * this case we register any apr devices with a matching
> +		 * qcom,protection-domain.
> +		 */
> +
> +		ret = of_property_read_string_index(node, "qcom,protection-domain",
> +						    1, &service_path);
> +		if (svc_path) {
> +			/* skip APR services that are PD independent */
> +			if (ret)
> +				continue;
> +
> +			/* skip APR services whose PD paths don't match */
> +			if (strcmp(service_path, svc_path))
> +				continue;
> +		} else {
> +			/* skip APR services whose PD lookups are registered */
> +			if (ret == 0)
> +				continue;
> +		}
> +
>  		if (of_property_read_u32(node, "reg", &id.svc_id))
>  			continue;
>  
> @@ -318,6 +384,34 @@ static void of_register_apr_devices(struct device *dev)
>  	}
>  }
>  
> +static int apr_remove_device(struct device *dev, void *svc_path)
> +{
> +	struct apr_device *adev = to_apr_device(dev);
> +
> +	if (svc_path && adev->service_path) {
> +		if (!strcmp(adev->service_path, (char *)svc_path))
> +			device_unregister(&adev->dev);
> +	} else {
> +		device_unregister(&adev->dev);
> +	}
> +
> +	return 0;
> +}
> +
> +static void apr_pd_status(int state, char *svc_path, void *priv)
> +{
> +	struct apr *apr = (struct apr *)priv;
> +
> +	switch (state) {
> +	case SERVREG_SERVICE_STATE_UP:
> +		of_register_apr_devices(apr->dev, svc_path);
> +		break;
> +	case SERVREG_SERVICE_STATE_DOWN:
> +		device_for_each_child(apr->dev, svc_path, apr_remove_device);
> +		break;
> +	}
> +}
> +
>  static int apr_probe(struct rpmsg_device *rpdev)
>  {
>  	struct device *dev = &rpdev->dev;
> @@ -343,28 +437,39 @@ static int apr_probe(struct rpmsg_device *rpdev)
>  		return -ENOMEM;
>  	}
>  	INIT_WORK(&apr->rx_work, apr_rxwq);
> +
> +	apr->pdr = pdr_handle_alloc(apr_pd_status, apr);
> +	if (IS_ERR(apr->pdr)) {
> +		dev_err(dev, "Failed to init PDR handle\n");
> +		ret = PTR_ERR(apr->pdr);
> +		goto destroy_wq;
> +	}
> +
>  	INIT_LIST_HEAD(&apr->rx_list);
>  	spin_lock_init(&apr->rx_lock);
>  	spin_lock_init(&apr->svcs_lock);
>  	idr_init(&apr->svcs_idr);
> -	of_register_apr_devices(dev);
> -
> -	return 0;
> -}
>  
> -static int apr_remove_device(struct device *dev, void *null)
> -{
> -	struct apr_device *adev = to_apr_device(dev);
> +	ret = of_apr_add_pd_lookups(dev);
> +	if (ret)
> +		goto handle_release;
>  
> -	device_unregister(&adev->dev);
> +	of_register_apr_devices(dev, NULL);
>  
>  	return 0;
> +
> +handle_release:
> +	pdr_handle_release(apr->pdr);
> +destroy_wq:
> +	destroy_workqueue(apr->rxwq);
> +	return ret;
>  }
>  
>  static void apr_remove(struct rpmsg_device *rpdev)
>  {
>  	struct apr *apr = dev_get_drvdata(&rpdev->dev);
>  
> +	pdr_handle_release(apr->pdr);
>  	device_for_each_child(&rpdev->dev, NULL, apr_remove_device);
>  	flush_workqueue(apr->rxwq);
>  	destroy_workqueue(apr->rxwq);
> diff --git a/include/linux/soc/qcom/apr.h b/include/linux/soc/qcom/apr.h
> index c5d52e2cb275f..7f0bc3cf4d610 100644
> --- a/include/linux/soc/qcom/apr.h
> +++ b/include/linux/soc/qcom/apr.h
> @@ -85,6 +85,7 @@ struct apr_device {
>  	uint16_t	domain_id;
>  	uint32_t	version;
>  	char name[APR_NAME_SIZE];
> +	const char *service_path;
>  	spinlock_t	lock;
>  	struct list_head node;
>  };
> -- 
> The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
> a Linux Foundation Collaborative Project