From patchwork Wed Jan 15 12:52:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jean-Philippe Brucker X-Patchwork-Id: 1223462 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (no SPF record) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=linux-pci-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=linaro.org Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=linaro.org header.i=@linaro.org header.a=rsa-sha256 header.s=google header.b=N5YUZNbi; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 47yS1f6Qrxz9sRm for ; Wed, 15 Jan 2020 23:53:42 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729049AbgAOMxi (ORCPT ); Wed, 15 Jan 2020 07:53:38 -0500 Received: from mail-wr1-f66.google.com ([209.85.221.66]:41055 "EHLO mail-wr1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726474AbgAOMxi (ORCPT ); Wed, 15 Jan 2020 07:53:38 -0500 Received: by mail-wr1-f66.google.com with SMTP id c9so15629409wrw.8 for ; Wed, 15 Jan 2020 04:53:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4bZ/CepjtRIujasbtp83zulcnzN4LuxoEHojLEWXBkU=; b=N5YUZNbidCd+0mRzNLc+mAf25KsWhHqQsu6E/BD8m6TCk1+jFs4VYYA9pSDDDBWId4 Y5hApgNqdB2zC57TbPT6FWU3AmGXIiVAnflDFar9Vj/5v3nxG0RgzdIJ1hqJv13ymIyH xuIkuGjlQJGe/xXOKWEDD9Qt4OZ3IdYsJfxr4Xg2X39vpP+1/UP68dmEs0vkR8U5EAZE 1Cmoyvl9ipEHHe+paOdzaYxtCyOLtlOSb8OkZvzriI0mbrTKq0wH3IFz42TCEKLRnxp6 Niz9VcbUIE5HnEsIaK0Y2MD0lCcKevC/g6R+3zJCSnfG+V5zIYecAb8eB+Wf4/XxEOMY mhzQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4bZ/CepjtRIujasbtp83zulcnzN4LuxoEHojLEWXBkU=; b=pgZ++jRhZbpEVrBxU9nPZz48fBz0oeZ6aiGUIpr1/IAz8piupBow0go6QITeRrfFnD izHYd/PZYU5TP4i6yjBfqkRkCruNJtQ/nWWuFXFdO6B2BpU2FLPt68mLjiHaxnfU6rq9 59sPXdTSl+5J3hDo7P3mn6YUUYyZCzrB+3f3IcWWwFXqmWg7jRH2bqUwxte9MR1RBMmK ICOIQkJshLMfPXG4vv5NpKreG5uDni/SFfMo6M/1uoXBREDvT6wdRH2Ghc4S7WQwJ7pl 7BvenCwEI20oxdx1EJdL5889V3gDGI4X7Dc2zUyDtoAmeCAC8857NCkAdMBbfKGKtpoN Fq1Q== X-Gm-Message-State: APjAAAXODs23y5gcPHVscrXifQsNLvObs23qUA7NpuwKsEnFXOoK642P ZT/UCdfvOdRhLZdOeaqXy3MhhoPz5a8= X-Google-Smtp-Source: APXvYqzJswflwzFBqrDS/iyaXJeEyvdokU6gQySM/KfhP173tNwKFAxkpp71PBxmimx3BWyqBHC8LA== X-Received: by 2002:adf:ef0b:: with SMTP id e11mr30902781wro.128.1579092815640; Wed, 15 Jan 2020 04:53:35 -0800 (PST) Received: from localhost.localdomain ([2001:171b:2266:ba60:116c:c27a:3e7f:5eaf]) by smtp.gmail.com with ESMTPSA id d12sm25196171wrp.62.2020.01.15.04.53.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 15 Jan 2020 04:53:35 -0800 (PST) From: Jean-Philippe Brucker To: linux-pci@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-acpi@vger.kernel.org, devicetree@vger.kernel.org, iommu@lists.linux-foundation.org, will@kernel.org Cc: joro@8bytes.org, robh+dt@kernel.org, mark.rutland@arm.com, lorenzo.pieralisi@arm.com, guohanjun@huawei.com, sudeep.holla@arm.com, rjw@rjwysocki.net, lenb@kernel.org, robin.murphy@arm.com, bhelgaas@google.com, eric.auger@redhat.com, jonathan.cameron@huawei.com, zhangfei.gao@linaro.org Subject: [PATCH v5 07/13] iommu/arm-smmu-v3: Add support for Substream IDs Date: Wed, 15 Jan 2020 13:52:33 +0100 Message-Id: <20200115125239.136759-8-jean-philippe@linaro.org> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200115125239.136759-1-jean-philippe@linaro.org> References: <20200115125239.136759-1-jean-philippe@linaro.org> MIME-Version: 1.0 Sender: linux-pci-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-pci@vger.kernel.org At the moment, the SMMUv3 driver implements only one stage-1 or stage-2 page directory per device. However SMMUv3 allows more than one address space for some devices, by providing multiple stage-1 page directories. In addition to the Stream ID (SID), that identifies a device, we can now have Substream IDs (SSID) identifying an address space. In PCIe, SID is called Requester ID (RID) and SSID is called Process Address-Space ID (PASID). A complete stage-1 walk goes through the context descriptor table: Stream tables Ctx. Desc. tables Page tables +--------+ ,------->+-------+ ,------->+-------+ : : | : : | : : +--------+ | +-------+ | +-------+ SID->| STE |---' SSID->| CD |---' IOVA->| PTE |--> IPA +--------+ +-------+ +-------+ : : : : : : +--------+ +-------+ +-------+ Rewrite arm_smmu_write_ctx_desc() to modify context descriptor table entries. To keep things simple we only implement one level of context descriptor tables here, but as with stream and page tables, an SSID can be split to index multiple levels of tables. Signed-off-by: Jean-Philippe Brucker --- drivers/iommu/arm-smmu-v3.c | 135 ++++++++++++++++++++++++++++++------ 1 file changed, 112 insertions(+), 23 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index a3438e256bd0..6ab0e93518f6 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -227,6 +227,11 @@ #define STRTAB_STE_0_S1CTXPTR_MASK GENMASK_ULL(51, 6) #define STRTAB_STE_0_S1CDMAX GENMASK_ULL(63, 59) +#define STRTAB_STE_1_S1DSS GENMASK_ULL(1, 0) +#define STRTAB_STE_1_S1DSS_TERMINATE 0x0 +#define STRTAB_STE_1_S1DSS_BYPASS 0x1 +#define STRTAB_STE_1_S1DSS_SSID0 0x2 + #define STRTAB_STE_1_S1C_CACHE_NC 0UL #define STRTAB_STE_1_S1C_CACHE_WBRA 1UL #define STRTAB_STE_1_S1C_CACHE_WT 2UL @@ -329,6 +334,7 @@ #define CMDQ_PREFETCH_1_SIZE GENMASK_ULL(4, 0) #define CMDQ_PREFETCH_1_ADDR_MASK GENMASK_ULL(63, 12) +#define CMDQ_CFGI_0_SSID GENMASK_ULL(31, 12) #define CMDQ_CFGI_0_SID GENMASK_ULL(63, 32) #define CMDQ_CFGI_1_LEAF (1UL << 0) #define CMDQ_CFGI_1_RANGE GENMASK_ULL(4, 0) @@ -446,8 +452,11 @@ struct arm_smmu_cmdq_ent { #define CMDQ_OP_CFGI_STE 0x3 #define CMDQ_OP_CFGI_ALL 0x4 + #define CMDQ_OP_CFGI_CD 0x5 + #define CMDQ_OP_CFGI_CD_ALL 0x6 struct { u32 sid; + u32 ssid; union { bool leaf; u8 span; @@ -569,6 +578,7 @@ struct arm_smmu_ctx_desc_cfg { struct arm_smmu_s1_cfg { struct arm_smmu_ctx_desc_cfg cdcfg; struct arm_smmu_ctx_desc cd; + u8 s1fmt; u8 s1cdmax; }; @@ -861,10 +871,16 @@ static int arm_smmu_cmdq_build_cmd(u64 *cmd, struct arm_smmu_cmdq_ent *ent) cmd[1] |= FIELD_PREP(CMDQ_PREFETCH_1_SIZE, ent->prefetch.size); cmd[1] |= ent->prefetch.addr & CMDQ_PREFETCH_1_ADDR_MASK; break; + case CMDQ_OP_CFGI_CD: + cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SSID, ent->cfgi.ssid); + /* Fallthrough */ case CMDQ_OP_CFGI_STE: cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, ent->cfgi.sid); cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_LEAF, ent->cfgi.leaf); break; + case CMDQ_OP_CFGI_CD_ALL: + cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, ent->cfgi.sid); + break; case CMDQ_OP_CFGI_ALL: /* Cover the entire SID range */ cmd[1] |= FIELD_PREP(CMDQ_CFGI_1_RANGE, 31); @@ -1457,6 +1473,33 @@ static int arm_smmu_cmdq_issue_sync(struct arm_smmu_device *smmu) } /* Context descriptor manipulation functions */ +static void arm_smmu_sync_cd(struct arm_smmu_domain *smmu_domain, + int ssid, bool leaf) +{ + size_t i; + unsigned long flags; + struct arm_smmu_master *master; + struct arm_smmu_device *smmu = smmu_domain->smmu; + struct arm_smmu_cmdq_ent cmd = { + .opcode = CMDQ_OP_CFGI_CD, + .cfgi = { + .ssid = ssid, + .leaf = leaf, + }, + }; + + spin_lock_irqsave(&smmu_domain->devices_lock, flags); + list_for_each_entry(master, &smmu_domain->devices, domain_head) { + for (i = 0; i < master->num_sids; i++) { + cmd.cfgi.sid = master->sids[i]; + arm_smmu_cmdq_issue_cmd(smmu, &cmd); + } + } + spin_unlock_irqrestore(&smmu_domain->devices_lock, flags); + + arm_smmu_cmdq_issue_sync(smmu); +} + static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) { u64 val = 0; @@ -1474,34 +1517,74 @@ static u64 arm_smmu_cpu_tcr_to_cd(u64 tcr) return val; } -static void arm_smmu_write_ctx_desc(struct arm_smmu_device *smmu, - struct arm_smmu_s1_cfg *cfg) +static int arm_smmu_write_ctx_desc(struct arm_smmu_domain *smmu_domain, + int ssid, struct arm_smmu_ctx_desc *cd) { - u64 val; - __le64 *cdptr = cfg->cdcfg.cdtab; - /* - * We don't need to issue any invalidation here, as we'll invalidate - * the STE when installing the new entry anyway. + * This function handles the following cases: + * + * (1) Install primary CD, for normal DMA traffic (SSID = 0). + * (2) Install a secondary CD, for SID+SSID traffic. + * (3) Update ASID of a CD. Atomically write the first 64 bits of the + * CD, then invalidate the old entry and mappings. + * (4) Remove a secondary CD. */ - val = arm_smmu_cpu_tcr_to_cd(cfg->cd.tcr) | -#ifdef __BIG_ENDIAN - CTXDESC_CD_0_ENDI | -#endif - CTXDESC_CD_0_R | CTXDESC_CD_0_A | CTXDESC_CD_0_ASET | - CTXDESC_CD_0_AA64 | FIELD_PREP(CTXDESC_CD_0_ASID, cfg->cd.asid) | - CTXDESC_CD_0_V; + u64 val; + bool cd_live; + struct arm_smmu_device *smmu = smmu_domain->smmu; + __le64 *cdptr = smmu_domain->s1_cfg.cdcfg.cdtab + ssid * + CTXDESC_CD_DWORDS; + + val = le64_to_cpu(cdptr[0]); + cd_live = !!(val & CTXDESC_CD_0_V); + + if (!cd) { /* (4) */ + val = 0; + } else if (cd_live) { /* (3) */ + val &= ~CTXDESC_CD_0_ASID; + val |= FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid); + /* + * Until CD+TLB invalidation, both ASIDs may be used for tagging + * this substream's traffic + */ + } else { /* (1) and (2) */ + cdptr[1] = cpu_to_le64(cd->ttbr & CTXDESC_CD_1_TTB0_MASK); + cdptr[2] = 0; + cdptr[3] = cpu_to_le64(cd->mair); - /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ - if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE) - val |= CTXDESC_CD_0_S; + /* + * STE is live, and the SMMU might read dwords of this CD in any + * order. Ensure that it observes valid values before reading + * V=1. + */ + arm_smmu_sync_cd(smmu_domain, ssid, true); - cdptr[0] = cpu_to_le64(val); + val = arm_smmu_cpu_tcr_to_cd(cd->tcr) | +#ifdef __BIG_ENDIAN + CTXDESC_CD_0_ENDI | +#endif + CTXDESC_CD_0_R | CTXDESC_CD_0_A | CTXDESC_CD_0_ASET | + CTXDESC_CD_0_AA64 | + FIELD_PREP(CTXDESC_CD_0_ASID, cd->asid) | + CTXDESC_CD_0_V; - val = cfg->cd.ttbr & CTXDESC_CD_1_TTB0_MASK; - cdptr[1] = cpu_to_le64(val); + /* STALL_MODEL==0b10 && CD.S==0 is ILLEGAL */ + if (smmu->features & ARM_SMMU_FEAT_STALL_FORCE) + val |= CTXDESC_CD_0_S; + } - cdptr[3] = cpu_to_le64(cfg->cd.mair); + /* + * The SMMU accesses 64-bit values atomically. See IHI0070Ca 3.21.3 + * "Configuration structures and configuration invalidation completion" + * + * The size of single-copy atomic reads made by the SMMU is + * IMPLEMENTATION DEFINED but must be at least 64 bits. Any single + * field within an aligned 64-bit span of a structure can be altered + * without first making the structure invalid. + */ + WRITE_ONCE(cdptr[0], cpu_to_le64(val)); + arm_smmu_sync_cd(smmu_domain, ssid, true); + return 0; } static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) @@ -1511,6 +1594,8 @@ static int arm_smmu_alloc_cd_tables(struct arm_smmu_domain *smmu_domain) struct arm_smmu_s1_cfg *cfg = &smmu_domain->s1_cfg; struct arm_smmu_ctx_desc_cfg *cdcfg = &cfg->cdcfg; + cfg->s1fmt = STRTAB_STE_0_S1FMT_LINEAR; + cdcfg->num_l1_ents = 1UL << cfg->s1cdmax; l1size = cdcfg->num_l1_ents * (CTXDESC_CD_DWORDS << 3); cdcfg->cdtab = dmam_alloc_coherent(smmu->dev, l1size, &cdcfg->cdtab_dma, @@ -1652,6 +1737,7 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, if (s1_cfg) { BUG_ON(ste_live); dst[1] = cpu_to_le64( + FIELD_PREP(STRTAB_STE_1_S1DSS, STRTAB_STE_1_S1DSS_SSID0) | FIELD_PREP(STRTAB_STE_1_S1CIR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1COR, STRTAB_STE_1_S1C_CACHE_WBRA) | FIELD_PREP(STRTAB_STE_1_S1CSH, ARM_SMMU_SH_ISH) | @@ -1662,7 +1748,9 @@ static void arm_smmu_write_strtab_ent(struct arm_smmu_master *master, u32 sid, dst[1] |= cpu_to_le64(STRTAB_STE_1_S1STALLD); val |= (s1_cfg->cdcfg.cdtab_dma & STRTAB_STE_0_S1CTXPTR_MASK) | - FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS); + FIELD_PREP(STRTAB_STE_0_CFG, STRTAB_STE_0_CFG_S1_TRANS) | + FIELD_PREP(STRTAB_STE_0_S1CDMAX, s1_cfg->s1cdmax) | + FIELD_PREP(STRTAB_STE_0_S1FMT, s1_cfg->s1fmt); } if (s2_cfg) { @@ -2467,7 +2555,8 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev) master->ats_enabled = arm_smmu_ats_supported(master); if (smmu_domain->stage == ARM_SMMU_DOMAIN_S1) - arm_smmu_write_ctx_desc(smmu, &smmu_domain->s1_cfg); + arm_smmu_write_ctx_desc(smmu_domain, 0, + &smmu_domain->s1_cfg.cd); arm_smmu_install_ste_for_dev(master);