From patchwork Wed Feb 26 10:17:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vaidyanathan Srinivasan
X-Patchwork-Id: 1245751
From: Vaidyanathan Srinivasan
To: "Oliver O'Halloran"
Date: Wed, 26 Feb 2020 15:47:45 +0530
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20200226101752.122998-1-svaidy@linux.vnet.ibm.com>
References: <20200226101752.122998-1-svaidy@linux.vnet.ibm.com>
Message-Id: <20200226101752.122998-2-svaidy@linux.vnet.ibm.com>
Subject: [Skiboot] [PATCH v4 1/8] Add basic P9 fused core support
List-Id: Mailing list for skiboot development
Cc: skiboot@lists.ozlabs.org, Michael Neuling

From: Ryan Grimm

P9 cores can be configured into fused core mode, where two core chiplets
function as a single 8-threaded core.

So, in fused core mode, bump the thread count from four to eight in
boot_entry and set cpu_thread_count to eight in init_boot_cpu.

The HID, AMOR, TSCR and RPR registers require the first active thread on
each core chiplet to load the copy for that core chiplet. So, send
thread 1 of a fused core to init_shared_sprs in boot_entry.

The code checks for fused core mode in the core thread state register
and records it in a new is_fused_core field in struct cpu_thread. This
flag is checked when updating the HID and in XIVE code when setting the
special bar.

For XSCOM, the core ID is the non-fused EX. So, create macros to arrange
the bits. It's fairly verbose but somewhat readable.

This was tested on a P9 ZZ with 16 fused cores and ran HTX for over 24
hours.

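As an illustration of the XSCOM core numbering described above, here is
a minimal standalone C sketch of how a fused-core PIR maps to the normal
(small) core number. The bit layout follows the P9_PIR2FUSED* macros
this patch adds to include/chip.h. The function names
fused_pir_to_normal_core(), fused_pir_is_chiplet_primary() and
fused_pir_self_check() are hypothetical and are not part of skiboot, and
the primary-thread check assumes the thread index equals FusedThreadID.

```c
#include <assert.h>
#include <stdint.h>

/* Field extraction, mirroring the P9_PIR2FUSED* macros in this patch. */
#define PIR2FUSEDQUADID(pir)	(((pir) >> 4) & 0x7)
#define PIR2FUSEDCOREID(pir)	(((pir) >> 3) & 0x1)
#define PIR2FUSEDTHREADID(pir)	((pir) & 0x7)

/*
 * Hypothetical helper: normal (small) core number for a fused-core PIR.
 * Each quad holds four small cores and each fused core holds two, and
 * threads are interleaved, so the low thread bit selects the chiplet.
 */
uint32_t fused_pir_to_normal_core(uint32_t pir)
{
	return (PIR2FUSEDQUADID(pir) << 2) |
	       (PIR2FUSEDCOREID(pir) << 1) |
	       (PIR2FUSEDTHREADID(pir) & 1);
}

/*
 * Hypothetical helper: returns 1 for the first thread of each core
 * chiplet in a fused core (fused threads 0 and 1), otherwise 0.
 * Assumption: the thread index is the FusedThreadID field.
 */
int fused_pir_is_chiplet_primary(uint32_t pir)
{
	return PIR2FUSEDTHREADID(pir) <= 1;
}

/* Usage sketch: a few hand-computed mappings. */
void fused_pir_self_check(void)
{
	/* quad 0, fused core 0, threads 0 and 1 -> small cores 0 and 1 */
	assert(fused_pir_to_normal_core(0x00) == 0);
	assert(fused_pir_to_normal_core(0x01) == 1);
	/* quad 0, fused core 1, thread 3 -> small core 3 */
	assert(fused_pir_to_normal_core(0x0b) == 3);
	/* quad 1, fused core 0, thread 0 -> small core 4 */
	assert(fused_pir_to_normal_core(0x10) == 4);
	/* fused threads 0 and 1 are chiplet primaries, thread 2 is not */
	assert(fused_pir_is_chiplet_primary(0x08));
	assert(fused_pir_is_chiplet_primary(0x09));
	assert(!fused_pir_is_chiplet_primary(0x0a));
}
```

Only the low seven PIR bits are decoded here, so chip ID bits above
them do not affect the result.
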
Signed-off-by: Ryan Grimm
Signed-off-by: Benjamin Herrenschmidt
Signed-off-by: Michael Neuling
---
 asm/head.S               | 24 +++++++++++++++++++++---
 core/chip.c              | 15 +++++++++++----
 core/cpu.c               | 39 ++++++++++++++++++++++++++++++++++-----
 core/fast-reboot.c       |  2 +-
 hdata/test/hdata_to_dt.c |  9 ++++++++-
 hw/xive.c                |  2 +-
 include/chip.h           | 31 +++++++++++++++++++++++++++++++
 include/cpu.h            |  6 ++++++
 include/xscom.h          |  3 +++
 9 files changed, 116 insertions(+), 15 deletions(-)

diff --git a/asm/head.S b/asm/head.S
index b565f6c9..14615390 100644
--- a/asm/head.S
+++ b/asm/head.S
@@ -328,6 +328,7 @@ boot_offset:
  * r28 :  PVR
  * r27 :  DTB pointer (or NULL)
  * r26 :  PIR thread mask
+ * r25 :  P9 fused core flag
  */
 .global boot_entry
 boot_entry:
@@ -342,13 +343,21 @@ boot_entry:
 	cmpwi	cr0,%r3,PVR_TYPE_P8NVL
 	beq	2f
 	cmpwi	cr0,%r3,PVR_TYPE_P9
-	beq	1f
+	beq	3f
 	cmpwi	cr0,%r3,PVR_TYPE_P9P
-	beq	1f
+	beq	3f
 	attn		/* Unsupported CPU type... what do we do ? */
 	b	.	/* loop here, just in case attn is disabled */
 
-	/* P8 -> 8 threads */
+	/* Check for fused core and set flag */
+3:
+	li	%r3, 0x1e0
+	mtspr	SPR_SPRC, %r3
+	mfspr	%r3, SPR_SPRD
+	andi.	%r25, %r3, 1
+	beq	1f
+
+	/* P8 or P9 fused -> 8 threads */
 2:	li	%r26,7
 
 	/* Get our reloc offset into r30 */
@@ -374,6 +383,15 @@ boot_entry:
 #endif
 	mtmsrd	%r3,0
 
+	/* If fused, t1 is primary chiplet and must init shared sprs */
+	andi.	%r3,%r25,1
+	beq	not_fused
+
+	mfspr	%r31,SPR_PIR
+	andi.	%r3,%r31,1
+	bnel	init_shared_sprs
+
+not_fused:
 	/* Check our PIR, avoid threads */
 	mfspr	%r31,SPR_PIR
 	and.	%r0,%r31,%r26
diff --git a/core/chip.c b/core/chip.c
index 8afc6bb5..1e02244a 100644
--- a/core/chip.c
+++ b/core/chip.c
@@ -6,6 +6,7 @@
 #include
 #include
 #include
+#include
 
 static struct proc_chip *chips[MAX_CHIPS];
 enum proc_chip_quirks proc_chip_quirks;
@@ -23,7 +24,10 @@ uint32_t pir_to_chip_id(uint32_t pir)
 uint32_t pir_to_core_id(uint32_t pir)
 {
 	if (proc_gen == proc_gen_p9)
-		return P9_PIR2COREID(pir);
+		if (this_cpu()->is_fused_core)
+			return P9_PIRFUSED2NORMALCOREID(pir);
+		else
+			return P9_PIR2COREID(pir);
 	else if (proc_gen == proc_gen_p8)
 		return P8_PIR2COREID(pir);
 	else
@@ -32,9 +36,12 @@ uint32_t pir_to_core_id(uint32_t pir)
 
 uint32_t pir_to_thread_id(uint32_t pir)
 {
-	if (proc_gen == proc_gen_p9)
-		return P9_PIR2THREADID(pir);
-	else if (proc_gen == proc_gen_p8)
+	if (proc_gen == proc_gen_p9) {
+		if (this_cpu()->is_fused_core)
+			return P9_PIR2FUSEDTHREADID(pir);
+		else
+			return P9_PIR2THREADID(pir);
+	} else if (proc_gen == proc_gen_p8)
 		return P8_PIR2THREADID(pir);
 	else
 		assert(false);
diff --git a/core/cpu.c b/core/cpu.c
index d5b7d623..489cad56 100644
--- a/core/cpu.c
+++ b/core/cpu.c
@@ -913,6 +913,14 @@ void cpu_disable_all_threads(struct cpu_thread *cpu)
 	/* XXX Do something to actually stop the core */
 }
 
+static int is_fused_core (void)
+{
+	unsigned int core_thread_state;
+	mtspr(SPR_SPRC, 0x00000000000001e0ULL);
+	core_thread_state = mfspr(SPR_SPRD);
+	return core_thread_state & PPC_BIT(63);
+}
+
 static void init_cpu_thread(struct cpu_thread *t,
 			    enum cpu_thread_state state,
 			    unsigned int pir)
@@ -932,6 +940,7 @@ static void init_cpu_thread(struct cpu_thread *t,
 #ifdef STACK_CHECK_ENABLED
 	t->stack_bot_mark = LONG_MAX;
 #endif
+	t->is_fused_core = is_fused_core();
 	assert(pir == container_of(t, struct cpu_stack, cpu) - cpu_stacks);
 }
 
@@ -1016,14 +1025,16 @@ void init_boot_cpu(void)
 		      " (max %d threads/core)\n", cpu_thread_count);
 		break;
 	case proc_gen_p9:
-		cpu_thread_count = 4;
+		if (is_fused_core())
+			cpu_thread_count = 8;
+		else
+			cpu_thread_count = 4;
 		prlog(PR_INFO, "CPU: P9 generation processor"
 		      " (max %d threads/core)\n", cpu_thread_count);
 		break;
 	default:
 		prerror("CPU: Unknown PVR, assuming 1 thread\n");
 		cpu_thread_count = 1;
-		cpu_max_pir = mfspr(SPR_PIR);
 	}
 
 	if (is_power9n(pvr) && (PVR_VERS_MAJ(pvr) == 1)) {
@@ -1151,7 +1162,7 @@ void init_all_cpus(void)
 
 	/* Iterate all CPUs in the device-tree */
 	dt_for_each_child(cpus, cpu) {
-		unsigned int pir, server_no, chip_id;
+		unsigned int pir, server_no, chip_id, threads;
 		enum cpu_thread_state state;
 		const struct dt_property *p;
 		struct cpu_thread *t, *pt;
@@ -1179,6 +1190,14 @@ void init_all_cpus(void)
 		prlog(PR_INFO, "CPU: CPU from DT PIR=0x%04x Server#=0x%x"
 		      " State=%d\n", pir, server_no, state);
 
+		/* Check max PIR */
+		if (cpu_max_pir < (pir + cpu_thread_count - 1)) {
+			prlog(PR_WARNING, "CPU: CPU potentially out of range"
+			      " PIR=0x%04x MAX=0x%04x !\n",
+			      pir, cpu_max_pir);
+			continue;
+		}
+
 		/* Setup thread 0 */
 		assert(pir <= cpu_max_pir);
 		t = pt = &cpu_stacks[pir].cpu;
@@ -1204,11 +1223,21 @@ void init_all_cpus(void)
 		/* Add the decrementer width property */
 		dt_add_property_cells(cpu, "ibm,dec-bits", dec_bits);
 
+		if (t->is_fused_core)
+			dt_add_property(t->node, "ibm,fused-core", NULL, 0);
+
 		/* Iterate threads */
 		p = dt_find_property(cpu, "ibm,ppc-interrupt-server#s");
 		if (!p)
 			continue;
-		for (thread = 1; thread < (p->len / 4); thread++) {
+		threads = p->len / 4;
+		if (threads > cpu_thread_count) {
+			prlog(PR_WARNING, "CPU: Threads out of range for PIR 0x%04x"
+			      " threads=%d max=%d\n",
+			      pir, threads, cpu_thread_count);
+			threads = cpu_thread_count;
+		}
+		for (thread = 1; thread < threads; thread++) {
 			prlog(PR_TRACE, "CPU: secondary thread %d found\n",
 			      thread);
 			t = &cpu_stacks[pir + thread].cpu;
@@ -1394,7 +1423,7 @@ static int64_t cpu_change_all_hid0(struct hid0_change_req *req)
 	assert(jobs);
 
 	for_each_available_cpu(cpu) {
-		if (!cpu_is_thread0(cpu))
+		if (!cpu_is_thread0(cpu) && !cpu_is_core_chiplet_primary(cpu))
 			continue;
 		if (cpu == this_cpu())
 			continue;
diff --git a/core/fast-reboot.c b/core/fast-reboot.c
index 410acfe6..8ce3ae6a 100644
--- a/core/fast-reboot.c
+++ b/core/fast-reboot.c
@@ -227,7 +227,7 @@ static void cleanup_cpu_state(void)
 	struct cpu_thread *cpu = this_cpu();
 
 	/* Per core cleanup */
-	if (cpu_is_thread0(cpu)) {
+	if (cpu_is_thread0(cpu) || cpu_is_core_chiplet_primary(cpu)) {
 		/* Shared SPRs whacked back to normal */
 
 		/* XXX Update the SLW copies ! Also dbl check HIDs etc... */
diff --git a/hdata/test/hdata_to_dt.c b/hdata/test/hdata_to_dt.c
index 11b7a3ac..bafdb90d 100644
--- a/hdata/test/hdata_to_dt.c
+++ b/hdata/test/hdata_to_dt.c
@@ -38,7 +38,11 @@ struct spira_ntuple;
 static void *ntuple_addr(const struct spira_ntuple *n);
 
 /* Stuff which core expects. */
-#define __this_cpu ((struct cpu_thread *)NULL)
+struct cpu_thread *my_fake_cpu;
+static struct cpu_thread *this_cpu(void)
+{
+	return my_fake_cpu;
+}
 
 unsigned long tb_hz = 512000000;
 
@@ -74,6 +78,7 @@ unsigned long tb_hz = 512000000;
 struct cpu_thread {
 	uint32_t			pir;
 	uint32_t			chip_id;
+	bool				is_fused_core;
 };
 struct cpu_job *__cpu_queue_job(struct cpu_thread *cpu,
 				const char *name,
@@ -95,6 +100,8 @@ static inline struct cpu_job *cpu_queue_job(struct cpu_thread *cpu,
 struct cpu_thread __boot_cpu, *boot_cpu = &__boot_cpu;
 static unsigned long fake_pvr = PVR_P8;
 
+unsigned int cpu_thread_count = 8;
+
 static inline unsigned long mfspr(unsigned int spr)
 {
 	assert(spr == SPR_PVR);
diff --git a/hw/xive.c b/hw/xive.c
index 41575dae..78b8ab3a 100644
--- a/hw/xive.c
+++ b/hw/xive.c
@@ -3048,7 +3048,7 @@ static void xive_init_cpu(struct cpu_thread *c)
 	 * of a pair is present we just do the setup for each of them, which
 	 * is harmless.
 	 */
-	if (cpu_is_thread0(c))
+	if (cpu_is_thread0(c) || cpu_is_core_chiplet_primary(c))
 		xive_configure_ex_special_bar(x, c);
 
 	/* Initialize the state structure */
diff --git a/include/chip.h b/include/chip.h
index f14e78b3..066e37ad 100644
--- a/include/chip.h
+++ b/include/chip.h
@@ -56,6 +56,26 @@
  * thus we have a 6-bit core number.
  *
  * Note: XIVE Only supports 4-bit chip numbers ...
+ *
+ * Upper PIR Bits
+ * --------------
+ *
+ * Normal-Core Mode:
+ * 57:61  CoreID
+ * 62:63  ThreadID
+ *
+ * Fused-Core Mode:
+ * 57:59  FusedQuadID
+ * 60     FusedCoreID
+ * 61:63  FusedThreadID
+ *
+ * FusedCoreID 0 contains normal-core chiplet 0 and 1
+ * FusedCoreID 1 contains normal-core chiplet 2 and 3
+ *
+ * Fused cores have interleaved threads:
+ * core chiplet 0/2 = t0, t2, t4, t6
+ * core chiplet 1/3 = t1, t3, t5, t7
+ *
  */
 
 #define P9_PIR2GCID(pir) (((pir) >> 8) & 0x7f)
@@ -67,6 +87,17 @@
 
 #define P9_GCID2CHIPID(gcid) ((gcid) & 0x7)
 
+#define P9_PIR2FUSEDQUADID(pir) (((pir) >> 4) & 0x7)
+
+#define P9_PIR2FUSEDCOREID(pir) (((pir) >> 3) & 0x1)
+
+#define P9_PIR2FUSEDTHREADID(pir) ((pir) & 0x7)
+
+#define P9_PIRFUSED2NORMALCOREID(pir) \
+	((P9_PIR2FUSEDQUADID(pir) << 2) | \
+	 (P9_PIR2FUSEDCOREID(pir) << 1) | \
+	 (P9_PIR2FUSEDTHREADID(pir) & 1))
+
 /* P9 specific ones mostly used by XIVE */
 #define P9_PIR2LOCALCPU(pir) ((pir) & 0xff)
 #define P9_PIRFROMLOCALCPU(chip, cpu)	(((chip) << 8) | (cpu))
diff --git a/include/cpu.h b/include/cpu.h
index 686310d7..05bd0941 100644
--- a/include/cpu.h
+++ b/include/cpu.h
@@ -41,6 +41,7 @@ struct cpu_thread {
 	uint32_t			server_no;
 	uint32_t			chip_id;
 	bool				is_secondary;
+	bool				is_fused_core;
 	struct cpu_thread		*primary;
 	enum cpu_thread_state		state;
 	struct dt_node			*node;
@@ -238,6 +239,11 @@ static inline bool cpu_is_thread0(struct cpu_thread *cpu)
 	return cpu->primary == cpu;
 }
 
+static inline bool cpu_is_core_chiplet_primary(struct cpu_thread *cpu)
+{
+	return cpu->is_fused_core && (cpu_get_thread_index(cpu) == 1);
+}
+
 static inline bool cpu_is_sibling(struct cpu_thread *cpu1,
 				  struct cpu_thread *cpu2)
 {
diff --git a/include/xscom.h b/include/xscom.h
index 8a466d56..76eea9ac 100644
--- a/include/xscom.h
+++ b/include/xscom.h
@@ -110,6 +110,9 @@
 
 /*
  * Additional useful definitions for P9
+ *
+ * Note: In all of these, the core numbering is the
+ * *normal* (small) core number.
  */
 
 /*