From patchwork Mon Apr 27 14:56:40 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gautham R Shenoy
X-Patchwork-Id: 1278140
From: "Gautham R. Shenoy"
To: skiboot@lists.ozlabs.org, Vaidyanathan Srinivasan, Nicholas Piggin,
 Frederic Barrat, "Oliver O'Halloran", Vasant Hegde
Date: Mon, 27 Apr 2020 20:26:40 +0530
Message-Id: <1587999401-3719-2-git-send-email-ego@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1587999401-3719-1-git-send-email-ego@linux.vnet.ibm.com>
References: <1587999401-3719-1-git-send-email-ego@linux.vnet.ibm.com>
Subject: [Skiboot-stable] [PATCH v2 1/2] sensors: occ: Fix the GPU detection
 code
List-Id: "Patches, review, and discussion for stable releases of skiboot"
Cc: "Gautham R. Shenoy", skiboot-stable@lists.ozlabs.org

From: "Gautham R. Shenoy"

commit bebe096ee242 ("sensors: occ: Skip GPU sensors for non-gpu
systems") assumes that presence of "ibm,power9-npu" compatible node
indicates the presence of GPUs. However this is incorrect, as even
OpenCAPI is supported via NPU. Thus ZZ systems, which have OpenCAPI
connectors but not GPUs will have "ibm,power9-npu" compatible nodes.
This results in OPAL creating device-tree entries for the GPU sensors
on ZZ systems which don't even have GPUs.

This patch fixes the GPU detection code in occ-sensors, by first
checking for "ibm,ioda2-npu2-phb" compatible node which indicates the
presence of nvlink. Only if such a node exists, do we check with the
OCC for presence of GPUs on systems to confirm the presence of the
GPU. Otherwise, we cut the GPU sensors.

Thanks to Frederic Barrat for suggesting "ibm,ioda2-npu2-phb" for
detecting the presence of nvlink GPUs.

cc: skiboot-stable@lists.ozlabs.org
Fixes: commit bebe096ee242 ("sensors: occ: Skip GPU sensors for non-gpu systems")
Reported-by: Pavaman Subramaniyam
Tested-by: Pavaman Subramaniyam
Reviewed-by: Vaidyanathan Srinivasan
Reviewed-by: Frederic Barrat
Signed-off-by: Gautham R. Shenoy
---
Change from v1: Fixed max_gpus_per_chip = 3

 hw/occ-sensor.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/hw/occ-sensor.c b/hw/occ-sensor.c
index 524d00f..d97cc33 100644
--- a/hw/occ-sensor.c
+++ b/hw/occ-sensor.c
@@ -521,8 +521,26 @@ bool occ_sensors_init(void)
 	dt_add_property_cells(sg, "#address-cells", 1);
 	dt_add_property_cells(sg, "#size-cells", 0);
 
-	if (dt_find_compatible_node(dt_root, NULL, "ibm,power9-npu"))
-		has_gpu = true;
+	/*
+	 * On POWER9, ibm,ioda2-npu2-phb indicates the presence of a
+	 * GPU NVlink.
+	 */
+	if (dt_find_compatible_node(dt_root, NULL, "ibm,ioda2-npu2-phb")) {
+
+		for_each_chip(chip) {
+			int max_gpus_per_chip = 3, i;
+
+			for(i = 0; i < max_gpus_per_chip; i++) {
+				has_gpu = occ_get_gpu_presence(chip, i);
+
+				if (has_gpu)
+					break;
+			}
+
+			if (has_gpu)
+				break;
+		}
+	}
 
 	for_each_chip(chip) {
 		struct occ_sensor_data_header *hb;
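
For readers who want to try the two-stage check outside of skiboot, the
following is a minimal standalone sketch of the logic described in the
commit message. It is not skiboot code: struct mock_chip, struct
mock_system, has_compatible() and system_has_gpu() are hypothetical
stand-ins for the device tree lookup and the OCC presence query, and the
per-chip limit of 3 is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Per-chip GPU slot limit, as used in the patch (max_gpus_per_chip = 3). */
#define MAX_GPUS_PER_CHIP 3

/* Hypothetical stand-in for what the OCC would report for one chip. */
struct mock_chip {
	bool gpu_present[MAX_GPUS_PER_CHIP];
};

/* Hypothetical stand-in for the device tree plus the set of chips. */
struct mock_system {
	const char *const *compatibles;	/* "compatible" strings found in the tree */
	size_t nr_compatibles;
	const struct mock_chip *chips;
	size_t nr_chips;
};

/* Return true if any node in the mock tree carries the given compatible. */
static bool has_compatible(const struct mock_system *sys, const char *compat)
{
	for (size_t i = 0; i < sys->nr_compatibles; i++) {
		if (strcmp(sys->compatibles[i], compat) == 0)
			return true;
	}
	return false;
}

/*
 * Stage 1: require an NVLink PHB node; an NPU node alone is not enough,
 * because OpenCAPI is also supported via the NPU.
 * Stage 2: only then ask each chip whether any GPU slot is populated.
 */
static bool system_has_gpu(const struct mock_system *sys)
{
	if (!has_compatible(sys, "ibm,ioda2-npu2-phb"))
		return false;

	for (size_t c = 0; c < sys->nr_chips; c++) {
		for (int i = 0; i < MAX_GPUS_PER_CHIP; i++) {
			if (sys->chips[c].gpu_present[i])
				return true;
		}
	}
	return false;
}
```

With this sketch, a system that only exposes "ibm,power9-npu" is reported
as having no GPU regardless of what the mock OCC data says, and a system
with an "ibm,ioda2-npu2-phb" node is reported as having a GPU only if at
least one slot on some chip is populated.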