From patchwork Tue Apr 2 14:14:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Thompson
X-Patchwork-Id: 1918854
From: David Thompson
Subject: [SRU][J:linux-bluefield][PATCH v1 1/1] mlxbf_gige: stop interface during shutdown
Date: Tue, 2 Apr 2024 10:14:31 -0400
X-Mailer: git-send-email 2.30.1
List-Id: Kernel team discussions
Cc:
 asmaa@nvidia.com, davthompson@nvidia.com

BugLink: https://bugs.launchpad.net/bugs/2059951

The mlxbf_gige driver intermittently encounters a NULL pointer
exception while the system is shutting down via the "reboot" command.
The mlxbf_gige driver will experience an exception right after
executing its shutdown() method. One example of this exception is:

Unable to handle kernel NULL pointer dereference at virtual address 0000000000000070
Mem abort info:
  ESR = 0x0000000096000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
user pgtable: 4k pages, 48-bit VAs, pgdp=000000011d373000
[0000000000000070] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000004 [#1] SMP
CPU: 0 PID: 13 Comm: ksoftirqd/0 Tainted: G S OE 5.15.0-bf.6.gef6992a #1
Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS 4.0.2.12669 Apr 21 2023
pstate: 20400009 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
lr : mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
sp : ffff8000080d3c10
x29: ffff8000080d3c10 x28: ffffcce72cbb7000 x27: ffff8000080d3d58
x26: ffff0000814e7340 x25: ffff331cd1a05000 x24: ffffcce72c4ea008
x23: ffff0000814e4b40 x22: ffff0000814e4d10 x21: ffff0000814e4128
x20: 0000000000000000 x19: ffff0000814e4a80 x18: ffffffffffffffff
x17: 000000000000001c x16: ffffcce72b4553f4 x15: ffff80008805b8a7
x14: 0000000000000000 x13: 0000000000000030 x12: 0101010101010101
x11: 7f7f7f7f7f7f7f7f x10: c2ac898b17576267 x9 : ffffcce720fa5404
x8 : ffff000080812138 x7 : 0000000000002e9a x6 : 0000000000000080
x5 : ffff00008de3b000 x4 : 0000000000000000 x3 : 0000000000000001
x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
 mlxbf_gige_handle_tx_complete+0xc8/0x170 [mlxbf_gige]
 mlxbf_gige_poll+0x54/0x160 [mlxbf_gige]
 __napi_poll+0x40/0x1c8
 net_rx_action+0x314/0x3a0
 __do_softirq+0x128/0x334
 run_ksoftirqd+0x54/0x6c
 smpboot_thread_fn+0x14c/0x190
 kthread+0x10c/0x110
 ret_from_fork+0x10/0x20
Code: 8b070000 f9000ea0 f95056c0 f86178a1 (b9407002)
---[ end trace 7cc3941aa0d8e6a4 ]---
Kernel panic - not syncing: Oops: Fatal exception in interrupt
Kernel Offset: 0x4ce722520000 from 0xffff800008000000
PHYS_OFFSET: 0x80000000
CPU features: 0x000005c1,a3330e5a
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]---

During system shutdown, the mlxbf_gige driver's shutdown() is always
executed. However, the driver's stop() method will only execute if the
networking interface configuration logic within the Linux distribution
has been set up to do so.

If shutdown() executes but stop() does not execute, NAPI remains enabled
and this can lead to an exception if NAPI is scheduled while the hardware
interface has only been partially deinitialized.

The networking interface managed by the mlxbf_gige driver must be properly
stopped during system shutdown so that IFF_UP is cleared, the hardware
interface is put into a clean state, and NAPI is fully deinitialized.
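The ordering problem described above can be illustrated with a small
user-space model. This is only a sketch under stated assumptions: the
struct, its three flags, and every model_* function are hypothetical
names invented for illustration, and they reduce the real driver state
to booleans. It is not driver code and does not reproduce the actual
oops.

```c
/* User-space model of the shutdown ordering; hypothetical names, not driver code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_dev {
	bool running;      /* models IFF_UP as seen by netif_running() */
	bool napi_enabled; /* models NAPI still being schedulable */
	bool hw_ready;     /* models a fully initialized hardware interface */
};

/* Models stop(): NAPI is disabled before the hardware is torn down. */
static void model_stop(struct model_dev *d)
{
	d->napi_enabled = false;
	d->hw_ready = false;
	d->running = false;
}

/* Models the old shutdown(): hardware is cleaned, NAPI is left enabled. */
static void model_shutdown_old(struct model_dev *d)
{
	d->hw_ready = false;
}

/* Models the new shutdown(): close the interface if it is still running. */
static void model_shutdown_new(struct model_dev *d)
{
	if (d->running)
		model_stop(d);
}

/* True when a poll scheduled now would touch deinitialized hardware. */
static bool model_poll_would_fault(const struct model_dev *d)
{
	assert(d != NULL);
	return d->napi_enabled && !d->hw_ready;
}

/* Runs one shutdown variant on an interface that was never stopped. */
bool model_late_poll_faults(bool use_new_shutdown)
{
	struct model_dev d = { .running = true, .napi_enabled = true, .hw_ready = true };

	if (use_new_shutdown)
		model_shutdown_new(&d);
	else
		model_shutdown_old(&d);

	return model_poll_would_fault(&d);
}
```

In this model, model_late_poll_faults(false) reports the window in which
a late poll can hit deinitialized hardware, while
model_late_poll_faults(true) does not, because the interface is closed
before shutdown returns.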

Fixes: f92e1869d74e ("Add Mellanox BlueField Gigabit Ethernet driver")
Signed-off-by: David Thompson
Link: https://lore.kernel.org/r/20240325210929.25362-1-davthompson@nvidia.com
Signed-off-by: Jakub Kicinski
(cherry picked from commit 09ba28e1cd3cf715daab1fca6e1623e22fd754a6)
Signed-off-by: David Thompson
Acked-by: Andrei Gherzan
---
 .../net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
index 74ef75e00739..29fe513442f9 100644
--- a/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
+++ b/drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_main.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "mlxbf_gige.h"
@@ -531,8 +532,13 @@ static void mlxbf_gige_shutdown(struct platform_device *pdev)
 {
 	struct mlxbf_gige *priv = platform_get_drvdata(pdev);
 
-	writeq(0, priv->base + MLXBF_GIGE_INT_EN);
-	mlxbf_gige_clean_port(priv);
+	rtnl_lock();
+	netif_device_detach(priv->netdev);
+
+	if (netif_running(priv->netdev))
+		dev_close(priv->netdev);
+
+	rtnl_unlock();
 }
 
 static const struct acpi_device_id __maybe_unused mlxbf_gige_acpi_match[] = {