From patchwork Thu Mar 15 16:03:11 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 886318
X-Patchwork-Delegate: ian.stokes@intel.com
From: Jan Scheurich <jan.scheurich@ericsson.com>
To: dev@openvswitch.org
Date: Thu, 15 Mar 2018 17:03:11 +0100
Message-ID: <1521129793-27851-2-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1521129793-27851-1-git-send-email-jan.scheurich@ericsson.com>
References: <1521129793-27851-1-git-send-email-jan.scheurich@ericsson.com>
Cc: i.maximets@samsung.com
Subject: [ovs-dev] [PATCH v9 1/3] netdev: Add optional qfill output parameter to rxq_recv()

If the caller provides a non-NULL qfill pointer and the netdev
implementation supports reading the rx queue fill level, the rxq_recv()
function returns to the caller the number of packets remaining in the rx
queue after reception of the packet burst. If the implementation does not
support this, it returns -ENOTSUP instead. Reading the remaining queue
fill level should not substantially slow down the recv() operation.

A first implementation is provided for Ethernet and vhostuser DPDK ports
in netdev-dpdk.c.

This output parameter will be used in the upcoming commit for PMD
performance metrics to supervise the rx queue fill level for DPDK
vhostuser ports.
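For illustration only (not part of the patch): a minimal sketch of how a
caller in a polling loop might use the new output parameter. The variable
rxq and the helper record_rxq_fill_level() are hypothetical.

    struct dp_packet_batch batch;
    int qfill;
    int error;

    dp_packet_batch_init(&batch);
    error = netdev_rxq_recv(rxq, &batch, &qfill);
    if (!error) {
        if (qfill >= 0) {
            /* qfill is the number of packets still waiting in the rx
             * queue after this burst was received. */
            record_rxq_fill_level(qfill);   /* Hypothetical helper. */
        } else {
            /* qfill == -ENOTSUP: this netdev implementation cannot
             * report its rx queue fill level. */
        }
    }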
Signed-off-by: Jan Scheurich --- lib/dpif-netdev.c | 2 +- lib/netdev-bsd.c | 8 +++++++- lib/netdev-dpdk.c | 25 +++++++++++++++++++++++-- lib/netdev-dummy.c | 8 +++++++- lib/netdev-linux.c | 7 ++++++- lib/netdev-provider.h | 7 ++++++- lib/netdev.c | 5 +++-- lib/netdev.h | 3 ++- 8 files changed, 55 insertions(+), 10 deletions(-) diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c index b07fc6b..86d8739 100644 --- a/lib/dpif-netdev.c +++ b/lib/dpif-netdev.c @@ -3276,7 +3276,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, pmd->ctx.last_rxq = rxq; dp_packet_batch_init(&batch); - error = netdev_rxq_recv(rxq->rx, &batch); + error = netdev_rxq_recv(rxq->rx, &batch, NULL); if (!error) { /* At least one packet received. */ *recirc_depth_get() = 0; diff --git a/lib/netdev-bsd.c b/lib/netdev-bsd.c index 05974c1..b70f327 100644 --- a/lib/netdev-bsd.c +++ b/lib/netdev-bsd.c @@ -618,7 +618,8 @@ netdev_rxq_bsd_recv_tap(struct netdev_rxq_bsd *rxq, struct dp_packet *buffer) } static int -netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch) +netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch, + int *qfill) { struct netdev_rxq_bsd *rxq = netdev_rxq_bsd_cast(rxq_); struct netdev *netdev = rxq->up.netdev; @@ -643,6 +644,11 @@ netdev_bsd_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch) batch->packets[0] = packet; batch->count = 1; } + + if (qfill) { + *qfill = -ENOTSUP; + } + return retval; } diff --git a/lib/netdev-dpdk.c b/lib/netdev-dpdk.c index af9843a..66f2439 100644 --- a/lib/netdev-dpdk.c +++ b/lib/netdev-dpdk.c @@ -1808,7 +1808,7 @@ netdev_dpdk_vhost_update_rx_counters(struct netdev_stats *stats, */ static int netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, - struct dp_packet_batch *batch) + struct dp_packet_batch *batch, int *qfill) { struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev); struct ingress_policer *policer = netdev_dpdk_get_ingress_policer(dev); @@ -1846,11 +1846,24 @@ netdev_dpdk_vhost_rxq_recv(struct netdev_rxq *rxq, batch->count = nb_rx; dp_packet_batch_init_packet_fields(batch); + if (qfill) { + if (nb_rx == NETDEV_MAX_BURST) { + /* The DPDK API returns a uint32_t which often has invalid bits in + * the upper 16-bits. Need to restrict the value to uint16_t. 
*/ + *qfill = rte_vhost_rx_queue_count(netdev_dpdk_get_vid(dev), + qid * VIRTIO_QNUM + VIRTIO_TXQ) + & UINT16_MAX; + } else { + *qfill = 0; + } + } + return 0; } static int -netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch) +netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch, + int *qfill) { struct netdev_rxq_dpdk *rx = netdev_rxq_dpdk_cast(rxq); struct netdev_dpdk *dev = netdev_dpdk_cast(rxq->netdev); @@ -1887,6 +1900,14 @@ netdev_dpdk_rxq_recv(struct netdev_rxq *rxq, struct dp_packet_batch *batch) batch->count = nb_rx; dp_packet_batch_init_packet_fields(batch); + if (qfill) { + if (nb_rx == NETDEV_MAX_BURST) { + *qfill = rte_eth_rx_queue_count(rx->port_id, rxq->queue_id); + } else { + *qfill = 0; + } + } + return 0; } diff --git a/lib/netdev-dummy.c b/lib/netdev-dummy.c index 8af9e1a..13bc580 100644 --- a/lib/netdev-dummy.c +++ b/lib/netdev-dummy.c @@ -992,7 +992,8 @@ netdev_dummy_rxq_dealloc(struct netdev_rxq *rxq_) } static int -netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch) +netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch, + int *qfill) { struct netdev_rxq_dummy *rx = netdev_rxq_dummy_cast(rxq_); struct netdev_dummy *netdev = netdev_dummy_cast(rx->up.netdev); @@ -1037,6 +1038,11 @@ netdev_dummy_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch) batch->packets[0] = packet; batch->count = 1; + + if (qfill) { + *qfill = -ENOTSUP; + } + return 0; } diff --git a/lib/netdev-linux.c b/lib/netdev-linux.c index 7ea40a8..c179ce2 100644 --- a/lib/netdev-linux.c +++ b/lib/netdev-linux.c @@ -1132,7 +1132,8 @@ netdev_linux_rxq_recv_tap(int fd, struct dp_packet *buffer) } static int -netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch) +netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch, + int *qfill) { struct netdev_rxq_linux *rx = netdev_rxq_linux_cast(rxq_); struct netdev *netdev = rx->up.netdev; @@ -1161,6 +1162,10 @@ netdev_linux_rxq_recv(struct netdev_rxq *rxq_, struct dp_packet_batch *batch) dp_packet_batch_init_packet(batch, buffer); } + if (qfill) { + *qfill = -ENOTSUP; + } + return retval; } diff --git a/lib/netdev-provider.h b/lib/netdev-provider.h index 25bd671..37add95 100644 --- a/lib/netdev-provider.h +++ b/lib/netdev-provider.h @@ -786,12 +786,17 @@ struct netdev_class { * pointers and metadata itself, if desired, e.g. with pkt_metadata_init() * and miniflow_extract(). * + * If the caller provides a non-NULL qfill pointer, the implementation + * returns the remaining rx queue fill level (zero or more) after the + * reception of packets, if it supports that, or -ENOTSUP otherwise. + * * Implementations should allocate buffers with DP_NETDEV_HEADROOM bytes of * headroom. * * Returns EAGAIN immediately if no packet is ready to be received or * another positive errno value if an error was encountered. */ - int (*rxq_recv)(struct netdev_rxq *rx, struct dp_packet_batch *batch); + int (*rxq_recv)(struct netdev_rxq *rx, struct dp_packet_batch *batch, + int *qfill); /* Registers with the poll loop to wake up from the next call to * poll_block() when a packet is ready to be received with diff --git a/lib/netdev.c b/lib/netdev.c index b303a7d..fe646cc 100644 --- a/lib/netdev.c +++ b/lib/netdev.c @@ -695,11 +695,12 @@ netdev_rxq_close(struct netdev_rxq *rx) * Returns EAGAIN immediately if no packet is ready to be received or another * positive errno value if an error was encountered. 
 */
 int
-netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *batch)
+netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *batch,
+                int *qfill)
 {
     int retval;
 
-    retval = rx->netdev->netdev_class->rxq_recv(rx, batch);
+    retval = rx->netdev->netdev_class->rxq_recv(rx, batch, qfill);
     if (!retval) {
         COVERAGE_INC(netdev_received);
     } else {
diff --git a/lib/netdev.h b/lib/netdev.h
index ff1b604..3f51634 100644
--- a/lib/netdev.h
+++ b/lib/netdev.h
@@ -175,7 +175,8 @@ void netdev_rxq_close(struct netdev_rxq *);
 const char *netdev_rxq_get_name(const struct netdev_rxq *);
 int netdev_rxq_get_queue_id(const struct netdev_rxq *);
 
-int netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *);
+int netdev_rxq_recv(struct netdev_rxq *rx, struct dp_packet_batch *,
+                    int *qfill);
 void netdev_rxq_wait(struct netdev_rxq *);
 int netdev_rxq_drain(struct netdev_rxq *);

From patchwork Thu Mar 15 16:03:12 2018
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 886321
X-Patchwork-Delegate: ian.stokes@intel.com
From: Jan Scheurich <jan.scheurich@ericsson.com>
To: dev@openvswitch.org
Date: Thu, 15 Mar 2018 17:03:12 +0100
Message-ID: <1521129793-27851-3-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1521129793-27851-1-git-send-email-jan.scheurich@ericsson.com>
References: <1521129793-27851-1-git-send-email-jan.scheurich@ericsson.com>
9kW6+T9RHajJe8SJaAqpNissxY0pSlKby+VlORFQuCpQMVy0tlKkafNuMqzxEns9k+GcaDtF qLYp+o8pUpR0ujaH0TNMNsP+dzFKFmJC6nZ5JT704ZF3ch6MIXxvcq9Mlx/jC/UdPXfHPf6e jT+b5gibPlMWuakn9u2W3Xp3rv6Wfc6hGS2cLPU58NzxQaO+byao3pJavZrQfPvaxZL82lZT 8Knkmop2yj0cEDHC9tZFyZb3zuBklTV8f/UXhYfpKu4070q8euHkWKKK4DK0ByNwltP+Axfl CHgeAwAA X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, RCVD_IN_DNSWL_MED autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on smtp1.linux-foundation.org Cc: i.maximets@samsung.com Subject: [ovs-dev] [PATCH v9 2/3] dpif-netdev: Detailed performance stats for PMDs X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.12 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: ovs-dev-bounces@openvswitch.org Errors-To: ovs-dev-bounces@openvswitch.org This patch instruments the dpif-netdev datapath to record detailed statistics of what is happening in every iteration of a PMD thread. The collection of detailed statistics can be controlled by a new Open_vSwitch configuration parameter "other_config:pmd-perf-metrics". By default it is disabled. The run-time overhead, when enabled, is in the order of 1%. The covered metrics per iteration are: - cycles - packets - (rx) batches - packets/batch - max. vhostuser qlen - upcalls - cycles spent in upcalls This raw recorded data is used threefold: 1. In histograms for each of the following metrics: - cycles/iteration (log.) - packets/iteration (log.) - cycles/packet - packets/batch - max. vhostuser qlen (log.) - upcalls - cycles/upcall (log) The histograms bins are divided linear or logarithmic. 2. A cyclic history of the above statistics for 999 iterations 3. A cyclic history of the cummulative/average values per millisecond wall clock for the last 1000 milliseconds: - number of iterations - avg. cycles/iteration - packets (Kpps) - avg. packets/batch - avg. max vhost qlen - upcalls - avg. cycles/upcall The gathered performance metrics can be printed at any time with the new CLI command ovs-appctl dpif-netdev/pmd-perf-show [-nh] [-it iter_len] [-ms ms_len] [-pmd core] [dp] The options are -nh: Suppress the histograms -it iter_len: Display the last iter_len iteration stats -ms ms_len: Display the last ms_len millisecond stats -pmd core: Display only the specified PMD The performance statistics are reset with the existing dpif-netdev/pmd-stats-clear command. 
The output always contains the following global PMD statistics,
similar to the pmd-stats-show command:

Time: 15:24:55.270
Measurement duration: 1.008 s

pmd thread numa_id 0 core_id 1:

  Cycles:            2419034712  (2.40 GHz)
  Iterations:            572817  (1.76 us/it)
  - idle:                486808  (15.9 % cycles)
  - busy:                 86009  (84.1 % cycles)
  Rx packets:           2399607  (2381 Kpps, 848 cycles/pkt)
  Datapath passes:      3599415  (1.50 passes/pkt)
  - EMC hits:            336472  ( 9.3 %)
  - Megaflow hits:      3262943  (90.7 %, 1.00 subtbl lookups/hit)
  - Upcalls:                  0  ( 0.0 %, 0.0 us/upcall)
  - Lost upcalls:             0  ( 0.0 %)
  Tx packets:           2399607  (2381 Kpps)
  Tx batches:            171400  (14.00 pkts/batch)

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
---
 NEWS                        |   3 +
 lib/automake.mk             |   1 +
 lib/dpif-netdev-perf.c      | 350 +++++++++++++++++++++++++++++++++++++++-
 lib/dpif-netdev-perf.h      | 258 ++++++++++++++++++++++++++++++--
 lib/dpif-netdev-unixctl.man | 157 ++++++++++++++++++++
 lib/dpif-netdev.c           | 182 +++++++++++++++++++++--
 manpages.mk                 |   2 +
 vswitchd/ovs-vswitchd.8.in  |  27 +---
 vswitchd/vswitch.xml        |  12 ++
 9 files changed, 939 insertions(+), 53 deletions(-)
 create mode 100644 lib/dpif-netdev-unixctl.man

diff --git a/NEWS b/NEWS
index 8d0b502..8f66fd3 100644
--- a/NEWS
+++ b/NEWS
@@ -73,6 +73,9 @@ v2.9.0 - 19 Feb 2018
      * Add support for vHost dequeue zero copy (experimental)
    - Userspace datapath:
      * Output packet batching support.
+     * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a single PMD
+     * Detailed PMD performance metrics available with new command
+       ovs-appctl dpif-netdev/pmd-perf-show
    - vswitchd:
      * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
      * Configuring a controller, or unconfiguring all controllers, now deletes
diff --git a/lib/automake.mk b/lib/automake.mk
index 5c26e0f..7a5632d 100644
--- a/lib/automake.mk
+++ b/lib/automake.mk
@@ -484,6 +484,7 @@ MAN_FRAGMENTS += \
 	lib/dpctl.man \
 	lib/memory-unixctl.man \
 	lib/netdev-dpdk-unixctl.man \
+	lib/dpif-netdev-unixctl.man \
 	lib/ofp-version.man \
 	lib/ovs.tmac \
 	lib/service.man \
diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
index f06991a..43f537e 100644
--- a/lib/dpif-netdev-perf.c
+++ b/lib/dpif-netdev-perf.c
@@ -15,18 +15,324 @@
  */
 
 #include <config.h>
+#include <math.h>
+#include "dpif-netdev-perf.h"
 #include "openvswitch/dynamic-string.h"
 #include "openvswitch/vlog.h"
-#include "dpif-netdev-perf.h"
+#include "ovs-thread.h"
 #include "timeval.h"
 
 VLOG_DEFINE_THIS_MODULE(pmd_perf);
 
+#ifdef DPDK_NETDEV
+static uint64_t
+get_tsc_hz(void)
+{
+    return rte_get_tsc_hz();
+}
+#else
+/* This function is only invoked from PMD threads which depend on DPDK.
+ * A dummy function is sufficient when building without DPDK_NETDEV. */
+static uint64_t
+get_tsc_hz(void)
+{
+    return 1;
+}
+#endif
+
+/* Histogram functions. */
+
+static void
+histogram_walls_set_lin(struct histogram *hist, uint32_t min, uint32_t max)
+{
+    int i;
+
+    ovs_assert(min < max);
+    for (i = 0; i < NUM_BINS-1; i++) {
+        hist->wall[i] = min + (i * (max - min)) / (NUM_BINS - 2);
+    }
+    hist->wall[NUM_BINS-1] = UINT32_MAX;
+}
+
+static void
+histogram_walls_set_log(struct histogram *hist, uint32_t min, uint32_t max)
+{
+    int i, start, bins, wall;
+    double log_min, log_max;
+
+    ovs_assert(min < max);
+    if (min > 0) {
+        log_min = log(min);
+        log_max = log(max);
+        start = 0;
+        bins = NUM_BINS - 1;
+    } else {
+        hist->wall[0] = 0;
+        log_min = log(1);
+        log_max = log(max);
+        start = 1;
+        bins = NUM_BINS - 2;
+    }
+    wall = start;
+    for (i = 0; i < bins; i++) {
+        /* Make sure each wall is monotonically increasing.
*/ + wall = MAX(wall, exp(log_min + (i * (log_max - log_min)) / (bins-1))); + hist->wall[start + i] = wall++; + } + if (hist->wall[NUM_BINS-2] < max) { + hist->wall[NUM_BINS-2] = max; + } + hist->wall[NUM_BINS-1] = UINT32_MAX; +} + +uint64_t +histogram_samples(const struct histogram *hist) +{ + uint64_t samples = 0; + + for (int i = 0; i < NUM_BINS; i++) { + samples += hist->bin[i]; + } + return samples; +} + +static void +histogram_clear(struct histogram *hist) +{ + int i; + + for (i = 0; i < NUM_BINS; i++) { + hist->bin[i] = 0; + } +} + +static void +history_init(struct history *h) +{ + memset(h, 0, sizeof(*h)); +} + void pmd_perf_stats_init(struct pmd_perf_stats *s) { - memset(s, 0 , sizeof(*s)); + memset(s, 0, sizeof(*s)); + ovs_mutex_init(&s->stats_mutex); + ovs_mutex_init(&s->clear_mutex); + histogram_walls_set_log(&s->cycles, 500, 24000000); + histogram_walls_set_log(&s->pkts, 0, 1000); + histogram_walls_set_lin(&s->cycles_per_pkt, 100, 30000); + histogram_walls_set_lin(&s->pkts_per_batch, 0, 32); + histogram_walls_set_lin(&s->upcalls, 0, 30); + histogram_walls_set_log(&s->cycles_per_upcall, 1000, 1000000); + histogram_walls_set_log(&s->max_vhost_qfill, 0, 512); + s->start_ms = time_msec(); +} + +void +pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s, + double duration) +{ + uint64_t stats[PMD_N_STATS]; + double us_per_cycle = 1000000.0 / get_tsc_hz(); + + if (duration == 0) { + return; + } + + pmd_perf_read_counters(s, stats); + uint64_t tot_cycles = stats[PMD_CYCLES_ITER_IDLE] + + stats[PMD_CYCLES_ITER_BUSY]; + uint64_t rx_packets = stats[PMD_STAT_RECV]; + uint64_t tx_packets = stats[PMD_STAT_SENT_PKTS]; + uint64_t tx_batches = stats[PMD_STAT_SENT_BATCHES]; + uint64_t passes = stats[PMD_STAT_RECV] + + stats[PMD_STAT_RECIRC]; + uint64_t upcalls = stats[PMD_STAT_MISS]; + uint64_t upcall_cycles = stats[PMD_CYCLES_UPCALL]; + uint64_t tot_iter = histogram_samples(&s->pkts); + uint64_t idle_iter = s->pkts.bin[0]; + uint64_t busy_iter = tot_iter >= idle_iter ? tot_iter - idle_iter : 0; + + ds_put_format(str, + " Cycles: %12"PRIu64" (%.2f GHz)\n" + " Iterations: %12"PRIu64" (%.2f us/it)\n" + " - idle: %12"PRIu64" (%4.1f %% cycles)\n" + " - busy: %12"PRIu64" (%4.1f %% cycles)\n", + tot_cycles, (tot_cycles / duration) / 1E9, + tot_iter, tot_cycles * us_per_cycle / tot_iter, + idle_iter, + 100.0 * stats[PMD_CYCLES_ITER_IDLE] / tot_cycles, + busy_iter, + 100.0 * stats[PMD_CYCLES_ITER_BUSY] / tot_cycles); + if (rx_packets > 0) { + ds_put_format(str, + " Rx packets: %12"PRIu64" (%.0f Kpps, %.0f cycles/pkt)\n" + " Datapath passes: %12"PRIu64" (%.2f passes/pkt)\n" + " - EMC hits: %12"PRIu64" (%4.1f %%)\n" + " - Megaflow hits: %12"PRIu64" (%4.1f %%, %.2f subtbl lookups/" + "hit)\n" + " - Upcalls: %12"PRIu64" (%4.1f %%, %.1f us/upcall)\n" + " - Lost upcalls: %12"PRIu64" (%4.1f %%)\n", + rx_packets, (rx_packets / duration) / 1000, + 1.0 * stats[PMD_CYCLES_ITER_BUSY] / rx_packets, + passes, rx_packets ? 1.0 * passes / rx_packets : 0, + stats[PMD_STAT_EXACT_HIT], + 100.0 * stats[PMD_STAT_EXACT_HIT] / passes, + stats[PMD_STAT_MASKED_HIT], + 100.0 * stats[PMD_STAT_MASKED_HIT] / passes, + stats[PMD_STAT_MASKED_HIT] + ? 1.0 * stats[PMD_STAT_MASKED_LOOKUP] / stats[PMD_STAT_MASKED_HIT] + : 0, + upcalls, 100.0 * upcalls / passes, + upcalls ? 
(upcall_cycles * us_per_cycle) / upcalls : 0, + stats[PMD_STAT_LOST], + 100.0 * stats[PMD_STAT_LOST] / passes); + } else { + ds_put_format(str, + " Rx packets: %12"PRIu64"\n", + 0UL); + } + if (tx_packets > 0) { + ds_put_format(str, + " Tx packets: %12"PRIu64" (%.0f Kpps)\n" + " Tx batches: %12"PRIu64" (%.2f pkts/batch)" + "\n", + tx_packets, (tx_packets / duration) / 1000, + tx_batches, 1.0 * tx_packets / tx_batches); + } else { + ds_put_format(str, + " Tx packets: %12"PRIu64"\n" + "\n", + 0UL); + } +} + +void +pmd_perf_format_histograms(struct ds *str, struct pmd_perf_stats *s) +{ + int i; + + ds_put_cstr(str, "Histograms\n"); + ds_put_format(str, + " %-21s %-21s %-21s %-21s %-21s %-21s %-21s\n", + "cycles/it", "packets/it", "cycles/pkt", "pkts/batch", + "max vhost qlen", "upcalls/it", "cycles/upcall"); + for (i = 0; i < NUM_BINS-1; i++) { + ds_put_format(str, + " %-9d %-11"PRIu64" %-9d %-11"PRIu64" %-9d %-11"PRIu64 + " %-9d %-11"PRIu64" %-9d %-11"PRIu64" %-9d %-11"PRIu64 + " %-9d %-11"PRIu64"\n", + s->cycles.wall[i], s->cycles.bin[i], + s->pkts.wall[i],s->pkts.bin[i], + s->cycles_per_pkt.wall[i], s->cycles_per_pkt.bin[i], + s->pkts_per_batch.wall[i], s->pkts_per_batch.bin[i], + s->max_vhost_qfill.wall[i], s->max_vhost_qfill.bin[i], + s->upcalls.wall[i], s->upcalls.bin[i], + s->cycles_per_upcall.wall[i], s->cycles_per_upcall.bin[i]); + } + ds_put_format(str, + " %-9s %-11"PRIu64" %-9s %-11"PRIu64" %-9s %-11"PRIu64 + " %-9s %-11"PRIu64" %-9s %-11"PRIu64" %-9s %-11"PRIu64 + " %-9s %-11"PRIu64"\n", + ">", s->cycles.bin[i], + ">", s->pkts.bin[i], + ">", s->cycles_per_pkt.bin[i], + ">", s->pkts_per_batch.bin[i], + ">", s->max_vhost_qfill.bin[i], + ">", s->upcalls.bin[i], + ">", s->cycles_per_upcall.bin[i]); + if (s->totals.iterations > 0) { + ds_put_cstr(str, + "-----------------------------------------------------" + "-----------------------------------------------------" + "------------------------------------------------\n"); + ds_put_format(str, + " %-21s %-21s %-21s %-21s %-21s %-21s %-21s\n", + "cycles/it", "packets/it", "cycles/pkt", "pkts/batch", + "vhost qlen", "upcalls/it", "cycles/upcall"); + ds_put_format(str, + " %-21"PRIu64" %-21.5f %-21"PRIu64 + " %-21.5f %-21.5f %-21.5f %-21"PRIu32"\n", + s->totals.cycles / s->totals.iterations, + 1.0 * s->totals.pkts / s->totals.iterations, + s->totals.pkts + ? s->totals.busy_cycles / s->totals.pkts : 0, + s->totals.batches + ? 1.0 * s->totals.pkts / s->totals.batches : 0, + 1.0 * s->totals.max_vhost_qfill / s->totals.iterations, + 1.0 * s->totals.upcalls / s->totals.iterations, + s->totals.upcalls + ? s->totals.upcall_cycles / s->totals.upcalls : 0); + } +} + +void +pmd_perf_format_iteration_history(struct ds *str, struct pmd_perf_stats *s, + int n_iter) +{ + struct iter_stats *is; + size_t index; + int i; + + if (n_iter == 0) { + return; + } + ds_put_format(str, " %-17s %-10s %-10s %-10s %-10s " + "%-10s %-10s %-10s\n", + "tsc", "cycles", "packets", "cycles/pkt", "pkts/batch", + "vhost qlen", "upcalls", "cycles/upcall"); + for (i = 1; i <= n_iter; i++) { + index = (s->iterations.idx + HISTORY_LEN - i) % HISTORY_LEN; + is = &s->iterations.sample[index]; + ds_put_format(str, + " %-17"PRIu64" %-11"PRIu64" %-11"PRIu32 + " %-11"PRIu64" %-11"PRIu32" %-11"PRIu32 + " %-11"PRIu32" %-11"PRIu32"\n", + is->timestamp, + is->cycles, + is->pkts, + is->pkts ? is->cycles / is->pkts : 0, + is->batches ? is->pkts / is->batches : 0, + is->max_vhost_qfill, + is->upcalls, + is->upcalls ? 
is->upcall_cycles / is->upcalls : 0);
+    }
+}
+
+void
+pmd_perf_format_ms_history(struct ds *str, struct pmd_perf_stats *s, int n_ms)
+{
+    struct iter_stats *is;
+    size_t index;
+    int i;
+
+    if (n_ms == 0) {
+        return;
+    }
+    ds_put_format(str,
+                  "   %-12s %-10s %-10s %-10s %-10s"
+                  " %-10s %-10s %-10s %-10s\n",
+                  "ms", "iterations", "cycles/it", "Kpps", "cycles/pkt",
+                  "pkts/batch", "vhost qlen", "upcalls", "cycles/upcall");
+    for (i = 1; i <= n_ms; i++) {
+        index = (s->milliseconds.idx + HISTORY_LEN - i) % HISTORY_LEN;
+        is = &s->milliseconds.sample[index];
+        ds_put_format(str,
+                      "   %-12"PRIu64" %-11"PRIu32" %-11"PRIu64
+                      " %-11"PRIu32" %-11"PRIu64" %-11"PRIu32
+                      " %-11"PRIu32" %-11"PRIu32" %-11"PRIu32"\n",
+                      is->timestamp,
+                      is->iterations,
+                      is->iterations ? is->cycles / is->iterations : 0,
+                      is->pkts,
+                      is->pkts ? is->busy_cycles / is->pkts : 0,
+                      is->batches ? is->pkts / is->batches : 0,
+                      is->iterations
+                          ? is->max_vhost_qfill / is->iterations : 0,
+                      is->upcalls,
+                      is->upcalls ? is->upcall_cycles / is->upcalls : 0);
+    }
 }
 
 void
@@ -51,10 +357,48 @@ pmd_perf_read_counters(struct pmd_perf_stats *s,
     }
 }
 
+/* This function clears the PMD performance counters from within the PMD
+ * thread or from another thread when the PMD thread is not executing its
+ * poll loop. */
 void
-pmd_perf_stats_clear(struct pmd_perf_stats *s)
+pmd_perf_stats_clear_lock(struct pmd_perf_stats *s)
+    OVS_REQUIRES(s->stats_mutex)
 {
+    ovs_mutex_lock(&s->clear_mutex);
     for (int i = 0; i < PMD_N_STATS; i++) {
         atomic_read_relaxed(&s->counters.n[i], &s->counters.zero[i]);
     }
+    /* The following stats are only written by the PMD thread and can be
+     * cleared here directly. */
+    memset(&s->current, 0, sizeof(struct iter_stats));
+    memset(&s->totals, 0, sizeof(struct iter_stats));
+    histogram_clear(&s->cycles);
+    histogram_clear(&s->pkts);
+    histogram_clear(&s->cycles_per_pkt);
+    histogram_clear(&s->upcalls);
+    histogram_clear(&s->cycles_per_upcall);
+    histogram_clear(&s->pkts_per_batch);
+    histogram_clear(&s->max_vhost_qfill);
+    history_init(&s->iterations);
+    history_init(&s->milliseconds);
+    s->start_ms = time_msec();
+    s->milliseconds.sample[0].timestamp = s->start_ms;
+    /* Clearing finished. */
+    s->clear = false;
+    ovs_mutex_unlock(&s->clear_mutex);
+}
+
+/* This function can be called from anywhere to clear the stats
+ * of PMD and non-PMD threads. */
+void
+pmd_perf_stats_clear(struct pmd_perf_stats *s)
+{
+    if (ovs_mutex_trylock(&s->stats_mutex) == 0) {
+        /* Locking successful. PMD not polling. */
+        pmd_perf_stats_clear_lock(s);
+        ovs_mutex_unlock(&s->stats_mutex);
+    } else {
+        /* Request the polling PMD to clear the stats. There is no need to
+         * block here as stats retrieval is prevented during clearing. */
+        s->clear = true;
+    }
 }
diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h
index 5993c25..b91cb30 100644
--- a/lib/dpif-netdev-perf.h
+++ b/lib/dpif-netdev-perf.h
@@ -38,10 +38,18 @@ extern "C" {
 #endif
 
-/* This module encapsulates data structures and functions to maintain PMD
- * performance metrics such as packet counters, execution cycles. It
- * provides a clean API for dpif-netdev to initialize, update and read and
+/* This module encapsulates data structures and functions to maintain basic PMD
+ * performance metrics such as packet counters and execution cycles, as well
+ * as histograms and time series recording for more detailed PMD metrics.
+ *
+ * It provides a clean API for dpif-netdev to initialize, update, read and
+ * + * The basic set of PMD counters is implemented as atomic_uint64_t variables + * to guarantee correct read also in 32-bit systems. + * + * The detailed PMD performance metrics are only supported on 64-bit systems + * with atomic 64-bit read and store semantics for plain uint64_t counters. */ /* Set of counter types maintained in pmd_perf_stats. */ @@ -66,6 +74,7 @@ enum pmd_stat_type { PMD_STAT_SENT_BATCHES, /* Number of batches sent. */ PMD_CYCLES_ITER_IDLE, /* Cycles spent in idle iterations. */ PMD_CYCLES_ITER_BUSY, /* Cycles spent in busy iterations. */ + PMD_CYCLES_UPCALL, /* Cycles spent processing upcalls. */ PMD_N_STATS }; @@ -81,18 +90,87 @@ struct pmd_counters { uint64_t zero[PMD_N_STATS]; /* Value at last _clear(). */ }; -/* Container for all performance metrics of a PMD. - * Part of the struct dp_netdev_pmd_thread. */ +/* Data structure to collect statistical distribution of an integer measurement + * type in form of a histogram. The wall[] array contains the inclusive + * upper boundaries of the bins, while the bin[] array contains the actual + * counters per bin. The histogram walls are typically set automatically + * using the functions provided below.*/ + +#define NUM_BINS 32 /* Number of histogram bins. */ + +struct histogram { + uint32_t wall[NUM_BINS]; + uint64_t bin[NUM_BINS]; +}; + +/* Data structure to record details PMD execution metrics per iteration for + * a history period of up to HISTORY_LEN iterations in circular buffer. + * Also used to record up to HISTORY_LEN millisecond averages/totals of these + * metrics.*/ + +struct iter_stats { + uint64_t timestamp; /* TSC or millisecond. */ + uint64_t cycles; /* Number of TSC cycles spent in it/ms. */ + uint64_t busy_cycles; /* Cycles spent in busy iterations in ms. */ + uint32_t iterations; /* Iterations in ms. */ + uint32_t pkts; /* Packets processed in iteration/ms. */ + uint32_t upcalls; /* Number of upcalls in iteration/ms. */ + uint32_t upcall_cycles; /* Cycles spent in upcalls in iteration/ms. */ + uint32_t batches; /* Number of rx batches in iteration/ms. */ + uint32_t max_vhost_qfill; /* Maximum fill level encountered in it/ms. */ +}; + +#define HISTORY_LEN 1000 /* Length of recorded history + (iterations and ms). */ +#define DEF_HIST_SHOW 20 /* Default number of history samples to + display. */ + +struct history { + size_t idx; /* Slot to which next call to history_store() + will write. */ + struct iter_stats sample[HISTORY_LEN]; +}; + +/* Container for all performance metrics of a PMD within the struct + * dp_netdev_pmd_thread. The metrics must be updated from within the PMD + * thread but can be read from any thread. The basic PMD counters in + * struct pmd_counters can be read without protection against concurrent + * clearing. The other metrics may only be safely read with the clear_mutex + * held to protect against concurrent clearing. */ struct pmd_perf_stats { - /* Start of the current PMD iteration in TSC cycles.*/ - uint64_t start_it_tsc; + /* Prevents interference between PMD polling and stats clearing. */ + struct ovs_mutex stats_mutex; + /* Set by CLI thread to order clearing of PMD stats. */ + volatile bool clear; + /* Prevents stats retrieval while clearing is in progress. */ + struct ovs_mutex clear_mutex; + /* Start of the current performance measurement period. */ + uint64_t start_ms; /* Latest TSC time stamp taken in PMD. */ uint64_t last_tsc; + /* Used to space certain checks in time. */ + uint64_t next_check_tsc; /* If non-NULL, outermost cycle timer currently running in PMD. 
*/ struct cycle_timer *cur_timer; /* Set of PMD counters with their zero offsets. */ struct pmd_counters counters; + /* Statistics of the current iteration. */ + struct iter_stats current; + /* Totals for the current millisecond. */ + struct iter_stats totals; + /* Histograms for the PMD metrics. */ + struct histogram cycles; + struct histogram pkts; + struct histogram cycles_per_pkt; + struct histogram upcalls; + struct histogram cycles_per_upcall; + struct histogram pkts_per_batch; + struct histogram max_vhost_qfill; + /* Iteration history buffer. */ + struct history iterations; + /* Millisecond history buffer. */ + struct history milliseconds; }; /* Support for accurate timing of PMD execution on TSC clock cycle level. @@ -175,8 +253,14 @@ cycle_timer_stop(struct pmd_perf_stats *s, return now - timer->start; } +/* Functions to initialize and reset the PMD performance metrics. */ + void pmd_perf_stats_init(struct pmd_perf_stats *s); void pmd_perf_stats_clear(struct pmd_perf_stats *s); +void pmd_perf_stats_clear_lock(struct pmd_perf_stats *s); + +/* Functions to read and update PMD counters. */ + void pmd_perf_read_counters(struct pmd_perf_stats *s, uint64_t stats[PMD_N_STATS]); @@ -199,32 +283,182 @@ pmd_perf_update_counter(struct pmd_perf_stats *s, atomic_store_relaxed(&s->counters.n[counter], tmp); } +/* Functions to manipulate a sample history. */ + +static inline void +histogram_add_sample(struct histogram *hist, uint32_t val) +{ + /* TODO: Can do better with binary search? */ + for (int i = 0; i < NUM_BINS-1; i++) { + if (val <= hist->wall[i]) { + hist->bin[i]++; + return; + } + } + hist->bin[NUM_BINS-1]++; +} + +uint64_t histogram_samples(const struct histogram *hist); + +/* Add an offset to idx modulo HISTORY_LEN. */ +static inline uint32_t +history_add(uint32_t idx, uint32_t offset) +{ + return (idx + offset) % HISTORY_LEN; +} + +/* Subtract idx2 from idx1 modulo HISTORY_LEN. */ +static inline uint32_t +history_sub(uint32_t idx1, uint32_t idx2) +{ + return (idx1 + HISTORY_LEN - idx2) % HISTORY_LEN; +} + +static inline struct iter_stats * +history_current(struct history *h) +{ + return &h->sample[h->idx]; +} + +static inline struct iter_stats * +history_next(struct history *h) +{ + size_t next_idx = (h->idx + 1) % HISTORY_LEN; + struct iter_stats *next = &h->sample[next_idx]; + + memset(next, 0, sizeof(*next)); + h->idx = next_idx; + return next; +} + +static inline struct iter_stats * +history_store(struct history *h, struct iter_stats *is) +{ + if (is) { + h->sample[h->idx] = *is; + } + /* Advance the history pointer */ + return history_next(h); +} + +/* Functions recording PMD metrics per iteration. */ + static inline void pmd_perf_start_iteration(struct pmd_perf_stats *s) { + if (s->clear) { + /* Clear the PMD stats before starting next iteration. */ + pmd_perf_stats_clear_lock(s); + } + /* Initialize the current interval stats. */ + memset(&s->current, 0, sizeof(struct iter_stats)); if (OVS_LIKELY(s->last_tsc)) { /* We assume here that last_tsc was updated immediately prior at * the end of the previous iteration, or just before the first * iteration. */ - s->start_it_tsc = s->last_tsc; + s->current.timestamp = s->last_tsc; } else { /* In case last_tsc has never been set before. 
*/ - s->start_it_tsc = cycles_counter_update(s); + s->current.timestamp = cycles_counter_update(s); } } static inline void -pmd_perf_end_iteration(struct pmd_perf_stats *s, int rx_packets) +pmd_perf_end_iteration(struct pmd_perf_stats *s, int rx_packets, + int tx_packets, bool full_metrics) { - uint64_t cycles = cycles_counter_update(s) - s->start_it_tsc; + uint64_t now_tsc = cycles_counter_update(s); + struct iter_stats *cum_ms; + uint64_t cycles, cycles_per_pkt = 0; - if (rx_packets > 0) { + cycles = now_tsc - s->current.timestamp; + s->current.cycles = cycles; + s->current.pkts = rx_packets; + + if (rx_packets + tx_packets > 0) { pmd_perf_update_counter(s, PMD_CYCLES_ITER_BUSY, cycles); } else { pmd_perf_update_counter(s, PMD_CYCLES_ITER_IDLE, cycles); } + /* Add iteration samples to histograms. */ + histogram_add_sample(&s->cycles, cycles); + histogram_add_sample(&s->pkts, rx_packets); + + if (!full_metrics) { + return; + } + + s->counters.n[PMD_CYCLES_UPCALL] += s->current.upcall_cycles; + + if (rx_packets > 0) { + cycles_per_pkt = cycles / rx_packets; + histogram_add_sample(&s->cycles_per_pkt, cycles_per_pkt); + } + if (s->current.batches > 0) { + histogram_add_sample(&s->pkts_per_batch, + rx_packets / s->current.batches); + } + histogram_add_sample(&s->upcalls, s->current.upcalls); + if (s->current.upcalls > 0) { + histogram_add_sample(&s->cycles_per_upcall, + s->current.upcall_cycles / s->current.upcalls); + } + histogram_add_sample(&s->max_vhost_qfill, s->current.max_vhost_qfill); + + /* Add iteration samples to millisecond stats. */ + cum_ms = history_current(&s->milliseconds); + cum_ms->iterations++; + cum_ms->cycles += cycles; + if (rx_packets > 0) { + cum_ms->busy_cycles += cycles; + } + cum_ms->pkts += s->current.pkts; + cum_ms->upcalls += s->current.upcalls; + cum_ms->upcall_cycles += s->current.upcall_cycles; + cum_ms->batches += s->current.batches; + cum_ms->max_vhost_qfill += s->current.max_vhost_qfill; + + /* Store in iteration history. This advances the iteration idx and + * clears the next slot in the iteration history. */ + history_store(&s->iterations, &s->current); + if (now_tsc > s->next_check_tsc) { + /* Check if ms is completed and store in milliseconds history. */ + uint64_t now = time_msec(); + if (now != cum_ms->timestamp) { + /* Add ms stats to totals. */ + s->totals.iterations += cum_ms->iterations; + s->totals.cycles += cum_ms->cycles; + s->totals.busy_cycles += cum_ms->busy_cycles; + s->totals.pkts += cum_ms->pkts; + s->totals.upcalls += cum_ms->upcalls; + s->totals.upcall_cycles += cum_ms->upcall_cycles; + s->totals.batches += cum_ms->batches; + s->totals.max_vhost_qfill += cum_ms->max_vhost_qfill; + cum_ms = history_next(&s->milliseconds); + cum_ms->timestamp = now; + } + s->next_check_tsc = cycles_counter_update(s) + 10000; + } } +/* Formatting the output of commands. 
 */
+
+struct pmd_perf_params {
+    int command_type;
+    bool histograms;
+    size_t iter_hist_len;
+    size_t ms_hist_len;
+};
+
+void pmd_perf_format_overall_stats(struct ds *str, struct pmd_perf_stats *s,
+                                   double duration);
+void pmd_perf_format_histograms(struct ds *str, struct pmd_perf_stats *s);
+void pmd_perf_format_iteration_history(struct ds *str,
+                                       struct pmd_perf_stats *s,
+                                       int n_iter);
+void pmd_perf_format_ms_history(struct ds *str, struct pmd_perf_stats *s,
+                                int n_ms);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man
new file mode 100644
index 0000000..76c3e4e
--- /dev/null
+++ b/lib/dpif-netdev-unixctl.man
@@ -0,0 +1,158 @@
+.SS "DPIF-NETDEV COMMANDS"
+These commands are used to expose internal information (mostly statistics)
+about the "dpif-netdev" userspace datapath. If there is only one datapath
+(as is often the case, unless \fBdpctl/\fR commands are used), the \fIdp\fR
+argument can be omitted. By default the commands present data for all pmd
+threads in the datapath. By specifying the "-pmd core" option one can filter
+the output for a single pmd in the datapath.
+.
+.IP "\fBdpif-netdev/pmd-stats-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]"
+Shows performance statistics for one or all pmd threads of the datapath
+\fIdp\fR. The special thread "main" sums up the statistics of every non-pmd
+thread.
+
+The sum of "emc hits", "masked hits" and "miss" is the number of
+packet lookups performed by the datapath. Beware that a recirculated packet
+experiences one additional lookup per recirculation, so there may be
+more lookups than forwarded packets in the datapath.
+
+Cycles are counted using the TSC or similar facilities (when available on
+the platform). The duration of one cycle depends on the processing platform.
+
+"idle cycles" refers to cycles spent in PMD iterations not forwarding any
+packets. "processing cycles" refers to cycles spent in PMD iterations
+forwarding at least one packet, including the cost for polling, processing and
+transmitting said packets.
+
+To reset these counters use \fBdpif-netdev/pmd-stats-clear\fR.
+.
+.IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]"
+Resets to zero the per pmd thread performance numbers shown by the
+\fBdpif-netdev/pmd-stats-show\fR and \fBdpif-netdev/pmd-perf-show\fR commands.
+It will NOT reset datapath or bridge statistics, only the values shown by
+the above commands.
+.
+.IP "\fBdpif-netdev/pmd-perf-show\fR [\fB-nh\fR] [\fB-it\fR \fIiter_len\fR] \
+[\fB-ms\fR \fIms_len\fR] [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]"
+Shows detailed performance metrics for one or all pmd threads of the
+userspace datapath.
+
+The collection of detailed statistics can be controlled by a new
+configuration parameter "other_config:pmd-perf-metrics". By default it
+is disabled. The run-time overhead, when enabled, is in the order of 1%.
+
+The following metrics are recorded for each PMD iteration:
+.RS
+.IP
+.PD .4v
+.IP \(em
+used cycles
+.IP \(em
+forwarded packets
+.IP \(em
+number of rx batches
+.IP \(em
+packets/rx batch
+.IP \(em
+max. vhostuser queue fill level
+.IP \(em
+number of upcalls
+.IP \(em
+cycles spent in upcalls
+.PD
+.RE
+.IP
+This raw recorded data is used threefold:
+
+.RS
+.IP
+.PD .4v
+.IP 1.
+In histograms for each of the following metrics:
+.RS
+.IP \(em
+cycles/iteration (logarithmic)
+.IP \(em
+packets/iteration (logarithmic)
+.IP \(em
+cycles/packet
+.IP \(em
+packets/batch
+.IP \(em
+max. vhostuser qlen (logarithmic)
+.IP \(em
+upcalls
+.IP \(em
+cycles/upcall (logarithmic)
+The histogram bins are divided linearly or logarithmically.
+.RE
+.IP 2.
+A cyclic history of the above metrics for 1024 iterations
+.IP 3.
+A cyclic history of the cumulative/average values per millisecond wall
+clock for the last 1024 milliseconds:
+.RS
+.IP \(em
+number of iterations
+.IP \(em
+avg. cycles/iteration
+.IP \(em
+packets (Kpps)
+.IP \(em
+avg. packets/batch
+.IP \(em
+avg. max vhost qlen
+.IP \(em
+upcalls
+.IP \(em
+avg. cycles/upcall
+.RE
+.PD
+.RE
+.IP
+.
+The command options are:
+.RS
+.IP "\fB-nh\fR"
+Suppress the histograms
+.IP "\fB-it\fR \fIiter_len\fR"
+Display the last iter_len iteration stats
+.IP "\fB-ms\fR \fIms_len\fR"
+Display the last ms_len millisecond stats
+.RE
+.IP
+The output always contains the following global PMD statistics:
+.RS
+.IP
+Time: 15:24:55.270
+.br
+Measurement duration: 1.008 s
+
+pmd thread numa_id 0 core_id 1:
+
+  Cycles:            2419034712  (2.40 GHz)
+  Iterations:            572817  (1.76 us/it)
+  - idle:                486808  (15.9 % cycles)
+  - busy:                 86009  (84.1 % cycles)
+  Rx packets:           2399607  (2381 Kpps, 848 cycles/pkt)
+  Datapath passes:      3599415  (1.50 passes/pkt)
+  - EMC hits:            336472  ( 9.3 %)
+  - Megaflow hits:      3262943  (90.7 %, 1.00 subtbl lookups/hit)
+  - Upcalls:                  0  ( 0.0 %, 0.0 us/upcall)
+  - Lost upcalls:             0  ( 0.0 %)
+  Tx packets:           2399607  (2381 Kpps)
+  Tx batches:            171400  (14.00 pkts/batch)
+.RE
+.IP
+Here "Rx packets" actually reflects the number of packets forwarded by the
+datapath. "Datapath passes" matches the number of packet lookups as
+reported by the \fBdpif-netdev/pmd-stats-show\fR command.
+
+To reset the counters and start a new measurement use
+\fBdpif-netdev/pmd-stats-clear\fR.
+.
+.IP "\fBdpif-netdev/pmd-rxq-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]"
+For one or all pmd threads of the datapath \fIdp\fR show the list of queue-ids
+with port names, which this thread polls.
+.
+.IP "\fBdpif-netdev/pmd-rxq-rebalance\fR [\fIdp\fR]"
+Reassigns rxqs to pmds in the datapath \fIdp\fR based on their current usage.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index 86d8739..c4ac176 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -49,6 +49,7 @@
 #include "id-pool.h"
 #include "latch.h"
 #include "netdev.h"
+#include "netdev-provider.h"
 #include "netdev-vport.h"
 #include "netlink.h"
 #include "odp-execute.h"
@@ -281,6 +282,8 @@ struct dp_netdev {
     /* Probability of EMC insertions is a factor of 'emc_insert_min'.*/
     OVS_ALIGNED_VAR(CACHE_LINE_SIZE)
         atomic_uint32_t emc_insert_min;
+    /* Enable collection of PMD performance metrics. */
+    atomic_bool pmd_perf_metrics;
 
     /* Protects access to ofproto-dpif-upcall interface during revalidator
      * thread synchronization. */
@@ -356,6 +359,7 @@ struct dp_netdev_rxq {
                                           particular core. */
     unsigned intrvl_idx;               /* Write index for 'cycles_intrvl'. */
     struct dp_netdev_pmd_thread *pmd;  /* pmd thread that polls this queue. */
+    bool is_vhost;                     /* Is rxq of a vhost port. */
 
     /* Counters of cycles spent successfully polling and processing pkts.
*/ atomic_ullong cycles[RXQ_N_CYCLES]; @@ -717,6 +721,8 @@ static inline bool emc_entry_alive(struct emc_entry *ce); static void emc_clear_entry(struct emc_entry *ce); static void dp_netdev_request_reconfigure(struct dp_netdev *dp); +static inline bool +pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd); static void emc_cache_init(struct emc_cache *flow_cache) @@ -800,7 +806,8 @@ get_dp_netdev(const struct dpif *dpif) enum pmd_info_type { PMD_INFO_SHOW_STATS, /* Show how cpu cycles are spent. */ PMD_INFO_CLEAR_STATS, /* Set the cycles count to 0. */ - PMD_INFO_SHOW_RXQ /* Show poll-lists of pmd threads. */ + PMD_INFO_SHOW_RXQ, /* Show poll lists of pmd threads. */ + PMD_INFO_PERF_SHOW, /* Show pmd performance details. */ }; static void @@ -891,6 +898,47 @@ pmd_info_show_stats(struct ds *reply, stats[PMD_CYCLES_ITER_BUSY], total_packets); } +static void +pmd_info_show_perf(struct ds *reply, + struct dp_netdev_pmd_thread *pmd, + struct pmd_perf_params *par) +{ + if (pmd->core_id != NON_PMD_CORE_ID) { + char *time_str = + xastrftime_msec("%H:%M:%S.###", time_wall_msec(), true); + long long now = time_msec(); + double duration = (now - pmd->perf_stats.start_ms) / 1000.0; + + ds_put_cstr(reply, "\n"); + ds_put_format(reply, "Time: %s\n", time_str); + ds_put_format(reply, "Measurement duration: %.3f s\n", duration); + ds_put_cstr(reply, "\n"); + format_pmd_thread(reply, pmd); + ds_put_cstr(reply, "\n"); + pmd_perf_format_overall_stats(reply, &pmd->perf_stats, duration); + if (pmd_perf_metrics_enabled(pmd)) { + /* Prevent parallel clearing of perf metrics. */ + ovs_mutex_lock(&pmd->perf_stats.clear_mutex); + if (par->histograms) { + ds_put_cstr(reply, "\n"); + pmd_perf_format_histograms(reply, &pmd->perf_stats); + } + if (par->iter_hist_len > 0) { + ds_put_cstr(reply, "\n"); + pmd_perf_format_iteration_history(reply, &pmd->perf_stats, + par->iter_hist_len); + } + if (par->ms_hist_len > 0) { + ds_put_cstr(reply, "\n"); + pmd_perf_format_ms_history(reply, &pmd->perf_stats, + par->ms_hist_len); + } + ovs_mutex_unlock(&pmd->perf_stats.clear_mutex); + } + free(time_str); + } +} + static int compare_poll_list(const void *a_, const void *b_) { @@ -1068,7 +1116,7 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], ovs_mutex_lock(&dp_netdev_mutex); while (argc > 1) { - if (!strcmp(argv[1], "-pmd") && argc >= 3) { + if (!strcmp(argv[1], "-pmd") && argc > 2) { if (str_to_uint(argv[2], 10, &core_id)) { filter_on_pmd = true; } @@ -1108,6 +1156,8 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], pmd_perf_stats_clear(&pmd->perf_stats); } else if (type == PMD_INFO_SHOW_STATS) { pmd_info_show_stats(&reply, pmd); + } else if (type == PMD_INFO_PERF_SHOW) { + pmd_info_show_perf(&reply, pmd, (struct pmd_perf_params *)aux); } } free(pmd_list); @@ -1117,6 +1167,48 @@ dpif_netdev_pmd_info(struct unixctl_conn *conn, int argc, const char *argv[], unixctl_command_reply(conn, ds_cstr(&reply)); ds_destroy(&reply); } + +static void +pmd_perf_show_cmd(struct unixctl_conn *conn, int argc, + const char *argv[], + void *aux OVS_UNUSED) +{ + struct pmd_perf_params par; + long int it_hist = 0, ms_hist = 0; + par.histograms = true; + + while (argc > 1) { + if (!strcmp(argv[1], "-nh")) { + par.histograms = false; + argc -= 1; + argv += 1; + } else if (!strcmp(argv[1], "-it") && argc > 2) { + it_hist = strtol(argv[2], NULL, 10); + if (it_hist < 0) { + it_hist = 0; + } else if (it_hist > HISTORY_LEN) { + it_hist = HISTORY_LEN; + } + argc -= 2; + argv += 2; + } 
else if (!strcmp(argv[1], "-ms") && argc > 2) {
+            ms_hist = strtol(argv[2], NULL, 10);
+            if (ms_hist < 0) {
+                ms_hist = 0;
+            } else if (ms_hist > HISTORY_LEN) {
+                ms_hist = HISTORY_LEN;
+            }
+            argc -= 2;
+            argv += 2;
+        } else {
+            break;
+        }
+    }
+    par.iter_hist_len = it_hist;
+    par.ms_hist_len = ms_hist;
+    par.command_type = PMD_INFO_PERF_SHOW;
+    dpif_netdev_pmd_info(conn, argc, argv, &par);
+}
 
 static int
 dpif_netdev_init(void)
@@ -1134,6 +1226,12 @@ dpif_netdev_init(void)
     unixctl_command_register("dpif-netdev/pmd-rxq-show",
                              "[-pmd core] [dp]",
                              0, 3, dpif_netdev_pmd_info,
                              (void *)&poll_aux);
+    unixctl_command_register("dpif-netdev/pmd-perf-show",
+                             "[-nh] [-it iter-history-len]"
+                             " [-ms ms-history-len]"
+                             " [-pmd core] [dp]",
+                             0, 8, pmd_perf_show_cmd,
+                             NULL);
     unixctl_command_register("dpif-netdev/pmd-rxq-rebalance",
                              "[dp]",
                              0, 1, dpif_netdev_pmd_rebalance,
                              NULL);
@@ -3020,6 +3118,18 @@ dpif_netdev_set_config(struct dpif *dpif, const struct smap *other_config)
         }
     }
 
+    bool perf_enabled = smap_get_bool(other_config, "pmd-perf-metrics", false);
+    bool cur_perf_enabled;
+    atomic_read_relaxed(&dp->pmd_perf_metrics, &cur_perf_enabled);
+    if (perf_enabled != cur_perf_enabled) {
+        atomic_store_relaxed(&dp->pmd_perf_metrics, perf_enabled);
+        if (perf_enabled) {
+            VLOG_INFO("PMD performance metrics collection enabled");
+        } else {
+            VLOG_INFO("PMD performance metrics collection disabled");
+        }
+    }
+
     return 0;
 }
 
@@ -3189,6 +3299,20 @@ dp_netdev_rxq_get_intrvl_cycles(struct dp_netdev_rxq *rx, unsigned idx)
     return processing_cycles;
 }
 
+static inline bool
+pmd_perf_metrics_enabled(const struct dp_netdev_pmd_thread *pmd)
+{
+    /* If stores and reads of 64-bit integers are not atomic, the
+     * full PMD performance metrics are not available as locked
+     * access to 64 bit integers would be prohibitively expensive. */
+    if (sizeof(uint64_t) > sizeof(void *)) {
+        return false;
+    }
+    bool pmd_perf_enabled;
+    atomic_read_relaxed(&pmd->dp->pmd_perf_metrics, &pmd_perf_enabled);
+    return pmd_perf_enabled;
+}
+
 static int
 dp_netdev_pmd_flush_output_on_port(struct dp_netdev_pmd_thread *pmd,
                                    struct tx_port *p)
@@ -3264,10 +3388,12 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
                            struct dp_netdev_rxq *rxq,
                            odp_port_t port_no)
 {
+    struct pmd_perf_stats *s = &pmd->perf_stats;
     struct dp_packet_batch batch;
     struct cycle_timer timer;
     int error;
-    int batch_cnt = 0, output_cnt = 0;
+    int batch_cnt = 0;
+    int rem_qlen = 0, *qlen_p = NULL;
     uint64_t cycles;
 
     /* Measure duration for polling and processing rx burst. */
@@ -3276,20 +3402,38 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd,
     pmd->ctx.last_rxq = rxq;
     dp_packet_batch_init(&batch);
 
-    error = netdev_rxq_recv(rxq->rx, &batch, NULL);
+    /* Fetch the rx queue length only for vhostuser ports. */
+    if (pmd_perf_metrics_enabled(pmd) && rxq->is_vhost) {
+        qlen_p = &rem_qlen;
+    }
+
+    error = netdev_rxq_recv(rxq->rx, &batch, qlen_p);
     if (!error) {
         /* At least one packet received. */
         *recirc_depth_get() = 0;
         pmd_thread_ctx_time_update(pmd);
         batch_cnt = batch.count;
+        if (pmd_perf_metrics_enabled(pmd)) {
+            /* Update batch histogram. */
+            s->current.batches++;
+            histogram_add_sample(&s->pkts_per_batch, batch_cnt);
+            /* Update the maximum vhost rx queue fill level. */
+            if (rxq->is_vhost && rem_qlen >= 0) {
+                uint32_t qfill = batch_cnt + rem_qlen;
+                if (qfill > s->current.max_vhost_qfill) {
+                    s->current.max_vhost_qfill = qfill;
+                }
+            }
+        }
+        /* Process packet batch. */
         dp_netdev_input(pmd, &batch, port_no);
 
         /* Assign processing cycles to rx queue.
*/ cycles = cycle_timer_stop(&pmd->perf_stats, &timer); dp_netdev_rxq_add_cycles(rxq, RXQ_CYCLES_PROC_CURR, cycles); - output_cnt = dp_netdev_pmd_flush_output_packets(pmd, false); + dp_netdev_pmd_flush_output_packets(pmd, false); } else { /* Discard cycles. */ cycle_timer_stop(&pmd->perf_stats, &timer); @@ -3303,7 +3446,7 @@ dp_netdev_process_rxq_port(struct dp_netdev_pmd_thread *pmd, pmd->ctx.last_rxq = NULL; - return batch_cnt + output_cnt; + return batch_cnt; } static struct tx_port * @@ -3359,6 +3502,7 @@ port_reconfigure(struct dp_netdev_port *port) } port->rxqs[i].port = port; + port->rxqs[i].is_vhost = !strncmp(port->type, "dpdkvhost", 9); err = netdev_rxq_open(netdev, &port->rxqs[i].rx, i); if (err) { @@ -4137,23 +4281,26 @@ reload: pmd->intrvl_tsc_prev = 0; atomic_store_relaxed(&pmd->intrvl_cycles, 0); cycles_counter_update(s); + /* Protect pmd stats from external clearing while polling. */ + ovs_mutex_lock(&pmd->perf_stats.stats_mutex); for (;;) { - uint64_t iter_packets = 0; + uint64_t rx_packets = 0, tx_packets = 0; pmd_perf_start_iteration(s); + for (i = 0; i < poll_cnt; i++) { process_packets = dp_netdev_process_rxq_port(pmd, poll_list[i].rxq, poll_list[i].port_no); - iter_packets += process_packets; + rx_packets += process_packets; } - if (!iter_packets) { + if (!rx_packets) { /* We didn't receive anything in the process loop. * Check if we need to send something. * There was no time updates on current iteration. */ pmd_thread_ctx_time_update(pmd); - iter_packets += dp_netdev_pmd_flush_output_packets(pmd, false); + tx_packets = dp_netdev_pmd_flush_output_packets(pmd, false); } if (lc++ > 1024) { @@ -4172,8 +4319,10 @@ reload: break; } } - pmd_perf_end_iteration(s, iter_packets); + pmd_perf_end_iteration(s, rx_packets, tx_packets, + pmd_perf_metrics_enabled(pmd)); } + ovs_mutex_unlock(&pmd->perf_stats.stats_mutex); poll_cnt = pmd_load_queues_and_ports(pmd, &poll_list); exiting = latch_is_set(&pmd->exit_latch); @@ -5068,6 +5217,7 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, struct match match; ovs_u128 ufid; int error; + uint64_t cycles = cycles_counter_update(&pmd->perf_stats); match.tun_md.valid = false; miniflow_expand(&key->mf, &match.flow); @@ -5121,6 +5271,14 @@ handle_packet_upcall(struct dp_netdev_pmd_thread *pmd, ovs_mutex_unlock(&pmd->flow_mutex); emc_probabilistic_insert(pmd, key, netdev_flow); } + if (pmd_perf_metrics_enabled(pmd)) { + /* Update upcall stats. */ + cycles = cycles_counter_update(&pmd->perf_stats) - cycles; + struct pmd_perf_stats *s = &pmd->perf_stats; + s->current.upcalls++; + s->current.upcall_cycles += cycles; + histogram_add_sample(&s->cycles_per_upcall, cycles); + } return error; } diff --git a/manpages.mk b/manpages.mk index d4bf0ec..aaf8bc2 100644 --- a/manpages.mk +++ b/manpages.mk @@ -250,6 +250,7 @@ vswitchd/ovs-vswitchd.8: \ lib/coverage-unixctl.man \ lib/daemon.man \ lib/dpctl.man \ + lib/dpif-netdev-unixctl.man \ lib/memory-unixctl.man \ lib/netdev-dpdk-unixctl.man \ lib/service.man \ @@ -266,6 +267,7 @@ lib/common.man: lib/coverage-unixctl.man: lib/daemon.man: lib/dpctl.man: +lib/dpif-netdev-unixctl.man: lib/memory-unixctl.man: lib/netdev-dpdk-unixctl.man: lib/service.man: diff --git a/vswitchd/ovs-vswitchd.8.in b/vswitchd/ovs-vswitchd.8.in index 80e5f53..8b4034d 100644 --- a/vswitchd/ovs-vswitchd.8.in +++ b/vswitchd/ovs-vswitchd.8.in @@ -256,32 +256,7 @@ type). .. .so lib/dpctl.man . 
-.SS "DPIF-NETDEV COMMANDS" -These commands are used to expose internal information (mostly statistics) -about the ``dpif-netdev'' userspace datapath. If there is only one datapath -(as is often the case, unless \fBdpctl/\fR commands are used), the \fIdp\fR -argument can be omitted. -.IP "\fBdpif-netdev/pmd-stats-show\fR [\fIdp\fR]" -Shows performance statistics for each pmd thread of the datapath \fIdp\fR. -The special thread ``main'' sums up the statistics of every non pmd thread. -The sum of ``emc hits'', ``masked hits'' and ``miss'' is the number of -packets received by the datapath. Cycles are counted using the TSC or similar -facilities (when available on the platform). To reset these counters use -\fBdpif-netdev/pmd-stats-clear\fR. The duration of one cycle depends on the -measuring infrastructure. ``idle cycles'' refers to cycles spent polling -devices but not receiving any packets. ``processing cycles'' refers to cycles -spent polling devices and successfully receiving packets, plus the cycles -spent processing said packets. -.IP "\fBdpif-netdev/pmd-stats-clear\fR [\fIdp\fR]" -Resets to zero the per pmd thread performance numbers shown by the -\fBdpif-netdev/pmd-stats-show\fR command. It will NOT reset datapath or -bridge statistics, only the values shown by the above command. -.IP "\fBdpif-netdev/pmd-rxq-show\fR [\fIdp\fR]" -For each pmd thread of the datapath \fIdp\fR shows list of queue-ids with -port names, which this thread polls. -.IP "\fBdpif-netdev/pmd-rxq-rebalance\fR [\fIdp\fR]" -Reassigns rxqs to pmds in the datapath \fIdp\fR based on their current usage. -. +.so lib/dpif-netdev-unixctl.man .so lib/netdev-dpdk-unixctl.man .so ofproto/ofproto-dpif-unixctl.man .so ofproto/ofproto-unixctl.man diff --git a/vswitchd/vswitch.xml b/vswitchd/vswitch.xml index f899a19..aac663f 100644 --- a/vswitchd/vswitch.xml +++ b/vswitchd/vswitch.xml @@ -375,6 +375,18 @@

+      <column name="other_config" key="pmd-perf-metrics"
+              type='{"type": "boolean"}'>
+        <p>
+          Enables recording of detailed PMD performance metrics for analysis
+          and trouble-shooting. This can have a performance impact in the
+          order of 1%.
+        </p>
+        <p>
+          Defaults to false but can be changed at any time.
+        </p>
+      </column>
+

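[Editor's illustration, not part of the patch: the metrics collection above rests on two small structures, a fixed-wall histogram filled by linear scan and a cyclic iteration history. The stand-alone C sketch below models both under stated assumptions: HISTORY_LEN = 1024 matches the 1024-sample histories documented in the man page, while NUM_BINS = 32, the geometric bin walls and the fake iteration numbers are illustrative only, not the exact walls that histogram_walls_set_log() computes in the patch.]

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    #define NUM_BINS 32
    #define HISTORY_LEN 1024

    struct histogram {
        uint32_t wall[NUM_BINS];    /* Upper bound of each bin. */
        uint64_t bin[NUM_BINS];     /* Samples counted per bin. */
    };

    struct iter_stats {
        uint64_t timestamp;         /* TSC at the start of the iteration. */
        uint64_t cycles;            /* Cycles spent in the iteration. */
        uint64_t pkts;              /* Packets received in the iteration. */
    };

    struct history {
        uint32_t idx;               /* Slot of the currently open sample. */
        struct iter_stats sample[HISTORY_LEN];
    };

    /* Same logic as histogram_add_sample() in the patch: count the sample
     * in the first bin whose wall (upper bound) it does not exceed. */
    static void
    histogram_add_sample(struct histogram *hist, uint32_t val)
    {
        for (int i = 0; i < NUM_BINS - 1; i++) {
            if (val <= hist->wall[i]) {
                hist->bin[i]++;
                return;
            }
        }
        hist->bin[NUM_BINS - 1]++;
    }

    /* Mirror of history_store() + history_next(): store the finished
     * sample, advance the ring index modulo HISTORY_LEN and zero the
     * next slot. */
    static void
    history_store(struct history *h, const struct iter_stats *is)
    {
        h->sample[h->idx] = *is;
        h->idx = (h->idx + 1) % HISTORY_LEN;
        memset(&h->sample[h->idx], 0, sizeof(h->sample[h->idx]));
    }

    int
    main(void)
    {
        static struct history iterations;
        static struct histogram cycles;

        /* Illustrative geometric walls: 1000, 2000, 4000, ... cycles. */
        uint32_t wall = 1000;
        for (int i = 0; i < NUM_BINS; i++) {
            cycles.wall[i] = wall;
            wall = wall < UINT32_MAX / 2 ? wall * 2 : UINT32_MAX;
        }

        /* Feed a few thousand fake iterations. */
        for (uint64_t tsc = 0; tsc < 4096; tsc++) {
            struct iter_stats it = {
                .timestamp = tsc,
                .cycles = 500 + (tsc % 7) * 900,
                .pkts = tsc % 33,
            };
            histogram_add_sample(&cycles, it.cycles);
            history_store(&iterations, &it);
        }

        printf("samples <= %u cycles: %llu\n",
               cycles.wall[0], (unsigned long long) cycles.bin[0]);
        return 0;
    }

The linear scan keeps the per-iteration cost at a handful of compares per recorded metric, which is presumably why the patch carries a "binary search?" TODO rather than a heavier data structure.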
From patchwork Thu Mar 15 16:03:13 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jan Scheurich
X-Patchwork-Id: 886319
X-Patchwork-Delegate: ian.stokes@intel.com
From: Jan Scheurich
Date: Thu, 15 Mar 2018 17:03:13 +0100
Message-ID: <1521129793-27851-4-git-send-email-jan.scheurich@ericsson.com>
In-Reply-To: <1521129793-27851-1-git-send-email-jan.scheurich@ericsson.com>
References: <1521129793-27851-1-git-send-email-jan.scheurich@ericsson.com>
Cc: i.maximets@samsung.com
Subject: [ovs-dev] [PATCH v9 3/3] dpif-netdev: Detection and logging of suspicious PMD iterations

This patch enhances dpif-netdev-perf to detect iterations with suspicious
statistics according to the following criteria:

- iteration lasts longer than US_THR microseconds (default 250). This can
  be used to capture events where a PMD is blocked or interrupted for such
  a period of time that there is a risk for dropped packets on any of its
  Rx queues.

- max vhost qlen exceeds a threshold Q_THR (default 128). This can be used
  to infer virtio queue overruns and dropped packets inside a VM, which are
  not visible in OVS otherwise.

Such suspicious iterations can be logged together with their iteration
statistics to be able to correlate them to packet drops or other events
outside OVS.

A new command is introduced to enable/disable logging at run-time and to
adjust the above thresholds for suspicious iterations:

ovs-appctl dpif-netdev/pmd-perf-log-set on | off
    [-b before] [-a after] [-e|-ne] [-us usec] [-q qlen]

Turn logging on or off at run-time (on|off).

-b before:  The number of iterations before the suspicious iteration to
            be logged (default 5).
-a after:   The number of iterations after the suspicious iteration to
            be logged (default 5).
-e:         Extend logging interval if another suspicious iteration is
            detected before logging occurs.
-ne:        Do not extend logging interval (default).
-q qlen:    Suspicious vhost queue fill level threshold. Increase this
            to 512 if QEMU supports a virtio queue length of 1024
            (default 128).
-us usec:   Change the duration threshold for a suspicious iteration
            (default 250 us).

Note: Logging of suspicious iterations itself consumes a considerable
number of processing cycles of a PMD, which may be visible in the
iteration history. In the worst case this can lead OVS to detect another
suspicious iteration caused by logging. If more than 100 iterations
around a suspicious iteration have been logged once, OVS falls back to
the safe default values (-b 5/-a 5/-ne) to avoid that logging itself
causes continuous further logging.

Signed-off-by: Jan Scheurich <jan.scheurich@ericsson.com>
---
 NEWS                        |   2 +
 lib/dpif-netdev-perf.c      | 201 ++++++++++++++++++++++++++++++++++++++++
 lib/dpif-netdev-perf.h      |  42 +++++++++
 lib/dpif-netdev-unixctl.man |  59 ++++++++++++
 lib/dpif-netdev.c           |   5 ++
 5 files changed, 309 insertions(+)

diff --git a/NEWS b/NEWS
index 8f66fd3..61148b1 100644
--- a/NEWS
+++ b/NEWS
@@ -76,6 +76,8 @@ v2.9.0 - 19 Feb 2018
      * Commands ovs-appctl dpif-netdev/pmd-*-show can now work on a single PMD
      * Detailed PMD performance metrics available with new command
          ovs-appctl dpif-netdev/pmd-perf-show
+     * Supervision of PMD performance metrics and logging of suspicious
+       iterations
    - vswitchd:
      * Datapath IDs may now be specified as 0x1 (etc.) instead of 16 digits.
      * Configuring a controller, or unconfiguring all controllers, now deletes
diff --git a/lib/dpif-netdev-perf.c b/lib/dpif-netdev-perf.c
index 43f537e..e6afd07 100644
--- a/lib/dpif-netdev-perf.c
+++ b/lib/dpif-netdev-perf.c
@@ -25,6 +25,24 @@
 
 VLOG_DEFINE_THIS_MODULE(pmd_perf);
 
+#define ITER_US_THRESHOLD 250   /* Warning threshold for iteration duration
+                                   in microseconds. */
+#define VHOST_QUEUE_FULL 128    /* Size of the virtio TX queue. */
+#define LOG_IT_BEFORE 5         /* Number of iterations to log before
+                                   suspicious iteration. */
+#define LOG_IT_AFTER 5          /* Number of iterations to log after
+                                   suspicious iteration. */
+
+bool log_enabled = false;
+bool log_extend = false;
+static uint32_t log_it_before = LOG_IT_BEFORE;
+static uint32_t log_it_after = LOG_IT_AFTER;
+static uint32_t log_us_thr = ITER_US_THRESHOLD;
+uint32_t log_q_thr = VHOST_QUEUE_FULL;
+uint64_t iter_cycle_threshold;
+
+static struct vlog_rate_limit latency_rl = VLOG_RATE_LIMIT_INIT(600, 600);
+
 #ifdef DPDK_NETDEV
 static uint64_t
 get_tsc_hz(void)
@@ -127,6 +145,10 @@ pmd_perf_stats_init(struct pmd_perf_stats *s)
     histogram_walls_set_log(&s->cycles_per_upcall, 1000, 1000000);
     histogram_walls_set_log(&s->max_vhost_qfill, 0, 512);
     s->start_ms = time_msec();
+    s->log_susp_it = UINT32_MAX;
+    s->log_begin_it = UINT32_MAX;
+    s->log_end_it = UINT32_MAX;
+    s->log_reason = NULL;
 }
 
 void
@@ -382,6 +404,10 @@ pmd_perf_stats_clear_lock(struct pmd_perf_stats *s)
     history_init(&s->milliseconds);
     s->start_ms = time_msec();
     s->milliseconds.sample[0].timestamp = s->start_ms;
+    s->log_susp_it = UINT32_MAX;
+    s->log_begin_it = UINT32_MAX;
+    s->log_end_it = UINT32_MAX;
+    s->log_reason = NULL;
     /* Clearing finished. */
     s->clear = false;
     ovs_mutex_unlock(&s->clear_mutex);
@@ -402,3 +428,178 @@ pmd_perf_stats_clear(struct pmd_perf_stats *s)
         s->clear = true;
     }
 }
+
+/* Delay logging of the suspicious iteration and the range of iterations
+ * around it until after the last iteration in the range to be logged.
+ * This avoids any distortion of the measurements through the cost of
+ * logging itself. */
+
+void
+pmd_perf_set_log_susp_iteration(struct pmd_perf_stats *s,
+                                char *reason)
+{
+    if (s->log_susp_it == UINT32_MAX) {
+        /* No logging scheduled yet. */
+        s->log_susp_it = s->iterations.idx;
+        s->log_reason = reason;
+        s->log_begin_it = history_sub(s->iterations.idx, log_it_before);
+        s->log_end_it = history_add(s->iterations.idx, log_it_after + 1);
+    } else if (log_extend) {
+        /* Logging was initiated earlier. We log the previous suspicious
+         * iteration now and extend the logging interval, if possible. */
+        struct iter_stats *susp = &s->iterations.sample[s->log_susp_it];
+        uint32_t new_end_it, old_range, new_range;
+
+        VLOG_WARN_RL(&latency_rl,
+                     "Suspicious iteration (%s): tsc=%"PRIu64
+                     " duration=%"PRIu64" us\n",
+                     s->log_reason,
+                     susp->timestamp,
+                     (1000000L * susp->cycles) / get_tsc_hz());
+
+        new_end_it = history_add(s->iterations.idx, log_it_after + 1);
+        new_range = history_sub(new_end_it, s->log_begin_it);
+        old_range = history_sub(s->log_end_it, s->log_begin_it);
+        if (new_range < old_range) {
+            /* Extended range exceeds history length. */
+            new_end_it = s->log_begin_it;
+        }
+        s->log_susp_it = s->iterations.idx;
+        s->log_reason = reason;
+        s->log_end_it = new_end_it;
+    }
+}
+
+void
+pmd_perf_log_susp_iteration_neighborhood(struct pmd_perf_stats *s)
+{
+    ovs_assert(s->log_reason != NULL);
+    ovs_assert(s->log_susp_it != UINT32_MAX);
+
+    struct ds log = DS_EMPTY_INITIALIZER;
+    struct iter_stats *susp = &s->iterations.sample[s->log_susp_it];
+    uint32_t range = history_sub(s->log_end_it, s->log_begin_it);
+
+    VLOG_WARN_RL(&latency_rl,
+                 "Suspicious iteration (%s): tsc=%"PRIu64
+                 " duration=%"PRIu64" us\n",
+                 s->log_reason,
+                 susp->timestamp,
+                 (1000000L * susp->cycles) / get_tsc_hz());
+
+    pmd_perf_format_iteration_history(&log, s, range);
+    VLOG_WARN_RL(&latency_rl,
+                 "Neighborhood of suspicious iteration:\n"
+                 "%s", ds_cstr(&log));
+    ds_destroy(&log);
+    s->log_susp_it = s->log_end_it = s->log_begin_it = UINT32_MAX;
+    s->log_reason = NULL;
+
+    if (range > 100) {
+        /* Reset to safe default values to avoid disturbance. */
+        log_it_before = LOG_IT_BEFORE;
+        log_it_after = LOG_IT_AFTER;
+        log_extend = false;
+    }
+}
+
+void
+pmd_perf_log_set_cmd(struct unixctl_conn *conn,
+                     int argc, const char *argv[],
+                     void *aux OVS_UNUSED)
+{
+    unsigned int it_before, it_after, us_thr, q_thr;
+    bool on, extend;
+    bool usage = false;
+
+    on = log_enabled;
+    extend = log_extend;
+    it_before = log_it_before;
+    it_after = log_it_after;
+    q_thr = log_q_thr;
+    us_thr = log_us_thr;
+
+    while (argc > 1) {
+        if (!strcmp(argv[1], "on")) {
+            on = true;
+            argc--;
+            argv++;
+        } else if (!strcmp(argv[1], "off")) {
+            on = false;
+            argc--;
+            argv++;
+        } else if (!strcmp(argv[1], "-e")) {
+            extend = true;
+            argc--;
+            argv++;
+        } else if (!strcmp(argv[1], "-ne")) {
+            extend = false;
+            argc--;
+            argv++;
+        } else if (!strcmp(argv[1], "-a") && argc > 2) {
+            if (str_to_uint(argv[2], 10, &it_after)) {
+                if (it_after > HISTORY_LEN - 2) {
+                    it_after = HISTORY_LEN - 2;
+                }
+            } else {
+                usage = true;
+                break;
+            }
+            argc -= 2;
+            argv += 2;
+        } else if (!strcmp(argv[1], "-b") && argc > 2) {
+            if (str_to_uint(argv[2], 10, &it_before)) {
+                if (it_before > HISTORY_LEN - 2) {
+                    it_before = HISTORY_LEN - 2;
+                }
+            } else {
+                usage = true;
+                break;
+            }
+            argc -= 2;
+            argv += 2;
+        } else if (!strcmp(argv[1], "-q") && argc > 2) {
+            if (!str_to_uint(argv[2], 10, &q_thr)) {
+                usage = true;
+                break;
+            }
+            argc -= 2;
+            argv += 2;
+        } else if (!strcmp(argv[1], "-us") && argc > 2) {
+            if (!str_to_uint(argv[2], 10, &us_thr)) {
+                usage = true;
+                break;
+            }
+            argc -= 2;
+            argv += 2;
+        } else {
+            usage = true;
+            break;
+        }
+    }
+    if (it_before + it_after > HISTORY_LEN - 2) {
+        it_after = HISTORY_LEN - 2 - it_before;
+    }
+
+    if (usage) {
+        unixctl_command_reply_error(conn,
+                "Usage: ovs-appctl dpif-netdev/pmd-perf-log-set "
+                "[on|off] [-b before] [-a after] [-e|-ne] "
+                "[-us usec] [-q qlen]");
+        return;
+    }
+
+    VLOG_INFO("pmd-perf-log-set: %s, before=%u, after=%u, extend=%s, "
+              "us_thr=%u, q_thr=%u\n",
+              on ? "on" : "off", it_before, it_after,
+              extend ? "true" : "false", us_thr, q_thr);
+    log_enabled = on;
+    log_extend = extend;
+    log_it_before = it_before;
+    log_it_after = it_after;
+    log_q_thr = q_thr;
+    log_us_thr = us_thr;
+    iter_cycle_threshold = (log_us_thr * get_tsc_hz()) / 1000000L;
+
+    unixctl_command_reply(conn, "");
+}
diff --git a/lib/dpif-netdev-perf.h b/lib/dpif-netdev-perf.h
index b91cb30..334908f 100644
--- a/lib/dpif-netdev-perf.h
+++ b/lib/dpif-netdev-perf.h
@@ -171,6 +171,14 @@ struct pmd_perf_stats {
     struct history iterations;
     /* Millisecond history buffer. */
     struct history milliseconds;
+    /* Suspicious iteration log. */
+    uint32_t log_susp_it;
+    /* Start of iteration range to log. */
+    uint32_t log_begin_it;
+    /* End of iteration range to log. */
+    uint32_t log_end_it;
+    /* Reason for logging suspicious iteration. */
+    char *log_reason;
 };
 
 /* Support for accurate timing of PMD execution on TSC clock cycle level.
@@ -341,6 +349,15 @@ history_store(struct history *h, struct iter_stats *is)
     return history_next(h);
 }
 
+/* Data and functions related to logging of suspicious iterations. */
+
+extern bool log_enabled;
+extern uint32_t log_q_thr;
+extern uint64_t iter_cycle_threshold;
+
+void pmd_perf_set_log_susp_iteration(struct pmd_perf_stats *s, char *reason);
+void pmd_perf_log_susp_iteration_neighborhood(struct pmd_perf_stats *s);
+
 /* Functions recording PMD metrics per iteration. */
 
 static inline void
@@ -370,6 +387,7 @@ pmd_perf_end_iteration(struct pmd_perf_stats *s, int rx_packets,
     uint64_t now_tsc = cycles_counter_update(s);
     struct iter_stats *cum_ms;
     uint64_t cycles, cycles_per_pkt = 0;
+    char *reason = NULL;
 
     cycles = now_tsc - s->current.timestamp;
     s->current.cycles = cycles;
@@ -421,6 +439,27 @@ pmd_perf_end_iteration(struct pmd_perf_stats *s, int rx_packets,
     /* Store in iteration history. This advances the iteration idx and
      * clears the next slot in the iteration history. */
     history_store(&s->iterations, &s->current);
+
+    if (log_enabled) {
+        /* Log suspicious iterations. */
+        if (cycles > iter_cycle_threshold) {
+            reason = "Excessive total cycles";
+        } else if (s->current.max_vhost_qfill >= log_q_thr) {
+            reason = "Vhost RX queue full";
+        }
+        if (OVS_UNLIKELY(reason)) {
+            pmd_perf_set_log_susp_iteration(s, reason);
+            cycles_counter_update(s);
+        }
+
+        /* Log iteration interval around suspicious iteration when reaching
+         * the end of the range to be logged. */
+        if (OVS_UNLIKELY(s->log_end_it == s->iterations.idx)) {
+            pmd_perf_log_susp_iteration_neighborhood(s);
+            cycles_counter_update(s);
+        }
+    }
+
     if (now_tsc > s->next_check_tsc) {
         /* Check if ms is completed and store in milliseconds history. */
         uint64_t now = time_msec();
@@ -458,6 +497,9 @@ void pmd_perf_format_iteration_history(struct ds *str,
                                        int n_iter);
 void pmd_perf_format_ms_history(struct ds *str, struct pmd_perf_stats *s,
                                 int n_ms);
+void pmd_perf_log_set_cmd(struct unixctl_conn *conn,
+                          int argc, const char *argv[],
+                          void *aux OVS_UNUSED);
 
 #ifdef __cplusplus
 }
diff --git a/lib/dpif-netdev-unixctl.man b/lib/dpif-netdev-unixctl.man
index 76c3e4e..ab8619e 100644
--- a/lib/dpif-netdev-unixctl.man
+++ b/lib/dpif-netdev-unixctl.man
@@ -150,6 +150,65 @@ reported by the \fBdpif-netdev/pmd-stats-show\fR command.
 
 To reset the counters and start a new measurement use
 \fBdpif-netdev/pmd-stats-clear\fR.
 .
+.IP "\fBdpif-netdev/pmd-perf-log-set\fR \fBon\fR|\fBoff\fR \
+[\fB-b\fR \fIbefore\fR] [\fB-a\fR \fIafter\fR] [\fB-e\fR|\fB-ne\fR] \
+[\fB-us\fR \fIusec\fR] [\fB-q\fR \fIqlen\fR]"
+.
+The userspace "netdev" datapath is able to supervise the PMD performance
+metrics and detect iterations with suspicious statistics according to the
+following criteria:
+.RS
+.IP \(em
+The iteration lasts longer than \fIusec\fR microseconds (default 250).
+This can be used to capture events where a PMD is blocked or interrupted for
+such a period of time that there is a risk for dropped packets on any of its
+Rx queues.
+.IP \(em
+The max vhost qlen exceeds a threshold \fIqlen\fR (default 128). This can be
+used to infer virtio queue overruns and dropped packets inside a VM, which
+are not visible in OVS otherwise.
+.RE
+.IP
+Such suspicious iterations can be logged together with their iteration
+statistics in the \fBovs-vswitchd.log\fR to be able to correlate them to
+packet drops or other events outside OVS.
+
+The above command enables (\fBon\fR) or disables (\fBoff\fR) supervision and
+logging at run-time and can be used to adjust the above thresholds for
+detecting suspicious iterations. By default supervision and logging are
+disabled.
+
+The command options are:
+.RS
+.IP "\fB-b\fR \fIbefore\fR"
+The number of iterations before the suspicious iteration to be logged
+(default 5).
+.IP "\fB-a\fR \fIafter\fR"
+The number of iterations after the suspicious iteration to be logged
+(default 5).
+.IP "\fB-e\fR"
+Extend logging interval if another suspicious iteration is detected
+before logging occurs.
+.IP "\fB-ne\fR"
+Do not extend logging interval if another suspicious iteration is detected
+before logging occurs (default).
+.IP "\fB-q\fR \fIqlen\fR"
+Suspicious vhost queue fill level threshold. Increase this to 512 if QEMU
+supports a virtio queue length of 1024 (default 128).
+.IP "\fB-us\fR \fIusec\fR"
+Change the duration threshold for a suspicious iteration (default 250 us).
+.RE
+
+Note: Logging of suspicious iterations itself consumes a considerable
+number of processing cycles of a PMD, which may be visible in the
+iteration history. In the worst case this can lead OVS to detect another
+suspicious iteration caused by logging.
+
+If more than 100 iterations around a suspicious iteration have been logged
+once, OVS falls back to the safe default values (-b 5 -a 5 -ne) to prevent
+logging itself from continuously triggering logging of further suspicious
+iterations.
+.
 .IP "\fBdpif-netdev/pmd-rxq-show\fR [\fB-pmd\fR \fIcore\fR] [\fIdp\fR]"
 For one or all pmd threads of the datapath \fIdp\fR show the list of queue-ids
 with port names, which this thread polls.
diff --git a/lib/dpif-netdev.c b/lib/dpif-netdev.c
index c4ac176..f059940 100644
--- a/lib/dpif-netdev.c
+++ b/lib/dpif-netdev.c
@@ -1235,6 +1235,11 @@ dpif_netdev_init(void)
     unixctl_command_register("dpif-netdev/pmd-rxq-rebalance",
                              "[dp]",
                              0, 1, dpif_netdev_pmd_rebalance,
                              NULL);
+    unixctl_command_register("dpif-netdev/pmd-perf-log-set",
+                             "on|off [-b before] [-a after] [-e|-ne] "
+                             "[-us usec] [-q qlen]",
+                             0, 10, pmd_perf_log_set_cmd,
+                             NULL);
     return 0;
 }
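
[Editor's illustration, not part of the patch: once the microsecond threshold is converted to TSC cycles, the per-iteration detection added to pmd_perf_end_iteration() reduces to a pure predicate. The sketch below models it under stated assumptions: a tsc_hz parameter stands in for get_tsc_hz(), the thresholds are the pmd-perf-log-set defaults (250 us, qlen 128), and the helper names are hypothetical.]

    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Same conversion as pmd_perf_log_set_cmd():
     * cycles = usec * (cycles per second) / (usec per second). */
    static uint64_t
    iter_cycle_threshold(uint64_t tsc_hz, uint32_t us_thr)
    {
        return ((uint64_t) us_thr * tsc_hz) / 1000000L;
    }

    /* Mirror of the two checks in pmd_perf_end_iteration(): returns the
     * reason string for a suspicious iteration, or NULL if it looks
     * normal. */
    static const char *
    suspicious_iteration(uint64_t cycles, uint32_t max_vhost_qfill,
                         uint64_t cycle_thr, uint32_t q_thr)
    {
        if (cycles > cycle_thr) {
            return "Excessive total cycles";
        } else if (max_vhost_qfill >= q_thr) {
            return "Vhost RX queue full";
        }
        return NULL;
    }

    int
    main(void)
    {
        /* Assume a 2.4 GHz TSC, as in the sample output of patch 2/3. */
        uint64_t thr = iter_cycle_threshold(2400000000ULL, 250);
        const char *why = suspicious_iteration(700000, 10, thr, 128);

        /* 700000 cycles at 2.4 GHz is ~292 us > 250 us -> suspicious. */
        printf("threshold=%llu cycles, verdict=%s\n",
               (unsigned long long) thr, why ? why : "normal");
        return 0;
    }

Keeping this check cheap matters because it runs on every PMD loop iteration; all expensive work (formatting the iteration neighborhood, the VLOG output) is deferred until the logging range completes, exactly as the delayed-logging comment in the patch explains.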