From patchwork Fri Oct 14 08:58:22 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Koehrer Mathias (ETAS/ESW5)" X-Patchwork-Id: 682171 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from silver.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3swM2l0s7lz9ryQ for ; Fri, 14 Oct 2016 19:58:38 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by silver.osuosl.org (Postfix) with ESMTP id 4726433AE9; Fri, 14 Oct 2016 08:58:37 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from silver.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id XafnKiZG0pbF; Fri, 14 Oct 2016 08:58:30 +0000 (UTC) Received: from ash.osuosl.org (ash.osuosl.org [140.211.166.34]) by silver.osuosl.org (Postfix) with ESMTP id 39A2733AB9; Fri, 14 Oct 2016 08:58:30 +0000 (UTC) X-Original-To: intel-wired-lan@lists.osuosl.org Delivered-To: intel-wired-lan@lists.osuosl.org Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by ash.osuosl.org (Postfix) with ESMTP id 987851CE508 for ; Fri, 14 Oct 2016 08:58:29 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id 8FDA7949BF for ; Fri, 14 Oct 2016 08:58:29 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id EPi7jtvKRJlU for ; Fri, 14 Oct 2016 08:58:26 +0000 (UTC) X-Greylist: from auto-whitelisted by SQLgrey-1.7.6 Received: from smtp6-v.fe.bosch.de (smtp6-v.fe.bosch.de [139.15.237.11]) by hemlock.osuosl.org (Postfix) with ESMTPS id 4CC4894B50 for ; Fri, 14 Oct 2016 08:58:26 +0000 (UTC) Received: from vsmta12.fe.internet.bosch.com (unknown [10.4.98.52]) by imta23.fe.bosch.de (Postfix) with ESMTP id 589AC15801B4; Fri, 14 Oct 2016 10:58:23 +0200 (CEST) Received: from SI-MBX1012.de.bosch.com (vsgw22.fe.internet.bosch.com [10.4.98.11]) by vsmta12.fe.internet.bosch.com (Postfix) with ESMTP id 247001B80903; Fri, 14 Oct 2016 10:58:23 +0200 (CEST) Received: from FE-MBX1012.de.bosch.com (10.3.230.70) by SI-MBX1012.de.bosch.com (10.3.230.45) with Microsoft SMTP Server (TLS) id 15.0.1178.4; Fri, 14 Oct 2016 10:58:22 +0200 Received: from FE-MBX1012.de.bosch.com ([fe80::9db6:aaff:1980:b83a]) by FE-MBX1012.de.bosch.com ([fe80::9db6:aaff:1980:b83a%16]) with mapi id 15.00.1178.000; Fri, 14 Oct 2016 10:58:22 +0200 From: "Koehrer Mathias (ETAS/ESW5)" To: Julia Cartwright Thread-Topic: Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest Thread-Index: AdIUzS/19ba6z4msSlWtJALmxToZ5gABboeAACOjRvAACqwnIP//8zsA///T88CAAFArAP/7akzAgAzGM4D/9s4gcAJTnT2A//8hO/D//crngP/6fRQQ//RfcAD/56V84P/J/q8A/5AQCXD/H8k5kP4/V+GA/H2Dv2A= Date: Fri, 14 Oct 2016 08:58:22 +0000 Message-ID: References: <487032ca81f84e70bdacc39a024eff5e@FE-MBX1012.de.bosch.com> <20161004193445.GF10625@jcartwri.amer.corp.natinst.com> <584755c2766e4b94a604ece16760fe14@FE-MBX1012.de.bosch.com> <20161005155959.GH10625@jcartwri.amer.corp.natinst.com> <13c3cd3ffee4490fb22b8de383e51361@FE-MBX1012.de.bosch.com> <29250f87b1d84aacb8aa312935582291@FE-MBX1012.de.bosch.com> <20161010193958.GE22235@jcartwri.amer.corp.natinst.com> <20161013161839.GV10625@jcartwri.amer.corp.natinst.com> In-Reply-To: <20161013161839.GV10625@jcartwri.amer.corp.natinst.com> Accept-Language: de-DE, en-US Content-Language: de-DE X-MS-Has-Attach: yes X-MS-TNEF-Correlator: x-ms-exchange-transport-fromentityheader: Hosted x-originating-ip: [10.4.162.35] MIME-Version: 1.0 X-TM-AS-MML: disable X-TM-AS-Product-Ver: IMSS-7.1.0.1679-8.0.0.1202-22634.006 X-TMASE-MatchedRID: H0/uSqZo4D7o2d3orePV3fHkpkyUphL90w14HFJQjaNh2fnHe1cil7W2 M8jUv+p7VDW+ChjptrBw44GvJnZxOQ382XfhF2FqhsE+u3nnCfCvC9kAVDo5w9nu97SXKBPYqM8 PYJGuHYpcE6/Qtv696D4bjCOOTSJ9epAgRaj4rltjoaO27r+3fYsHixNGk6cJdd4kFevsINTfyy MmbNfTqt2aC4Uz4N5w/6HyGpOkSxqiwkztVCsqb85sKZgBbe5FQPCWRE0Lo8K/7bplhbPCQhxmU t9aQ9Ctc0RMgOcSS32aDNrwXtJOnNCjuUhsbmmkrDFtme53Kvs7r2Gtb9iBYdHOIxsF5Mj55pf3 ZiBJgsF/gcobgztI4dmaA1809m5OAnUHme6hYhmPVEZA4HZW+dFeh0vTZ8Nya0TOsL14A2lxhLk PnTAGm5iXUTPYiDnX2wzejUhq4Mcbf3ZICpMg5XypjffgdgMPVBDQSDMig9Ekt9BigJAcVngQ2d iIs7FhepYdDkfIivq3vCJoRErf0UaSIe7bsmlt7spMO3HwKCA6En2bnefhoLUTaB9IDCvU/rZzL 50JflMZ/bd2pkmMTsazEk/dlYlg0T6H5qdr9MKJDLgwb/1K2cidYBYDjITpAQd9i8mgm21c3fPa ZDeJ4iAHAopEd76vgIB1YZomNP2RJd+ztH/6mh6/C9sf7783UohlblTACvZ8w43qZLwElw== Cc: "linux-rt-users@vger.kernel.org" , Sebastian Andrzej Siewior , "netdev@vger.kernel.org" , "intel-wired-lan@lists.osuosl.org" , Greg Subject: Re: [Intel-wired-lan] Kernel 4.6.7-rt13: Intel Ethernet driver igb causes huge latencies in cyclictest X-BeenThere: intel-wired-lan@lists.osuosl.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: Intel Wired Ethernet Linux Kernel Driver Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: intel-wired-lan-bounces@lists.osuosl.org Sender: "Intel-wired-lan" Hi Julia, > Have you tested on a vanilla (non-RT) kernel? I doubt there is anything RT specific > about what you are seeing, but it might be nice to get confirmation. Also, bisection > would probably be easier if you confirm on a vanilla kernel. > > I find it unlikely that it's a kernel config option that changed which regressed you, but > instead was a code change to a driver. Which driver is now the question, and the > surface area is still big (processor mapping attributes for this region, PCI root > complex configuration, PCI brige configuration, igb driver itself, etc.). > > Big enough that I'd recommend a bisection. It looks like a bisection between 3.18 > and 4.8 would take you about 18 tries to narrow down, assuming all goes well. > I have now repeated my tests using the vanilla kernel. There I got the very same issue. Using kernel 4.0 is fine, however starting with kernel 4.1, the issue appears. Here is my exact (reproducible) test description: I applied the following patch to the kernel to get the igb trace. This patch instruments the igb_rd32() function to measure the call to readl() which is used to access registers of the igb NIC. ++++++++++++++++++ BEGIN PATCH ++++++++++++++++++++++++++++++++++++ ++++++++++++++++++ END PATCH ++++++++++++++++++++++++++++++++++++ I build the kernel with this patch applied, rebooted the PC to run this kernel and used the following script for my test. ++++++++++++++++++ BEGIN SCRIPT +++++++++++++++++++++++++++++++++ #!/bin/bash for f in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor ; do if [ -w $f ]; then echo "performance" > $f fi done if true; then rmmod igb modprobe igb ethtool -L eth2 combined 1 ifconfig eth2 up 192.168.100.111 fi ifconfig mount /sys/kernel/debug ( cd /sys/kernel/debug/tracing echo 0 > tracing_on echo 0 > events/enable echo 1 > events/igb/enable echo "print-parent" > trace_options echo "latency-format" > trace_options echo 1 > tracing_on sleep 4 cat trace ) ++++++++++++++++++ END SCRIPT +++++++++++++++++++++++++++++++++ The results of this for kernel 4.0: [...] kworker/-1239 3...1 49699046us : igb: val: 801 kworker/-1239 3...1 49699047us : igb: val: 802 kworker/-1239 3...1 49699047us : igb: val: 801 kworker/-1239 3...1 49699048us+: igb: val: 802 kworker/-1239 3...1 49699099us : igb: val: 801 kworker/-1239 3...1 49699100us : igb: val: 802 kworker/-1239 3...1 49699100us : igb: val: 801 kworker/-1239 3...1 49699102us : igb: val: 802 kworker/-1239 3...1 49699102us : igb: val: 801 kworker/-1239 3...1 49699103us : igb: val: 802 kworker/-1239 3...1 49699103us : igb: val: 801 kworker/-1239 3...1 49699104us : igb: val: 802 kworker/-1239 3...1 49699104us : igb: val: 801 kworker/-1239 3...1 49699105us : igb: val: 802 kworker/-1239 3...1 49699105us : igb: val: 801 kworker/-1239 3...1 49699107us : igb: val: 802 kworker/-1239 3...1 49699107us : igb: val: 801 kworker/-1239 3...1 49699108us : igb: val: 802 kworker/-1239 3...1 49699108us : igb: val: 801 kworker/-1239 3...1 49699109us : igb: val: 802 kworker/-1239 3...1 49699109us : igb: val: 801 kworker/-1239 3...1 49699110us : igb: val: 802 kworker/-1239 3...1 49699110us : igb: val: 801 kworker/-1239 3...1 49699111us : igb: val: 802 kworker/-1239 3...1 49699111us : igb: val: 801 kworker/-1239 3...1 49699113us+: igb: val: 802 kworker/-1239 3...1 49699163us : igb: val: 801 kworker/-1239 3...1 49699164us : igb: val: 802 kworker/-1239 3...1 49699164us : igb: val: 801 kworker/-1239 3...1 49699166us : igb: val: 802 kworker/-1239 3...1 49699166us : igb: val: 801 kworker/-1239 3...1 49699167us : igb: val: 802 kworker/-1239 3...1 49699167us : igb: val: 801 kworker/-1239 3...1 49699168us : igb: val: 802 kworker/-1239 3...1 49699168us : igb: val: 801 kworker/-1239 3...1 49699169us : igb: val: 802 kworker/-1239 3...1 49699169us : igb: val: 801 kworker/-1239 3...1 49699170us : igb: val: 802 kworker/-1239 3...1 49699171us : igb: val: 801 kworker/-1239 3...1 49699173us : igb: val: 802 kworker/-1239 3...1 49699173us : igb: val: 801 kworker/-1239 3...1 49699174us : igb: val: 802 The results of this for kernel 4.1: [...] kworker/-1333 2...1 75697302us+: igb: val: 801 kworker/-1333 2...1 75697323us+: igb: val: 802 kworker/-1333 2...1 75697373us+: igb: val: 801 kworker/-1333 2...1 75697395us : igb: val: 802 kworker/-1333 2...1 75697395us+: igb: val: 801 kworker/-1333 2...1 75697415us : igb: val: 802 kworker/-1333 2...1 75697416us+: igb: val: 801 kworker/-1333 2...1 75697436us : igb: val: 802 kworker/-1333 2...1 75697436us+: igb: val: 801 kworker/-1333 2...1 75697457us : igb: val: 802 kworker/-1333 2...1 75697457us+: igb: val: 801 kworker/-1333 2...1 75697478us : igb: val: 802 kworker/-1333 2...1 75697478us+: igb: val: 801 kworker/-1333 2...1 75697499us : igb: val: 802 kworker/-1333 2...1 75697499us+: igb: val: 801 kworker/-1333 2...1 75697520us : igb: val: 802 kworker/-1333 2...1 75697520us+: igb: val: 801 kworker/-1333 2...1 75697541us : igb: val: 802 kworker/-1333 2...1 75697541us+: igb: val: 801 kworker/-1333 2...1 75697562us : igb: val: 802 kworker/-1333 2...1 75697562us+: igb: val: 801 kworker/-1333 2...1 75697583us : igb: val: 802 kworker/-1333 2...1 75697583us+: igb: val: 801 kworker/-1333 2...1 75697604us+: igb: val: 802 kworker/-1333 2...1 75697654us+: igb: val: 801 kworker/-1333 2...1 75697675us : igb: val: 802 kworker/-1333 2...1 75697675us+: igb: val: 801 kworker/-1333 2...1 75697696us : igb: val: 802 kworker/-1333 2...1 75697696us+: igb: val: 801 kworker/-1333 2...1 75697717us : igb: val: 802 kworker/-1333 2...1 75697717us+: igb: val: 801 kworker/-1333 2...1 75697738us : igb: val: 802 kworker/-1333 2...1 75697738us+: igb: val: 801 kworker/-1333 2...1 75697759us : igb: val: 802 kworker/-1333 2...1 75697759us+: igb: val: 801 kworker/-1333 2...1 75697780us : igb: val: 802 kworker/-1333 2...1 75697781us+: igb: val: 801 kworker/-1333 2...1 75697801us : igb: val: 802 kworker/-1333 2...1 75697801us+: igb: val: 801 kworker/-1333 2...1 75697822us : igb: val: 802 It can be cleary seen that with the kernel 4.1 the time for the execution of the "readl" function is much slower! It tasks always about 21us! In comparison to that the kernel 4.0 is fast. Here the time for "readl" is about 1-2 us. All measurements have been done on the very same PC. The PC is an HP EliteDesk 800 G1, a dual port Intel i350-T2 server adapter has been plugged in. CPU: Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz x86_64 bit mode. Please find attached the kernel configuration for the 4.1 kernel. I had a look at the generated .o file of igb_main.o. However, in both cases - kernel 4.0 and kernel 4.1 - the code for the igb_rd32 function looks the same. Especially the call to the readl() function is the very same. This means, that I think that some other stuff in kernel 4.1 has changed, which has impact on the igb accesses. Any idea what component could cause this kind of issue? Thanks for any feedback! Regards Mathias Index: linux-4.8/drivers/net/ethernet/intel/igb/trace.h =================================================================== --- /dev/null +++ linux-4.8/drivers/net/ethernet/intel/igb/trace.h @@ -0,0 +1,34 @@ + +#if !defined(_TRACE_IGB_H_) || defined(TRACE_HEADER_MULTI_READ) +#define _TRACE_IGB_H_ + +#include +#undef TRACE_SYSTEM +#define TRACE_SYSTEM igb + + +#define _TRACE_H_ + + +TRACE_EVENT(igb, + TP_PROTO(u32 val), + TP_ARGS(val), + TP_STRUCT__entry( + __field(u32, val) + ), + TP_fast_assign( + __entry->val = val; + ), + TP_printk("val: %u", + __entry->val) +); + + +#endif + +#undef TRACE_INCLUDE_PATH +#define TRACE_INCLUDE_PATH drivers/net/ethernet/intel/igb +#undef TRACE_INCLUDE_FILE +#define TRACE_INCLUDE_FILE trace + +#include Index: linux-4.8/drivers/net/ethernet/intel/igb/Makefile =================================================================== --- linux-4.8.orig/drivers/net/ethernet/intel/igb/Makefile +++ linux-4.8/drivers/net/ethernet/intel/igb/Makefile @@ -28,6 +28,7 @@ # # Makefile for the Intel(R) 82575 PCI-Express ethernet driver # +ccflags-y += -I. obj-$(CONFIG_IGB) += igb.o Index: linux-4.8/drivers/net/ethernet/intel/igb/igb_main.c =================================================================== --- linux-4.8.orig/drivers/net/ethernet/intel/igb/igb_main.c +++ linux-4.8/drivers/net/ethernet/intel/igb/igb_main.c @@ -57,6 +57,9 @@ #include #include "igb.h" +#define CREATE_TRACE_POINTS +#include "trace.h" + #define MAJ 5 #define MIN 3 #define BUILD 0 @@ -753,7 +756,9 @@ u32 igb_rd32(struct e1000_hw *hw, u32 re if (E1000_REMOVED(hw_addr)) return ~value; + trace_igb(801); value = readl(&hw_addr[reg]); + trace_igb(802); /* reads should not return all F's */ if (!(~value) && (!reg || !(~readl(hw_addr)))) {