From patchwork Sat Aug 26 14:47:50 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kristian Evensen X-Patchwork-Id: 806124 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=lists.infradead.org (client-ip=65.50.211.133; helo=bombadil.infradead.org; envelope-from=lede-dev-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=) Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=lists.infradead.org header.i=@lists.infradead.org header.b="J98mCa+w"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="c3oxMjYx"; dkim-atps=neutral Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3xfgrp50dYz9t50 for ; Sun, 27 Aug 2017 00:48:40 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:Subject: Content-Type:To:Message-ID:Date:From:References:In-Reply-To:MIME-Version: Reply-To:Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date :Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=VMA0DEOQMycZF5oZqze/4YKNBTsGhG3gYG5PEgLTEts=; b=J98mCa+wCW8ebFKTJa6MP9vQh P0JZZWaRVzsvtyQIEx+Z3WFlB95yHlCM7TOTWllu52Cl7CjeWLlXJOeJvcxqeaOpFy7Uqcg9J2DQk B4CeADDIOTnA/HSQbJtW2l60Hu2+OD5X1GYBSiTBessBfRgdLQ0tWwd82dfiCVUMsMNBhiUws+Dgd VJDI79ztBr3fLq4k6VR8ByrzNldkWmTKw8hfscriuUqZnH+cpqCWOCVIaGcH62FpD50FA7egjlvWy my2KNXcDr41QfxxfOX7Z4ZvWOXhmHBG91gSL9ks6HGCWTYfyqyxha/+k0MB9hUNFh8Yo1oqf8OL9Z oXigbl2DA==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1dlcNm-00060v-Nw; Sat, 26 Aug 2017 14:48:18 +0000 Received: from mail-wr0-x229.google.com ([2a00:1450:400c:c0c::229]) by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux)) id 1dlcNi-0005yi-Qu for lede-dev@lists.infradead.org; Sat, 26 Aug 2017 14:48:17 +0000 Received: by mail-wr0-x229.google.com with SMTP id p8so6816554wrf.2 for ; Sat, 26 Aug 2017 07:47:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=BilEnin2LOuzybJAWZHLtm0Bmb+vDjGH+hOxbQjVhjQ=; b=c3oxMjYxdcloLJrLqo8AhKsbZotbUsi0ilBlh+oMxROOmYRajLGiTF/52UoHnssaw1 U+qYv2n+4pPvTr/ALYh1jryhlcVHu4wWPUOsrS0WLx2FNOgfnh0xw/hspsVPuoZ7nGvn aDc8RnnEtZLlm51fzX3W/64gP1NVjVZQMP4wVLN1BBiTCD1F1dPXIJ4Bqq5lgGKQmW6y 4yu4FLSNsf1E4ZPDJqJQxbokswSWAJB7mMgZp3UL5LyRT7Cx/+k944h7GO7zStneFwNz Sf2tmpEs9b84Z30zNLjKFcQ+SkTTnbKx3pLEXFNY1gjCeqzhEmnD+2XJFiNS+vtRDUyD CcRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=BilEnin2LOuzybJAWZHLtm0Bmb+vDjGH+hOxbQjVhjQ=; b=DXRu3I0VSrJqldelDapO21XsKqOsRKiRSUak5fPaUrjLaElgLCF40Kc2mKNoKC/LM6 39v/v1r1YrC0LoLUcR3Ecj+lDOTKQHAyFHFmYG0/Uk7gS2iwC9ZHV/HmfoN1DaWtJ3a9 3BYnU9YZ8GQzX+GTcB24PJ2N11X1WBIn2OkbUuudt/gnYB777qQVEmc6aE84nxEAoo+e Q94RSXRaWW4SnhtwlFy6W9Hg2tJd9vKptwxhCo/5BKbvN8h9zzw8iNGtvkNPLh2FDQVs lO4CHfER+QqxjyuZNJhwNmpE3stIGm6GiG5JvR3p8kAfe9C9IHwNj72ahjCvzOyqRxvL 8qxw== X-Gm-Message-State: AHYfb5jwoHF1q67SB1m0NZPebQoVfAbSRhWsUb+cYyrXnQzyYcz3gW2U qh4rHoL/5HyhuLmL89h56pUfFUs5eQ== X-Received: by 10.223.166.231 with SMTP id t94mr1073136wrc.52.1503758872081; Sat, 26 Aug 2017 07:47:52 -0700 (PDT) MIME-Version: 1.0 Received: by 10.223.179.216 with HTTP; Sat, 26 Aug 2017 07:47:50 -0700 (PDT) In-Reply-To: References: <30f7c246-74bb-62c1-4045-3c3b6539e732@phrozen.org> <6c0ad875-3b52-107a-df11-3af5f377e95e@phrozen.org> From: Kristian Evensen Date: Sat, 26 Aug 2017 16:47:50 +0200 Message-ID: To: Mingyu Li X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20170826_074815_180825_2C969292 X-CRM114-Status: GOOD ( 24.72 ) X-Spam-Score: -2.0 (--) X-Spam-Report: SpamAssassin version 3.4.1 on bombadil.infradead.org summary: Content analysis details: (-2.0 points) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at http://www.dnswl.org/, no trust [2a00:1450:400c:c0c:0:0:0:229 listed in] [list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider (kristian.evensen[at]gmail.com) -1.9 BAYES_00 BODY: Bayes spam probability is 0 to 1% [score: 0.0000] -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain Subject: Re: [LEDE-DEV] Transmit timeouts with mtk_eth_soc and MT7621 X-BeenThere: lede-dev@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: LEDE Development List , John Crispin Sender: "Lede-dev" Errors-To: lede-dev-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org Hello again, On Sat, Aug 26, 2017 at 12:38 PM, Kristian Evensen wrote: > Hi, > > On Sat, Aug 26, 2017 at 7:43 AM, Mingyu Li wrote: >> Hi. >> >> i check the code again. found xmit_more can cause tx timeout. you can >> reference this. >> https://www.mail-archive.com/netdev@vger.kernel.org/msg123334.html >> so the patch should be like this. edit mtk_eth_soc.c >> >> tx_num = fe_cal_txd_req(skb); >> if (unlikely(fe_empty_txd(ring) <= tx_num)) { >> + if (skb->xmit_more) >> + fe_reg_w32(ring->tx_next_idx, FE_REG_TX_CTX_IDX0); >> netif_stop_queue(dev); >> netif_err(priv, tx_queued, dev, >> "Tx Ring full when queue awake!\n"); >> >> but i am not sure. maybe the pause frame cause the problem. > > Thanks for the patch. I tested it, but I unfortunately still see the > error. I also added a print-statement inside the conditional and can > see that the condition is never hit. I also don't see the "Tx Ring > full"-message. One difference which I noticed now though, is that I > don't see the bursty bandwidth pattern I described earlier (32, 0, 32, > 0, ...). With your patch, it is always 32, 0, crash. I spent some more time looking into this today and think I might have been able to solve the issue. My test has been running for ~2 hours now without any errors (before it would best-case work for 10-15 minutes), and I do not see any performance regressions. Before going into detail, I should probably point out that I am not very familiar with driver development, so my observation changes might not be appropriate/correct :) I guess our common theory is that something is causing TX interrupts not to be enabled again. After reading up on NAPI (https://wiki.linuxfoundation.org/networking/napi), it seems that the recommended way of using NAPI on clear-on-write devices (like the RT5350) is to use NAPI for RX and do TX in the interrupt handler. In the current driver, both TX and RX is handled in the NAPI-callback fe_poll(). I have modified the driver to split RX and TX, so now fe_poll() only deals with RX and TX is dealt with in fe_handle_irq(). I have attached the (messy) patch I am currently testing. If this is an OK solution, I will clean up the patch and submit is to the list. I will leave my tests running overnight and report back if anything pops up. I guess that Johns new driver is the future for mtk_sock_eth, but I believe that fixing this issue for the current driver is worthwhile. As things are now, is it possible to DDOS RT5350-based routers running LEDE 17.01 by just sending the correct type of traffic. -Kristian From 98ce0313cd8654fe69028c19b6f08da1d0671d75 Mon Sep 17 00:00:00 2001 From: Kristian Evensen Date: Sat, 26 Aug 2017 16:05:44 +0200 Subject: [PATCH] [FIX] Move TX out of Napi This is a broken patch only meant for backup. --- .../drivers/net/ethernet/mtk/mtk_eth_soc.c | 36 +++++++++++++++++----- 1 file changed, 28 insertions(+), 8 deletions(-) diff --git a/target/linux/ramips/files-4.9/drivers/net/ethernet/mtk/mtk_eth_soc.c b/target/linux/ramips/files-4.9/drivers/net/ethernet/mtk/mtk_eth_soc.c index 5b354a6563..af865826e8 100644 --- a/target/linux/ramips/files-4.9/drivers/net/ethernet/mtk/mtk_eth_soc.c +++ b/target/linux/ramips/files-4.9/drivers/net/ethernet/mtk/mtk_eth_soc.c @@ -684,10 +684,13 @@ static int fe_tx_map_dma(struct sk_buff *skb, struct net_device *dev, */ wmb(); if (unlikely(fe_empty_txd(ring) <= ring->tx_thresh)) { + pr_info_ratelimited("netif_stop_queue %u %u\n", fe_empty_txd(ring), ring->tx_thresh); netif_stop_queue(dev); smp_mb(); - if (unlikely(fe_empty_txd(ring) > ring->tx_thresh)) + if (unlikely(fe_empty_txd(ring) > ring->tx_thresh)) { + pr_info_ratelimited("netif_wake_queue\n"); netif_wake_queue(dev); + } } if (netif_xmit_stopped(netdev_get_tx_queue(dev, 0)) || !skb->xmit_more) @@ -781,6 +784,10 @@ static int fe_start_xmit(struct sk_buff *skb, struct net_device *dev) tx_num = fe_cal_txd_req(skb); if (unlikely(fe_empty_txd(ring) <= tx_num)) { + if (skb->xmit_more) { + pr_info_ratelimited("SKB XMIT_MORE\n"); + fe_reg_w32(ring->tx_next_idx, FE_REG_TX_CTX_IDX0); + } netif_stop_queue(dev); netif_err(priv, tx_queued, dev, "Tx Ring full when queue awake!\n"); @@ -967,11 +974,12 @@ static int fe_poll(struct napi_struct *napi, int budget) status_reg = FE_REG_FE_INT_STATUS; } + /* if (status & tx_intr) { fe_reg_w32(tx_intr, FE_REG_FE_INT_STATUS); tx_done = fe_poll_tx(priv); status = fe_reg_r32(FE_REG_FE_INT_STATUS); - } + }*/ if (status & rx_intr) rx_done = fe_poll_rx(napi, budget, priv, rx_intr); @@ -1042,18 +1050,30 @@ static irqreturn_t fe_handle_irq(int irq, void *dev) { struct fe_priv *priv = netdev_priv(dev); u32 status, int_mask; + u32 tx_intr, rx_intr; + + tx_intr = priv->soc->tx_int; + rx_intr = priv->soc->rx_int; status = fe_reg_r32(FE_REG_FE_INT_STATUS); if (unlikely(!status)) return IRQ_NONE; - int_mask = (priv->soc->rx_int | priv->soc->tx_int); + int_mask = tx_intr | rx_intr; if (likely(status & int_mask)) { - if (likely(napi_schedule_prep(&priv->rx_napi))) { - fe_int_disable(int_mask); - __napi_schedule(&priv->rx_napi); - } + if (status & rx_intr) { + if (likely(napi_schedule_prep(&priv->rx_napi))) { + fe_int_disable(rx_intr); + __napi_schedule(&priv->rx_napi); + } + } + + if (status & tx_intr) { + fe_reg_w32(tx_intr, FE_REG_FE_INT_STATUS); + fe_poll_tx(priv); + fe_int_enable(tx_intr); + } } else { fe_reg_w32(status, FE_REG_FE_INT_STATUS); } @@ -1549,7 +1569,7 @@ static int fe_probe(struct platform_device *pdev) platform_set_drvdata(pdev, netdev); - netif_info(priv, probe, netdev, "mediatek frame engine at 0x%08lx, irq %d\n", + netif_info(priv, probe, netdev, "mediatek frame engine at 0x%08lx, irq %d - patched\n", netdev->base_addr, netdev->irq); return 0; -- 2.11.0