From patchwork Fri Dec 1 05:32:59 2017
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 843350
X-Patchwork-Delegate: bpf@iogearbox.net
From: Jakub Kicinski
To: netdev@vger.kernel.org
Cc: oss-drivers@netronome.com, Jiong Wang
Subject: [PATCH net-next 12/13] nfp: bpf: implement memory bulk copy for length bigger than 32-bytes
Date: Thu, 30 Nov 2017 21:32:59 -0800
Message-Id: <20171201053300.17503-13-jakub.kicinski@netronome.com>
In-Reply-To: <20171201053300.17503-1-jakub.kicinski@netronome.com>
References: <20171201053300.17503-1-jakub.kicinski@netronome.com>

From: Jiong Wang

When the gathered copy length is bigger than 32-bytes and within 128-bytes
(the maximum length a single CPP Pull/Push request can finish), the
read/write strategy is changed to:

 * Read.
   - use direct reference mode when length is within 32-bytes.
   - use indirect mode when length is bigger than 32-bytes.

 * Write.
   - length <= 8-bytes
     use write8 (direct_ref).
   - length <= 32-bytes and 4-bytes aligned
     use write32 (direct_ref).
   - length <= 32-bytes but not 4-bytes aligned
     use write8 (indirect_ref).
   - length > 32-bytes and 4-bytes aligned
     use write32 (indirect_ref).
   - length > 32-bytes and not 4-bytes aligned and <= 40-bytes
     use write32 (direct_ref) to finish the first 32-bytes.
     use write8 (direct_ref) to finish all remaining hanging part.
   - length > 32-bytes and not 4-bytes aligned
     use write32 (indirect_ref) to finish those 4-byte aligned parts.
     use write8 (direct_ref) to finish all remaining hanging part.

Signed-off-by: Jiong Wang
Reviewed-by: Jakub Kicinski
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c | 52 ++++++++++++++++++++++++----
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index 138568c0eee6..1b98ef239605 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -544,16 +544,18 @@ static int nfp_cpp_memcpy(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 	unsigned int i;
 	u8 xfer_num;
 
-	if (WARN_ON_ONCE(len > 32))
-		return -EOPNOTSUPP;
-
 	off = re_load_imm_any(nfp_prog, meta->insn.off, imm_b(nfp_prog));
 	src_base = reg_a(meta->insn.src_reg * 2);
 	xfer_num = round_up(len, 4) / 4;
 
+	/* Setup PREV_ALU fields to override memory read length. */
+	if (len > 32)
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, xfer_num - 1));
+
 	/* Memory read from source addr into transfer-in registers. */
-	emit_cmd(nfp_prog, CMD_TGT_READ32_SWAP, CMD_MODE_32b, 0, src_base, off,
-		 xfer_num - 1, true);
+	emit_cmd_any(nfp_prog, CMD_TGT_READ32_SWAP, CMD_MODE_32b, 0, src_base,
+		     off, xfer_num - 1, true, len > 32);
 
 	/* Move from transfer-in to transfer-out. */
 	for (i = 0; i < xfer_num; i++)
@@ -566,18 +568,54 @@ static int nfp_cpp_memcpy(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta)
 		emit_cmd(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b, 0,
 			 reg_a(meta->paired_st->dst_reg * 2), off, len - 1,
 			 true);
-	} else if (IS_ALIGNED(len, 4)) {
+	} else if (len <= 32 && IS_ALIGNED(len, 4)) {
 		/* Use single direct_ref write32. */
 		emit_cmd(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
 			 reg_a(meta->paired_st->dst_reg * 2), off, xfer_num - 1,
 			 true);
-	} else {
+	} else if (len <= 32) {
 		/* Use single indirect_ref write8. */
 		wrp_immed(nfp_prog, reg_none(),
 			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, len - 1));
 		emit_cmd_indir(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b, 0,
 			       reg_a(meta->paired_st->dst_reg * 2), off,
 			       len - 1, true);
+	} else if (IS_ALIGNED(len, 4)) {
+		/* Use single indirect_ref write32. */
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, xfer_num - 1));
+		emit_cmd_indir(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			       reg_a(meta->paired_st->dst_reg * 2), off,
+			       xfer_num - 1, true);
+	} else if (len <= 40) {
+		/* Use one direct_ref write32 to write the first 32-bytes, then
+		 * another direct_ref write8 to write the remaining bytes.
+		 */
+		emit_cmd(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			 reg_a(meta->paired_st->dst_reg * 2), off, 7,
+			 true);
+
+		off = re_load_imm_any(nfp_prog, meta->paired_st->off + 32,
+				      imm_b(nfp_prog));
+		emit_cmd(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b, 8,
+			 reg_a(meta->paired_st->dst_reg * 2), off, len - 33,
+			 true);
+	} else {
+		/* Use one indirect_ref write32 to write 4-bytes aligned length,
+		 * then another direct_ref write8 to write the remaining bytes.
+		 */
+		u8 new_off;
+
+		wrp_immed(nfp_prog, reg_none(),
+			  CMD_OVE_LEN | FIELD_PREP(CMD_OV_LEN, xfer_num - 2));
+		emit_cmd_indir(nfp_prog, CMD_TGT_WRITE32_SWAP, CMD_MODE_32b, 0,
+			       reg_a(meta->paired_st->dst_reg * 2), off,
+			       xfer_num - 2, true);
+		new_off = meta->paired_st->off + (xfer_num - 1) * 4;
+		off = re_load_imm_any(nfp_prog, new_off, imm_b(nfp_prog));
+		emit_cmd(nfp_prog, CMD_TGT_WRITE8_SWAP, CMD_MODE_32b,
+			 xfer_num - 1, reg_a(meta->paired_st->dst_reg * 2), off,
+			 (len & 0x3) - 1, true);
 	}
 
 	/* TODO: The following extra load is to make sure data flow be identical
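Stripped of the emit details, the write-side dispatch in the patch reduces to an ordered chain of length/alignment checks. The sketch below models just that chain; the enum labels and the `pick_write_strategy()` helper are invented for illustration and are not part of the driver, but the branch conditions are taken directly from the if/else-if chain in the diff.

```c
#include <assert.h>

/* Illustrative labels for the six write strategies in nfp_cpp_memcpy()
 * after this patch.  Names are made up for the sketch.
 */
enum wr_strategy {
	WR8_DIRECT,		/* single direct_ref write8 */
	WR32_DIRECT,		/* single direct_ref write32 */
	WR8_INDIRECT,		/* single indirect_ref write8 */
	WR32_INDIRECT,		/* single indirect_ref write32 */
	WR32_DIRECT_PLUS_WR8,	/* direct_ref write32 for first 32B, write8 tail */
	WR32_INDIRECT_PLUS_WR8,	/* indirect_ref write32 for aligned part, write8 tail */
};

/* First matching branch wins, mirroring the if/else-if chain in the diff. */
static enum wr_strategy pick_write_strategy(unsigned int len)
{
	int aligned4 = (len % 4) == 0;	/* IS_ALIGNED(len, 4) in the kernel */

	if (len <= 8)
		return WR8_DIRECT;
	if (len <= 32 && aligned4)
		return WR32_DIRECT;
	if (len <= 32)
		return WR8_INDIRECT;
	if (aligned4)
		return WR32_INDIRECT;
	if (len <= 40)
		return WR32_DIRECT_PLUS_WR8;
	return WR32_INDIRECT_PLUS_WR8;
}
```

For the two split cases, the write8 tail covers `len - 32` bytes (the `len <= 40` branch) or `len & 0x3` bytes (the general unaligned branch), which is why the diff passes `len - 33` and `(len & 0x3) - 1` as the last-index arguments to the trailing write8.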