From patchwork Mon Jun 25 03:54:21 2018
X-Patchwork-Submitter: Jakub Kicinski
X-Patchwork-Id: 934050
X-Patchwork-Delegate: bpf@iogearbox.net
From: Jakub Kicinski
To: alexei.starovoitov@gmail.com, daniel@iogearbox.net
Cc: oss-drivers@netronome.com, netdev@vger.kernel.org, Jiong Wang
Subject: [PATCH bpf-next 7/7] nfp: bpf: migrate to advanced reciprocal divide in reciprocal_div.h
Date: Sun, 24 Jun 2018 20:54:21 -0700
Message-Id: <20180625035421.2991-8-jakub.kicinski@netronome.com>
In-Reply-To: <20180625035421.2991-1-jakub.kicinski@netronome.com>
References: <20180625035421.2991-1-jakub.kicinski@netronome.com>

From: Jiong Wang

As we are doing JIT, we want to use the advanced version of the
reciprocal divide (reciprocal_value_adv), trading extra computation on
the host for better performance of the JITed code.
We could reduce the required ALU instructions from 4 to 2 or 1.

Signed-off-by: Jiong Wang
Reviewed-by: Jakub Kicinski
---
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  | 38 ++++++++++++++-----
 .../net/ethernet/netronome/nfp/bpf/verifier.c | 16 ++++++--
 2 files changed, 42 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/jit.c b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
index d732b6cfc356..f99ac00bd649 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/jit.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/jit.c
@@ -1498,8 +1498,9 @@ static int wrp_div_imm(struct nfp_prog *nfp_prog, u8 dst, u64 imm)
 {
 	swreg tmp_both = imm_both(nfp_prog), dst_both = reg_both(dst);
 	swreg dst_a = reg_a(dst), dst_b = reg_a(dst);
-	struct reciprocal_value rvalue;
+	struct reciprocal_value_adv rvalue;
 	swreg tmp_b = imm_b(nfp_prog);
+	u8 pre_shift, exp;
 	swreg magic;
 
 	if (imm > U32_MAX) {
@@ -1507,15 +1508,34 @@ static int wrp_div_imm(struct nfp_prog *nfp_prog, u8 dst, u64 imm)
 		return 0;
 	}
 
-	rvalue = reciprocal_value(imm);
+	rvalue = reciprocal_value_adv(imm, 32);
+	exp = rvalue.exp;
+	if (rvalue.is_wide_m && !(imm & 1)) {
+		pre_shift = fls(imm & -imm) - 1;
+		rvalue = reciprocal_value_adv(imm >> pre_shift, 32 - pre_shift);
+	} else {
+		pre_shift = 0;
+	}
 	magic = re_load_imm_any(nfp_prog, rvalue.m, imm_b(nfp_prog));
-	wrp_mul_u32(nfp_prog, tmp_both, tmp_both, dst_a, magic, true);
-	emit_alu(nfp_prog, dst_both, dst_a, ALU_OP_SUB, tmp_b);
-	emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE, dst_b,
-		 SHF_SC_R_SHF, rvalue.sh1);
-	emit_alu(nfp_prog, dst_both, dst_a, ALU_OP_ADD, tmp_b);
-	emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE, dst_b,
-		 SHF_SC_R_SHF, rvalue.sh2);
+	if (imm == 1 << exp) {
+		emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE, dst_b,
+			 SHF_SC_R_SHF, exp);
+	} else if (rvalue.is_wide_m) {
+		wrp_mul_u32(nfp_prog, tmp_both, tmp_both, dst_a, magic, true);
+		emit_alu(nfp_prog, dst_both, dst_a, ALU_OP_SUB, tmp_b);
+		emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE, dst_b,
+			 SHF_SC_R_SHF, 1);
+		emit_alu(nfp_prog, dst_both, dst_a, ALU_OP_ADD, tmp_b);
+		emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE, dst_b,
+			 SHF_SC_R_SHF, rvalue.sh - 1);
+	} else {
+		if (pre_shift)
+			emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE,
+				 dst_b, SHF_SC_R_SHF, pre_shift);
+		wrp_mul_u32(nfp_prog, dst_both, dst_both, dst_a, magic, true);
+		emit_shf(nfp_prog, dst_both, reg_none(), SHF_OP_NONE,
+			 dst_b, SHF_SC_R_SHF, rvalue.sh);
+	}
 
 	return 0;
 }
diff --git a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
index f0f07e988c46..39c2c24fea11 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/verifier.c
@@ -561,12 +561,22 @@ nfp_bpf_check_alu(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 	/* NFP doesn't have divide instructions, we support divide by constant
 	 * through reciprocal multiplication. Given NFP support multiplication
 	 * no bigger than u32, we'd require divisor and dividend no bigger than
-	 * that as well.
+	 * that as well. There is a further range requirement on dividend,
+	 * please see the NOTE below.
 	 *
 	 * Also eBPF doesn't support signed divide and has enforced this on C
 	 * language level by failing compilation. However LLVM assembler hasn't
 	 * enforced this, so it is possible for negative constant to leak in as
 	 * a BPF_K operand through assembly code, we reject such cases as well.
+	 *
+	 * NOTE: because we are using "reciprocal_value_adv" which doesn't
+	 * support dividend with MSB set, so we need to JIT separate NFP
+	 * sequence to handle such case. It could be a simple sequence if there
+	 * is conditional move, however there isn't for NFP. So, we don't bother
+	 * generating compare-if-set-branch sequence by rejecting the program
+	 * straight away when the u32 dividend has MSB set. Divide by such a
+	 * large constant would be rare in practice. Also, the programmer could
+	 * simply rewrite it as "result = divisor >= the_const".
 	 */
 	if (is_mbpf_div(meta)) {
 		if (meta->umax_dst > U32_MAX) {
@@ -578,8 +588,8 @@ nfp_bpf_check_alu(struct nfp_prog *nfp_prog, struct nfp_insn_meta *meta,
 				pr_vlog(env, "dividend is not constant\n");
 				return -EINVAL;
 			}
-			if (meta->umax_src > U32_MAX) {
-				pr_vlog(env, "dividend is not within u32 value range\n");
+			if (meta->umax_src > U32_MAX / 2) {
+				pr_vlog(env, "dividend is bigger than U32_MAX/2\n");
 				return -EINVAL;
 			}
 		}
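
For readers who want to check the three instruction sequences emitted by
wrp_div_imm() outside the kernel, the following is a minimal userspace
sketch. It is not the kernel API: struct recip_adv, recip_adv(), fls32()
and div_by_const() are hypothetical names, and the multiplier selection is
written from the Granlund-Montgomery scheme that reciprocal_value_adv is
assumed to follow. The constant operand is treated as the divisor here, and
constants above UINT32_MAX / 2 are answered with a plain comparison, whereas
the patch rejects them in the verifier.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct reciprocal_value_adv. */
struct recip_adv {
	uint32_t m;	/* low 32 bits of the multiplier */
	uint8_t sh;	/* post-shift */
	uint8_t exp;	/* ceil(log2(d)) */
	int is_wide_m;	/* multiplier needs 33 bits */
};

/* 1-based index of the highest set bit, 0 for x == 0 (same contract as fls()). */
static unsigned int fls32(uint32_t x)
{
	unsigned int r = 0;

	while (x) {
		r++;
		x >>= 1;
	}
	return r;
}

/* Assumed multiplier selection; requires 0 < d <= UINT32_MAX / 2 so that
 * the shift amount 32 + l stays below 64.
 */
static struct recip_adv recip_adv(uint32_t d, uint8_t prec)
{
	struct recip_adv r;
	unsigned int l = fls32(d - 1), post_shift = l;
	uint64_t mlow = (1ULL << (32 + l)) / d;
	uint64_t mhigh = ((1ULL << (32 + l)) + (1ULL << (32 + l - prec))) / d;

	/* Halve both bounds while they remain distinct. */
	for (; post_shift > 0; post_shift--) {
		uint64_t lo = mlow >> 1, hi = mhigh >> 1;

		if (lo >= hi)
			break;
		mlow = lo;
		mhigh = hi;
	}

	r.m = (uint32_t)mhigh;
	r.sh = (uint8_t)post_shift;
	r.exp = (uint8_t)l;
	r.is_wide_m = mhigh > UINT32_MAX;
	return r;
}

/* Mirrors the branch structure of wrp_div_imm() in plain C. */
uint32_t div_by_const(uint32_t n, uint32_t d)
{
	struct recip_adv rv;
	unsigned int pre_shift = 0;
	uint32_t t;
	uint8_t exp;

	assert(d != 0);
	/* Quotient can only be 0 or 1 here; the patch rejects this range. */
	if (d > UINT32_MAX / 2)
		return n >= d;

	rv = recip_adv(d, 32);
	exp = rv.exp;
	if (rv.is_wide_m && !(d & 1)) {
		/* Strip trailing zero bits and retry with less precision. */
		pre_shift = fls32(d & -d) - 1;
		rv = recip_adv(d >> pre_shift, (uint8_t)(32 - pre_shift));
	}

	if (d == 1u << exp)		/* power of two: one shift */
		return n >> exp;
	if (rv.is_wide_m) {		/* mul, sub, shift, add, shift */
		t = (uint32_t)(((uint64_t)n * rv.m) >> 32);
		return (((n - t) >> 1) + t) >> (rv.sh - 1);
	}
	/* optional pre-shift, mul, shift */
	t = (uint32_t)(((uint64_t)(n >> pre_shift) * rv.m) >> 32);
	return t >> rv.sh;
}
```

With these helpers, a divisor of 8 takes the single-shift path, 7 takes the
wide-multiplier path, and 14 takes the pre-shift path, which corresponds to
the "4 to 2 or 1" ALU instruction counts in the commit message.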