From patchwork Thu May 14 14:46:17 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jon Maloy
X-Patchwork-Id: 472369
X-Patchwork-Delegate: davem@davemloft.net
From: Jon Maloy
To: davem@davemloft.net
Cc: netdev@vger.kernel.org, Paul Gortmaker, erik.hugne@ericsson.com,
 ying.xue@windriver.com, maloy@donjonn.com,
 tipc-discussion@lists.sourceforge.net, Jon Maloy
Subject: [PATCH net-next 7/8] tipc: improve link congestion algorithm
Date: Thu, 14 May 2015 10:46:17 -0400
Message-Id: <1431614778-17582-8-git-send-email-jon.maloy@ericsson.com>
X-Mailer: git-send-email 1.9.1
In-Reply-To: <1431614778-17582-1-git-send-email-jon.maloy@ericsson.com>
References: <1431614778-17582-1-git-send-email-jon.maloy@ericsson.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

The link congestion algorithm used until now has two problems.

- It is too generous towards lower-level messages in situations of high
  load by giving "absolute" bandwidth guarantees to the different
  priority levels. LOW traffic is guaranteed 10%, MEDIUM is guaranteed
  20%, HIGH is guaranteed 30%, and CRITICAL is guaranteed 40% of the
  available bandwidth. But, in the absence of higher level traffic, the
  ratio between two distinct levels becomes unreasonable. E.g. if there
  is only LOW and MEDIUM traffic on a system, the former is guaranteed
  1/3 of the bandwidth, and the latter 2/3. This in turn means that if
  there is e.g. one LOW user and 10 MEDIUM users, the former will have
  33.3% of the bandwidth, and the others will have to compete for the
  remainder, i.e. each will end up with 6.7% of the capacity.

- Packets of type MSG_BUNDLER are created at SYSTEM importance level,
  but only after the packets bundled into it have passed the congestion
  test for their own respective levels. Since bundled packets don't
  result in incrementing the level counter for their own importance,
  only occasionally for the SYSTEM level counter, they do in practice
  obtain SYSTEM level importance. Hence, the current implementation
  leaves a gap in the congestion algorithm that in the worst case may
  lead to a link reset.

We now refine the congestion algorithm as follows:

- A message is accepted into the link backlog only if its own level
  counter, and all superior level counters, permit it.

- The importance of a created bundle packet is set according to its
  contents. A bundle packet created from messages at levels LOW to
  CRITICAL is given importance level CRITICAL, while a bundle created
  from a SYSTEM level message is given importance SYSTEM. In the latter
  case only subsequent SYSTEM level messages are allowed to be bundled
  into it.

This solves the first problem described above, by making the bandwidth
guarantee relative to the total number of users at all levels; only the
upper limit for each level remains absolute. In the example described
above, the single LOW user would use 1/11th of the bandwidth, the same
as each of the ten MEDIUM users, but it still has the same guarantee
against starvation as the latter ones.

The fix also solves the second problem. If the CRITICAL level is filled
up by bundle packets of that level, no lower level packets will be
accepted any more.

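The difference between the old and the refined admission rule can be
illustrated with a small stand-alone model. This is only a sketch, not
the kernel code: the enum, struct backlog_level, admit_old() and
admit_new() are hypothetical names, and any limits fed into it are
made-up numbers, not the real per-level backlog limits.

```c
#include <stdbool.h>

/* Importance levels in ascending order (names follow the text above). */
enum importance { LOW, MEDIUM, HIGH, CRITICAL, SYSTEM, NUM_LEVELS };

/* Hypothetical per-level backlog counter: current length and upper limit. */
struct backlog_level {
	unsigned int len;
	unsigned int limit;
};

/* Old rule: only the counter of the message's own level is consulted. */
bool admit_old(const struct backlog_level *bl, enum importance imp)
{
	return bl[imp].len < bl[imp].limit;
}

/* Refined rule: the own level and every higher level must have room. */
bool admit_new(const struct backlog_level *bl, enum importance imp)
{
	for (int i = imp; i < NUM_LEVELS; i++) {
		if (bl[i].len >= bl[i].limit)
			return false;
	}
	return true;
}
```

With the CRITICAL counter at its limit and all others empty, admit_old()
still accepts a LOW message, while admit_new() rejects everything up to
and including CRITICAL and accepts only SYSTEM.
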
Suggested-by: Gergely Kiss
Reviewed-by: Ying Xue
Signed-off-by: Jon Maloy
---
 net/tipc/link.c | 11 ++++++-----
 net/tipc/msg.c  |  7 +++++++
 net/tipc/msg.h  | 14 +++++++++-----
 3 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/net/tipc/link.c b/net/tipc/link.c
index a5ea19e..c1aba69 100644
--- a/net/tipc/link.c
+++ b/net/tipc/link.c
@@ -645,7 +645,7 @@ int __tipc_link_xmit(struct net *net, struct tipc_link *link,
 {
 	struct tipc_msg *msg = buf_msg(skb_peek(list));
 	unsigned int maxwin = link->window;
-	unsigned int imp = msg_importance(msg);
+	unsigned int i, imp = msg_importance(msg);
 	uint mtu = link->mtu;
 	u16 ack = mod(link->rcv_nxt - 1);
 	u16 seqno = link->snd_nxt;
@@ -655,10 +655,11 @@ int __tipc_link_xmit(struct net *net, struct tipc_link *link,
 	struct sk_buff_head *backlogq = &link->backlogq;
 	struct sk_buff *skb, *tmp;
 
-	/* Match backlog limit against msg importance: */
-	if (unlikely(link->backlog[imp].len >= link->backlog[imp].limit))
-		return link_schedule_user(link, list);
-
+	/* Match msg importance against this and all higher backlog limits: */
+	for (i = imp; i <= TIPC_SYSTEM_IMPORTANCE; i++) {
+		if (unlikely(link->backlog[i].len >= link->backlog[i].limit))
+			return link_schedule_user(link, list);
+	}
 	if (unlikely(msg_size(msg) > mtu)) {
 		__skb_queue_purge(list);
 		return -EMSGSIZE;
diff --git a/net/tipc/msg.c b/net/tipc/msg.c
index c3e96e8..ff7362d 100644
--- a/net/tipc/msg.c
+++ b/net/tipc/msg.c
@@ -365,6 +365,9 @@ bool tipc_msg_bundle(struct sk_buff *bskb, struct sk_buff *skb, u32 mtu)
 		return false;
 	if (unlikely(max < (start + msz)))
 		return false;
+	if ((msg_importance(msg) < TIPC_SYSTEM_IMPORTANCE) &&
+	    (msg_importance(bmsg) == TIPC_SYSTEM_IMPORTANCE))
+		return false;
 
 	skb_put(bskb, pad + msz);
 	skb_copy_to_linear_data_offset(bskb, start, skb->data, msz);
@@ -448,6 +451,10 @@ bool tipc_msg_make_bundle(struct sk_buff **skb, u32 mtu, u32 dnode)
 	bmsg = buf_msg(bskb);
 	tipc_msg_init(msg_prevnode(msg), bmsg, MSG_BUNDLER, 0,
 		      INT_H_SIZE, dnode);
+	if (msg_isdata(msg))
+		msg_set_importance(bmsg, TIPC_CRITICAL_IMPORTANCE);
+	else
+		msg_set_importance(bmsg, TIPC_SYSTEM_IMPORTANCE);
 	msg_set_seqno(bmsg, msg_seqno(msg));
 	msg_set_ack(bmsg, msg_ack(msg));
 	msg_set_bcast_ack(bmsg, msg_bcast_ack(msg));
diff --git a/net/tipc/msg.h b/net/tipc/msg.h
index 6ca2366..6caf16c 100644
--- a/net/tipc/msg.h
+++ b/net/tipc/msg.h
@@ -352,18 +352,22 @@ static inline void msg_set_seqno(struct tipc_msg *m, u16 n)
  */
 static inline u32 msg_importance(struct tipc_msg *m)
 {
-	if (unlikely(msg_user(m) == MSG_FRAGMENTER))
+	int usr = msg_user(m);
+
+	if (likely((usr <= TIPC_CRITICAL_IMPORTANCE) && !msg_errcode(m)))
+		return usr;
+	if ((usr == MSG_FRAGMENTER) || (usr == MSG_BUNDLER))
 		return msg_bits(m, 5, 13, 0x7);
-	if (likely(msg_isdata(m) && !msg_errcode(m)))
-		return msg_user(m);
 	return TIPC_SYSTEM_IMPORTANCE;
 }
 
 static inline void msg_set_importance(struct tipc_msg *m, u32 i)
 {
-	if (unlikely(msg_user(m) == MSG_FRAGMENTER))
+	int usr = msg_user(m);
+
+	if (likely((usr == MSG_FRAGMENTER) || (usr == MSG_BUNDLER)))
 		msg_set_bits(m, 5, 13, 0x7, i);
-	else if (likely(i < TIPC_SYSTEM_IMPORTANCE))
+	else if (i < TIPC_SYSTEM_IMPORTANCE)
 		msg_set_user(m, i);
 	else
 		pr_warn("Trying to set illegal importance in message\n");
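
The bundle tagging done in the msg.c hunks above can be modelled in
isolation as follows. This is a sketch under assumptions, not the kernel
implementation: enum imp_level, bundle_importance() and
may_join_bundle() are hypothetical names, and only the importance check
added by this patch is modelled, not the other conditions tested in
tipc_msg_bundle().

```c
#include <stdbool.h>

/* Importance levels in ascending order (hypothetical stand-alone enum). */
enum imp_level { IMP_LOW, IMP_MEDIUM, IMP_HIGH, IMP_CRITICAL, IMP_SYSTEM };

/*
 * Importance given to a newly created bundle, derived from the first
 * message placed in it: data levels (LOW..CRITICAL) yield CRITICAL,
 * anything else yields SYSTEM.
 */
enum imp_level bundle_importance(enum imp_level first_msg)
{
	return first_msg < IMP_SYSTEM ? IMP_CRITICAL : IMP_SYSTEM;
}

/*
 * A message below SYSTEM level must not be added to a SYSTEM bundle.
 * This check alone places no restriction on a CRITICAL bundle.
 */
bool may_join_bundle(enum imp_level msg, enum imp_level bundle)
{
	return !(msg < IMP_SYSTEM && bundle == IMP_SYSTEM);
}
```

In this model a bundle started by a LOW message is counted at CRITICAL
level, so it can no longer bypass the congestion test by being accounted
as SYSTEM traffic.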