Message ID | 20190906172941.25136-1-simon.horman@netronome.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Series | [net] nfp: flower: cmsg rtnl locks can timeout reify messages |
From: Simon Horman <simon.horman@netronome.com>
Date: Fri, 6 Sep 2019 19:29:41 +0200

> From: Fred Lotter <frederik.lotter@netronome.com>
>
> Flower control message replies are handled in different locations. The truly
> high priority replies are handled in the BH (tasklet) context, while the
> remaining replies are handled in a predefined Linux work queue. The work
> queue handler orders replies into high and low priority groups, and always
> start servicing the high priority replies within the received batch first.
>
> Reply Type: Rtnl Lock: Handler: ...
>
> A subset of control messages can block waiting for an rtnl lock (from both
> work queue priority groups). The rtnl lock is heavily contended for by
> external processes such as systemd-udevd, systemd-network and libvirtd,
> especially during netdev creation, such as when flower VFs and representors
> are instantiated.
>
> Kernel netlink instrumentation shows that external processes (such as
> systemd-udevd) often use successive rtnl_trylock() sequences, which can result
> in an rtnl_lock() blocked control message to starve for longer periods of time
> during rtnl lock contention, i.e. netdev creation.
>
> In the current design a single blocked control message will block the entire
> work queue (both priorities), and introduce a latency which is
> nondeterministic and dependent on system wide rtnl lock usage.
>
> In some extreme cases, one blocked control message at exactly the wrong time,
> just before the maximum number of VFs are instantiated, can block the work
> queue for long enough to prevent VF representor REIFY replies from getting
> handled in time for the 40ms timeout.
>
> The firmware will deliver the total maximum number of REIFY message replies in
> around 300us.
>
> Only REIFY and MTU update messages require replies within a timeout period (of
> 40ms). The MTU-only updates are already done directly in the BH (tasklet)
> handler.
>
> Move the REIFY handler down into the BH (tasklet) in order to resolve timeouts
> caused by a blocked work queue waiting on rtnl locks.
>
> Signed-off-by: Fred Lotter <frederik.lotter@netronome.com>
> Signed-off-by: Simon Horman <simon.horman@netronome.com>

Applied.
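The head-of-line blocking described in the commit message can be illustrated with a small userspace model. This is only a sketch, not driver code: `struct model_msg`, `model_late_reify_count()` and the handler durations are hypothetical, and the only figure taken from the commit message is the 40ms REIFY timeout. It assumes a strictly serial queue in which each reply starts only after every earlier one has finished.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* REIFY reply timeout from the commit message (40 ms), in microseconds. */
#define REIFY_TIMEOUT_US 40000u

/*
 * Hypothetical model of one queued reply: handler_us is the time its handler
 * takes, including any time spent blocked on the rtnl lock.
 */
struct model_msg {
	bool is_reify;
	uint32_t handler_us;
};

/*
 * Walk a serial queue in order and count the REIFY replies that finish
 * after the timeout. A reply's finish time is the sum of the handler times
 * of everything queued ahead of it plus its own.
 */
static size_t model_late_reify_count(const struct model_msg *q, size_t n)
{
	uint64_t now_us = 0;
	size_t late = 0;

	for (size_t i = 0; i < n; i++) {
		now_us += q[i].handler_us;
		if (q[i].is_reify && now_us > REIFY_TIMEOUT_US)
			late++;
	}
	return late;
}
```

With one message assumed to be blocked for 50ms at the head of the queue, every REIFY reply behind it misses the timeout in this model, even though each REIFY handler is itself short. Handling REIFY replies outside the queue removes that dependency.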
diff --git a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
index d5bbe3d6048b..05981b54eaab 100644
--- a/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
+++ b/drivers/net/ethernet/netronome/nfp/flower/cmsg.c
@@ -260,9 +260,6 @@ nfp_flower_cmsg_process_one_rx(struct nfp_app *app, struct sk_buff *skb)
 
 	type = cmsg_hdr->type;
 	switch (type) {
-	case NFP_FLOWER_CMSG_TYPE_PORT_REIFY:
-		nfp_flower_cmsg_portreify_rx(app, skb);
-		break;
 	case NFP_FLOWER_CMSG_TYPE_PORT_MOD:
 		nfp_flower_cmsg_portmod_rx(app, skb);
 		break;
@@ -328,8 +325,7 @@ nfp_flower_queue_ctl_msg(struct nfp_app *app, struct sk_buff *skb, int type)
 	struct nfp_flower_priv *priv = app->priv;
 	struct sk_buff_head *skb_head;
 
-	if (type == NFP_FLOWER_CMSG_TYPE_PORT_REIFY ||
-	    type == NFP_FLOWER_CMSG_TYPE_PORT_MOD)
+	if (type == NFP_FLOWER_CMSG_TYPE_PORT_MOD)
 		skb_head = &priv->cmsg_skbs_high;
 	else
 		skb_head = &priv->cmsg_skbs_low;
@@ -368,6 +364,10 @@ void nfp_flower_cmsg_rx(struct nfp_app *app, struct sk_buff *skb)
 	} else if (cmsg_hdr->type == NFP_FLOWER_CMSG_TYPE_TUN_NEIGH) {
 		/* Acks from the NFP that the route is added - ignore. */
 		dev_consume_skb_any(skb);
+	} else if (cmsg_hdr->type == NFP_FLOWER_CMSG_TYPE_PORT_REIFY) {
+		/* Handle REIFY acks outside wq to prevent RTNL conflict. */
+		nfp_flower_cmsg_portreify_rx(app, skb);
+		dev_consume_skb_any(skb);
 	} else {
 		nfp_flower_queue_ctl_msg(app, skb, cmsg_hdr->type);
 	}
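
The routing change made by the diff can be summarised in a standalone model. This is a sketch covering only the branches visible in the hunks above: the `model_*` names and enum values are hypothetical stand-ins rather than the driver's definitions, and branches of `nfp_flower_cmsg_rx()` not shown in the diff (including the MTU-only update path mentioned in the commit message) are left out.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the driver's message types. The numeric values
 * are illustrative and are not taken from the nfp headers.
 */
enum model_cmsg_type {
	MODEL_CMSG_PORT_MOD,
	MODEL_CMSG_PORT_REIFY,
	MODEL_CMSG_TUN_NEIGH,
	MODEL_CMSG_OTHER,
};

/* Where a reply ends up being handled. */
enum model_dest {
	MODEL_DEST_BH,		/* handled directly in the BH (tasklet) */
	MODEL_DEST_WQ_HIGH,	/* queued on the high priority list */
	MODEL_DEST_WQ_LOW,	/* queued on the low priority list */
};

/*
 * Model of the routing before (patched == false) and after
 * (patched == true) the change, for the branches shown in the diff only.
 */
static enum model_dest model_dispatch(enum model_cmsg_type type, bool patched)
{
	switch (type) {
	case MODEL_CMSG_TUN_NEIGH:
		/* The ack is consumed directly and never queued. */
		return MODEL_DEST_BH;
	case MODEL_CMSG_PORT_REIFY:
		/* The patch moves REIFY from the high queue into the BH. */
		return patched ? MODEL_DEST_BH : MODEL_DEST_WQ_HIGH;
	case MODEL_CMSG_PORT_MOD:
		return MODEL_DEST_WQ_HIGH;
	default:
		return MODEL_DEST_WQ_LOW;
	}
}
```

In this model the only type whose destination depends on `patched` is the REIFY reply, which matches the scope of the three hunks.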