Message ID | 20130924093238.GD18494@eldamar.org.uk |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
On Tue, Sep 24, 2013 at 10:32:38AM +0100, Alexander Frolkin wrote:
> Improve the SH fallback realserver selection strategy.
>
> With sh and sh-fallback, if a realserver is down, this attempts to
> distribute the traffic that would have gone to that server evenly
> among the remaining servers.
>
> Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>

Hi Alexander,

could you add some comments to the code or at least a description of the
algorithm above the function. The intent of the original code may not
have been obvious to the eye but this version certainly isn't obvious to
mine.

> --
> diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
> index 3588fae..0db7d01 100644
> --- a/net/netfilter/ipvs/ip_vs_sh.c
> +++ b/net/netfilter/ipvs/ip_vs_sh.c
> @@ -120,22 +120,33 @@ static inline struct ip_vs_dest *
>  ip_vs_sh_get_fallback(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
>  		      const union nf_inet_addr *addr, __be16 port)
>  {
> -	unsigned int offset;
> -	unsigned int hash;
> +	unsigned int offset, roffset;
> +	unsigned int hash, ihash;
>  	struct ip_vs_dest *dest;
>
> -	for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
> -		hash = ip_vs_sh_hashkey(svc->af, addr, port, offset);
> -		dest = rcu_dereference(s->buckets[hash].dest);
> -		if (!dest)
> -			break;
> -		if (is_unavailable(dest))
> -			IP_VS_DBG_BUF(6, "SH: selected unavailable server "
> -				      "%s:%d (offset %d)",
> +	ihash = ip_vs_sh_hashkey(svc->af, addr, port, 0);
> +	dest = rcu_dereference(s->buckets[ihash].dest);
> +	if (!dest)
> +		return NULL;
> +	if (is_unavailable(dest)) {
> +		IP_VS_DBG_BUF(6, "SH: selected unavailable server "
> +			      "%s:%d, reselecting",
> +			      IP_VS_DBG_ADDR(svc->af, &dest->addr),
> +			      ntohs(dest->port));
> +		for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
> +			roffset = (offset + ihash) % IP_VS_SH_TAB_SIZE;
> +			hash = ip_vs_sh_hashkey(svc->af, addr, port, roffset);
> +			dest = rcu_dereference(s->buckets[hash].dest);
> +			if (is_unavailable(dest))
> +				IP_VS_DBG_BUF(6, "SH: selected unavailable "
> +					      "server %s:%d (offset %d), reselecting",
>  				      IP_VS_DBG_ADDR(svc->af, &dest->addr),
> -				      ntohs(dest->port), offset);
> -		else
> -			return dest;
> +				      ntohs(dest->port), roffset);
> +			else
> +				return dest;
> +		}
> +	} else {
> +		return dest;
>  	}
>
>  	return NULL;
>
> --
> To unsubscribe from this list: send the line "unsubscribe lvs-devel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi,

> could you add some comments to the code or at least a description of the
> algorithm above the function. The intent of the original code may not
> have been obvious to the eye but this version certainly isn't obvious to
> mine.

Sure. I have a bad habit of assuming that if I understand something,
then others automatically do too. :-)

The original code walked the table, starting at the same place as the
code without fallback. If that lookup returned an unavailable
realserver, it offset the hash by one and repeated the lookup, then by
two, and so on, up to IP_VS_SH_TAB_SIZE-1. So the hash offset was
0, 1, ..., IP_VS_SH_TAB_SIZE-1. The result is that if a server is down,
all traffic destined for it falls back onto the next server in the list.

The new code also starts at the same place as the old code (offset 0).
If that fails, it uses the same fallback strategy as the old code,
except that the hash offset is now ihash, ihash + 1, ...,
IP_VS_SH_TAB_SIZE-1, 0, 1, ..., ihash - 1, i.e., it starts at ihash
instead of 0 and wraps around the table.

ihash could have been a random number, but deriving it from the source
IP and port (in which case it may as well be the same hash, at offset 0)
means that the behaviour will be the same on different directors.

This spreads the load of an unavailable server across the remaining
servers instead of just moving it to the next one in the list.

Hope that makes sense... I'll submit a patch with a comment shortly.

Alex
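To make the two probing orders concrete, here is a minimal userspace sketch of the description above. It is not the kernel code. The 16-entry table, the three servers, the rule that bucket i belongs to server i % 3, and toy_hashkey() are all assumptions made for illustration; the real ip_vs_sh_hashkey() and bucket assignment are not shown in this thread and may behave differently.

```c
#include <stdbool.h>

#define TAB_SIZE 16u	/* toy stand-in for IP_VS_SH_TAB_SIZE (assumption) */
#define NSERVERS 3u	/* hypothetical number of realservers */

/* Toy table layout (assumption): bucket i belongs to server i % NSERVERS. */
static unsigned int bucket_server(unsigned int bucket)
{
	return bucket % NSERVERS;
}

/* Toy stand-in for ip_vs_sh_hashkey(); "key" replaces the address and port. */
static unsigned int toy_hashkey(unsigned int key, unsigned int offset)
{
	return (key + offset) % TAB_SIZE;
}

/* Old order: hash offsets 0, 1, ..., TAB_SIZE-1. Returns a server id or -1. */
static int pick_old(unsigned int key, const bool *down)
{
	unsigned int offset;

	for (offset = 0; offset < TAB_SIZE; offset++) {
		unsigned int srv = bucket_server(toy_hashkey(key, offset));

		if (!down[srv])
			return (int)srv;
	}
	return -1;
}

/* New order: offset 0 first, then ihash, ihash+1, ..., wrapping around. */
static int pick_new(unsigned int key, const bool *down)
{
	unsigned int ihash = toy_hashkey(key, 0);
	unsigned int srv = bucket_server(ihash);
	unsigned int offset;

	if (!down[srv])
		return (int)srv;
	for (offset = 0; offset < TAB_SIZE; offset++) {
		unsigned int roffset = (offset + ihash) % TAB_SIZE;

		srv = bucket_server(toy_hashkey(key, roffset));
		if (!down[srv])
			return (int)srv;
	}
	return -1;
}

/* Count how many of the keys 0..TAB_SIZE-1 a strategy sends to "server". */
static unsigned int count_for(int (*pick)(unsigned int, const bool *),
			      const bool *down, int server)
{
	unsigned int key, n = 0;

	for (key = 0; key < TAB_SIZE; key++)
		if (pick(key, down) == server)
			n++;
	return n;
}
```

With server 0 marked down in this toy setup, count_for(pick_old, ...) puts all of server 0's keys onto server 1, while count_for(pick_new, ...) splits them between servers 1 and 2. How even the split is with the real hash and a real table is not something this sketch can show.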
diff --git a/net/netfilter/ipvs/ip_vs_sh.c b/net/netfilter/ipvs/ip_vs_sh.c
index 3588fae..0db7d01 100644
--- a/net/netfilter/ipvs/ip_vs_sh.c
+++ b/net/netfilter/ipvs/ip_vs_sh.c
@@ -120,22 +120,33 @@ static inline struct ip_vs_dest *
 ip_vs_sh_get_fallback(struct ip_vs_service *svc, struct ip_vs_sh_state *s,
 		      const union nf_inet_addr *addr, __be16 port)
 {
-	unsigned int offset;
-	unsigned int hash;
+	unsigned int offset, roffset;
+	unsigned int hash, ihash;
 	struct ip_vs_dest *dest;
 
-	for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
-		hash = ip_vs_sh_hashkey(svc->af, addr, port, offset);
-		dest = rcu_dereference(s->buckets[hash].dest);
-		if (!dest)
-			break;
-		if (is_unavailable(dest))
-			IP_VS_DBG_BUF(6, "SH: selected unavailable server "
-				      "%s:%d (offset %d)",
+	ihash = ip_vs_sh_hashkey(svc->af, addr, port, 0);
+	dest = rcu_dereference(s->buckets[ihash].dest);
+	if (!dest)
+		return NULL;
+	if (is_unavailable(dest)) {
+		IP_VS_DBG_BUF(6, "SH: selected unavailable server "
+			      "%s:%d, reselecting",
+			      IP_VS_DBG_ADDR(svc->af, &dest->addr),
+			      ntohs(dest->port));
+		for (offset = 0; offset < IP_VS_SH_TAB_SIZE; offset++) {
+			roffset = (offset + ihash) % IP_VS_SH_TAB_SIZE;
+			hash = ip_vs_sh_hashkey(svc->af, addr, port, roffset);
+			dest = rcu_dereference(s->buckets[hash].dest);
+			if (is_unavailable(dest))
+				IP_VS_DBG_BUF(6, "SH: selected unavailable "
+					      "server %s:%d (offset %d), reselecting",
 				      IP_VS_DBG_ADDR(svc->af, &dest->addr),
-				      ntohs(dest->port), offset);
-		else
-			return dest;
+				      ntohs(dest->port), roffset);
+			else
+				return dest;
+		}
+	} else {
+		return dest;
 	}
 
 	return NULL;
Improve the SH fallback realserver selection strategy.

With sh and sh-fallback, if a realserver is down, this attempts to
distribute the traffic that would have gone to that server evenly
among the remaining servers.

Signed-off-by: Alexander Frolkin <avf@eldamar.org.uk>