Message ID | 20091229013401.GE10172@verge.net.au |
---|---|
State | Not Applicable, archived |
Delegated to: | David Miller |
Headers | show |
Simon Horman wrote: > On Mon, Dec 28, 2009 at 07:49:38PM +0100, Mark Bergsma wrote: >> On 03-12-08 01:37, David Miller wrote: >>> From: "Catalin(ux) M. BOIE" <catab@embedromix.ro> >>> Date: Tue, 2 Dec 2008 16:16:04 -0700 (MST) >>>> I was looking for anything that could get me past of 88.000 request per >>>> seconds. >>>> The help text told me to raise that value if I have big number of >>>> connections. I just needed an easy way to test. >>> You're just repeating what I said, you "think" it should be >>> changed and as a result you are wasting everyones time. >>> >>> You don't actually "know", you're just guessing using random >>> snippets from documentation rather than good hard evidence of >>> a need. >> Hello, >> >> I just found this year-old thread about a patch allowing the IPVS >> connection hash table size to be set at load time by a module parameter. >> Apparently the conclusion reached was that allowing this configuration >> setting to be changed would be useless, and that the poster's >> performance problems would likely lie elsewhere, since he had no >> evidence it was caused by the hash table size. > > Hi Mark, > > thanks for your test results. I have added them to the patch. > Feel free to edit the text. Just wondering because of this comment - do you want me to apply this patch? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Jan 04, 2010 at 02:57:22PM +0100, Patrick McHardy wrote: > Simon Horman wrote: > > On Mon, Dec 28, 2009 at 07:49:38PM +0100, Mark Bergsma wrote: > >> On 03-12-08 01:37, David Miller wrote: > >>> From: "Catalin(ux) M. BOIE" <catab@embedromix.ro> > >>> Date: Tue, 2 Dec 2008 16:16:04 -0700 (MST) > >>>> I was looking for anything that could get me past of 88.000 request per > >>>> seconds. > >>>> The help text told me to raise that value if I have big number of > >>>> connections. I just needed an easy way to test. > >>> You're just repeating what I said, you "think" it should be > >>> changed and as a result you are wasting everyones time. > >>> > >>> You don't actually "know", you're just guessing using random > >>> snippets from documentation rather than good hard evidence of > >>> a need. > >> Hello, > >> > >> I just found this year-old thread about a patch allowing the IPVS > >> connection hash table size to be set at load time by a module parameter. > >> Apparently the conclusion reached was that allowing this configuration > >> setting to be changed would be useless, and that the poster's > >> performance problems would likely lie elsewhere, since he had no > >> evidence it was caused by the hash table size. > > > > Hi Mark, > > > > thanks for your test results. I have added them to the patch. > > Feel free to edit the text. > > Just wondering because of this comment - do you want me to apply this > patch? I was fishing for an response from Mark. I'll resubmit it properly as that doesn't seem to be forthcoming. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 05-01-10 00:24, Simon Horman wrote: > On Mon, Jan 04, 2010 at 02:57:22PM +0100, Patrick McHardy wrote: >> Simon Horman wrote: >>> On Mon, Dec 28, 2009 at 07:49:38PM +0100, Mark Bergsma wrote: >>>> On 03-12-08 01:37, David Miller wrote: >>>>> From: "Catalin(ux) M. BOIE" <catab@embedromix.ro> >>>>> Date: Tue, 2 Dec 2008 16:16:04 -0700 (MST) >>>>>> I was looking for anything that could get me past of 88.000 request per >>>>>> seconds. >>>>>> The help text told me to raise that value if I have big number of >>>>>> connections. I just needed an easy way to test. >>>>> You're just repeating what I said, you "think" it should be >>>>> changed and as a result you are wasting everyones time. >>>>> >>>>> You don't actually "know", you're just guessing using random >>>>> snippets from documentation rather than good hard evidence of >>>>> a need. >>>> Hello, >>>> >>>> I just found this year-old thread about a patch allowing the IPVS >>>> connection hash table size to be set at load time by a module parameter. >>>> Apparently the conclusion reached was that allowing this configuration >>>> setting to be changed would be useless, and that the poster's >>>> performance problems would likely lie elsewhere, since he had no >>>> evidence it was caused by the hash table size. >>> >>> Hi Mark, >>> >>> thanks for your test results. I have added them to the patch. >>> Feel free to edit the text. >> >> Just wondering because of this comment - do you want me to apply this >> patch? > > I was fishing for an response from Mark. > I'll resubmit it properly as that doesn't seem to be forthcoming. Hi Simon/Patrick, Sorry for the late reply, and yes, this is fine. Thanks for applying the patch!
> Sorry for the late reply, and yes, this is fine. Thanks for applying the > patch! > > -- > Mark Bergsma <mark@wikimedia.org> > Operations Engineer, Wikimedia Foundation I also want to thank Mark for providing the data, allowing this patch to be accepted. Thanks, Mark! -- Catalin(ux) M. BOIE http://kernel.embedromix.ro/ -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: net-next-2.6/include/net/ip_vs.h =================================================================== --- net-next-2.6.orig/include/net/ip_vs.h 2009-12-29 10:18:56.000000000 +0900 +++ net-next-2.6/include/net/ip_vs.h 2009-12-29 10:19:13.000000000 +0900 @@ -26,6 +26,11 @@ #include <linux/ipv6.h> /* for struct ipv6hdr */ #include <net/ipv6.h> /* for ipv6_addr_copy */ + +/* Connections' size value needed by ip_vs_ctl.c */ +extern int ip_vs_conn_tab_size; + + struct ip_vs_iphdr { int len; __u8 protocol; @@ -592,17 +597,6 @@ extern void ip_vs_init_hash_table(struct * (from ip_vs_conn.c) */ -/* - * IPVS connection entry hash table - */ -#ifndef CONFIG_IP_VS_TAB_BITS -#define CONFIG_IP_VS_TAB_BITS 12 -#endif - -#define IP_VS_CONN_TAB_BITS CONFIG_IP_VS_TAB_BITS -#define IP_VS_CONN_TAB_SIZE (1 << IP_VS_CONN_TAB_BITS) -#define IP_VS_CONN_TAB_MASK (IP_VS_CONN_TAB_SIZE - 1) - enum { IP_VS_DIR_INPUT = 0, IP_VS_DIR_OUTPUT, Index: net-next-2.6/net/netfilter/ipvs/Kconfig =================================================================== --- net-next-2.6.orig/net/netfilter/ipvs/Kconfig 2009-12-29 10:18:56.000000000 +0900 +++ net-next-2.6/net/netfilter/ipvs/Kconfig 2009-12-29 10:19:13.000000000 +0900 @@ -68,6 +68,10 @@ config IP_VS_TAB_BITS each hash entry uses 8 bytes, so you can estimate how much memory is needed for your box. + You can overwrite this number setting conn_tab_bits module parameter + or by appending ip_vs.conn_tab_bits=? to the kernel command line + if IP VS was compiled built-in. + comment "IPVS transport protocol load balancing support" config IP_VS_PROTO_TCP Index: net-next-2.6/net/netfilter/ipvs/ip_vs_conn.c =================================================================== --- net-next-2.6.orig/net/netfilter/ipvs/ip_vs_conn.c 2009-12-29 10:18:56.000000000 +0900 +++ net-next-2.6/net/netfilter/ipvs/ip_vs_conn.c 2009-12-29 10:19:13.000000000 +0900 @@ -40,6 +40,21 @@ #include <net/ip_vs.h> +#ifndef CONFIG_IP_VS_TAB_BITS +#define CONFIG_IP_VS_TAB_BITS 12 +#endif + +/* + * Connection hash size. Default is what was selected at compile time. +*/ +int ip_vs_conn_tab_bits = CONFIG_IP_VS_TAB_BITS; +module_param_named(conn_tab_bits, ip_vs_conn_tab_bits, int, 0444); +MODULE_PARM_DESC(conn_tab_bits, "Set connections' hash size"); + +/* size and mask values */ +int ip_vs_conn_tab_size; +int ip_vs_conn_tab_mask; + /* * Connection hash table: for input and output packets lookups of IPVS */ @@ -125,11 +140,11 @@ static unsigned int ip_vs_conn_hashkey(i if (af == AF_INET6) return jhash_3words(jhash(addr, 16, ip_vs_conn_rnd), (__force u32)port, proto, ip_vs_conn_rnd) - & IP_VS_CONN_TAB_MASK; + & ip_vs_conn_tab_mask; #endif return jhash_3words((__force u32)addr->ip, (__force u32)port, proto, ip_vs_conn_rnd) - & IP_VS_CONN_TAB_MASK; + & ip_vs_conn_tab_mask; } @@ -760,7 +775,7 @@ static void *ip_vs_conn_array(struct seq int idx; struct ip_vs_conn *cp; - for(idx = 0; idx < IP_VS_CONN_TAB_SIZE; idx++) { + for (idx = 0; idx < ip_vs_conn_tab_size; idx++) { ct_read_lock_bh(idx); list_for_each_entry(cp, &ip_vs_conn_tab[idx], c_list) { if (pos-- == 0) { @@ -797,7 +812,7 @@ static void *ip_vs_conn_seq_next(struct idx = l - ip_vs_conn_tab; ct_read_unlock_bh(idx); - while (++idx < IP_VS_CONN_TAB_SIZE) { + while (++idx < ip_vs_conn_tab_size) { ct_read_lock_bh(idx); list_for_each_entry(cp, &ip_vs_conn_tab[idx], c_list) { seq->private = &ip_vs_conn_tab[idx]; @@ -976,8 +991,8 @@ void ip_vs_random_dropentry(void) /* * Randomly scan 1/32 of the whole table every second */ - for (idx = 0; idx < (IP_VS_CONN_TAB_SIZE>>5); idx++) { - unsigned hash = net_random() & IP_VS_CONN_TAB_MASK; + for (idx = 0; idx < (ip_vs_conn_tab_size>>5); idx++) { + unsigned hash = net_random() & ip_vs_conn_tab_mask; /* * Lock is actually needed in this loop. @@ -1029,7 +1044,7 @@ static void ip_vs_conn_flush(void) struct ip_vs_conn *cp; flush_again: - for (idx=0; idx<IP_VS_CONN_TAB_SIZE; idx++) { + for (idx = 0; idx < ip_vs_conn_tab_size; idx++) { /* * Lock is actually needed in this loop. */ @@ -1060,10 +1075,15 @@ int __init ip_vs_conn_init(void) { int idx; + /* Compute size and mask */ + ip_vs_conn_tab_size = 1 << ip_vs_conn_tab_bits; + ip_vs_conn_tab_mask = ip_vs_conn_tab_size - 1; + /* * Allocate the connection hash table and initialize its list heads */ - ip_vs_conn_tab = vmalloc(IP_VS_CONN_TAB_SIZE*sizeof(struct list_head)); + ip_vs_conn_tab = vmalloc(ip_vs_conn_tab_size * + sizeof(struct list_head)); if (!ip_vs_conn_tab) return -ENOMEM; @@ -1078,12 +1098,12 @@ int __init ip_vs_conn_init(void) pr_info("Connection hash table configured " "(size=%d, memory=%ldKbytes)\n", - IP_VS_CONN_TAB_SIZE, - (long)(IP_VS_CONN_TAB_SIZE*sizeof(struct list_head))/1024); + ip_vs_conn_tab_size, + (long)(ip_vs_conn_tab_size*sizeof(struct list_head))/1024); IP_VS_DBG(0, "Each connection entry needs %Zd bytes at least\n", sizeof(struct ip_vs_conn)); - for (idx = 0; idx < IP_VS_CONN_TAB_SIZE; idx++) { + for (idx = 0; idx < ip_vs_conn_tab_size; idx++) { INIT_LIST_HEAD(&ip_vs_conn_tab[idx]); } Index: net-next-2.6/net/netfilter/ipvs/ip_vs_ctl.c =================================================================== --- net-next-2.6.orig/net/netfilter/ipvs/ip_vs_ctl.c 2009-12-29 10:18:56.000000000 +0900 +++ net-next-2.6/net/netfilter/ipvs/ip_vs_ctl.c 2009-12-29 10:19:13.000000000 +0900 @@ -1843,7 +1843,7 @@ static int ip_vs_info_seq_show(struct se if (v == SEQ_START_TOKEN) { seq_printf(seq, "IP Virtual Server version %d.%d.%d (size=%d)\n", - NVERSION(IP_VS_VERSION_CODE), IP_VS_CONN_TAB_SIZE); + NVERSION(IP_VS_VERSION_CODE), ip_vs_conn_tab_size); seq_puts(seq, "Prot LocalAddress:Port Scheduler Flags\n"); seq_puts(seq, @@ -2374,7 +2374,7 @@ do_ip_vs_get_ctl(struct sock *sk, int cm char buf[64]; sprintf(buf, "IP Virtual Server version %d.%d.%d (size=%d)", - NVERSION(IP_VS_VERSION_CODE), IP_VS_CONN_TAB_SIZE); + NVERSION(IP_VS_VERSION_CODE), ip_vs_conn_tab_size); if (copy_to_user(user, buf, strlen(buf)+1) != 0) { ret = -EFAULT; goto out; @@ -2387,7 +2387,7 @@ do_ip_vs_get_ctl(struct sock *sk, int cm { struct ip_vs_getinfo info; info.version = IP_VS_VERSION_CODE; - info.size = IP_VS_CONN_TAB_SIZE; + info.size = ip_vs_conn_tab_size; info.num_services = ip_vs_num_services; if (copy_to_user(user, &info, sizeof(info)) != 0) ret = -EFAULT; @@ -3231,7 +3231,7 @@ static int ip_vs_genl_get_cmd(struct sk_ case IPVS_CMD_GET_INFO: NLA_PUT_U32(msg, IPVS_INFO_ATTR_VERSION, IP_VS_VERSION_CODE); NLA_PUT_U32(msg, IPVS_INFO_ATTR_CONN_TAB_SIZE, - IP_VS_CONN_TAB_SIZE); + ip_vs_conn_tab_size); break; }