[net-next,02/12] tipc: Add "max_ports" configuration parameter

Message ID 1369942577-39563-3-git-send-email-paul.gortmaker@windriver.com
State Changes Requested, archived
Delegated to: David Miller

Commit Message

Paul Gortmaker May 30, 2013, 7:36 p.m. UTC
From: Erik Hugne <erik.hugne@ericsson.com>

Introduce the "max_ports" module parameter, which allows the maximum
number of ports supported by TIPC to be changed from the default value
at boot, or at module load time. Because of the way the port reference
table is structured and initiated, this value must be known at module
start time, and can not be changed later.

Until now this value has been set via a macro, and hence things
have to be recompiled if the value is to be changed. The Kconfig
knob and the dead code intended to change this parameter at runtime
are dropped.

Considering TIPC node addresses are unique on the entire node, the
64k port limit has proven to be a little too strict.  We increase the
allowed max to 128k. This is safe since the protocol headers allow
for up to 2^32 -1 ports.

Usage for module: "insmod tipc.ko max_ports=<value>" ; at boot, append
"tipc.max_ports=<value>" to the kernel command line.

Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
---
 net/tipc/Kconfig  | 12 ------------
 net/tipc/config.c | 20 +-------------------
 net/tipc/core.c   |  9 +++++++--
 net/tipc/core.h   |  5 ++++-
 4 files changed, 12 insertions(+), 34 deletions(-)

Comments

David Miller May 30, 2013, 10:49 p.m. UTC | #1
From: Paul Gortmaker <paul.gortmaker@windriver.com>
Date: Thu, 30 May 2013 15:36:07 -0400

> From: Erik Hugne <erik.hugne@ericsson.com>
> 
> Introduce the "max_ports" module parameter, which allows the maximum
> number of ports supported by TIPC to be changed from the default value
> at boot, or at module load time. Because of the way the port reference
> table is structured and initiated, this value must be known at module
> start time, and can not be changed later.
> 
> Until now this value has been set via a macro, and hence things
> have to be recompiled if the value is to be changed. The Kconfig
> knob and the dead code intended to change this parameter at runtime
> are dropped.
> 
> Considering TIPC node addresses are unique on the entire node, the
> 64k port limit has proven to be a little too strict.  We increase the
> allowed max to 128k. This is safe since the protocol headers allow
> for up to 2^32 -1 ports.
> 
> Usage for module: "insmod tipc.ko max_ports=<value>" ; at boot, append
> "tipc.max_ports=<value>" to the kernel command line.
> 
> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>

View compile time constants and module parameters as artificial
limits, they are terrible and unnecessary.

There is no reason you cannot restructure this table so that you
can dynamically size it at run time.

Please reimplement it in that way.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Erik Hugne May 31, 2013, 8:25 a.m. UTC | #2
On Thu, May 30, 2013 at 03:49:25PM -0700, David Miller wrote:
> View compile time constants and module parameters as artificial
> limits, they are terrible and unnecessary.
> 
> There is no reason you cannot restructure this table so that you
> can dynamically size it at run time.

The TIPC ref table index is used directly as the port identity in the 
TIPC publications. When a socket is bound, this ID is published to all 
other nodes in the cluster.
If we were to allow the table to be changed dynamically, we would need
to change the port identities for already bound sockets/ports, withdraw
the old identity and publish the new one.
In the best case, this will lead to a temporary interruption for all
TIPC services until the new port IDs have been propagated out to the cluster.

> 
> Please reimplement it in that way.
> 

To allow dynamic resizing without the problem mentioned above, we would need
to invent a new way of handling port IDs. Changing the direct indexing by
port ID to a more generic method independent of the table size will cause
additional overhead in the data path, and I'm not sure it's worth the
performance penalty to be able to change this limit dynamically.

//E
David Miller May 31, 2013, 8:29 a.m. UTC | #3
From: Erik Hugne <erik.hugne@ericsson.com>
Date: Fri, 31 May 2013 10:25:38 +0200

> On Thu, May 30, 2013 at 03:49:25PM -0700, David Miller wrote:
>> View compile time constants and module parameters as artificial
>> limits, they are terrible and unnecessary.
>> 
>> There is no reason you cannot restructure this table so that you
>> can dynamically size it at run time.
> 
> The TIPC ref table index is used directly as the port identity in the 
> TIPC publications. When a socket is bound, this ID is published to all 
> other nodes in the cluster.
> If we were to allow the table to be changed dynamically, we would need
> to change the port identities for already bound sockets/ports, withdraw
> the old identity and publish the new one.

No you do not, simply grow the table just like we dynamically grow
hash tables in response to network/socket activity elsewhere in the
kernel.  You'll only allocate new indexes from the newly allocated
area, the existing indexes will remain the same.

I really will accept no excuses for this limitation, especially if
the response is an ugly module parameter.

Thanks.
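
The grow-only scheme described above can be sketched in userspace C as follows. This is a minimal, single-threaded illustration and not TIPC code: the names `port_table`, `port_table_grow`, `port_table_add` and `port_table_get` are hypothetical, and a kernel version would need its own allocator and locking or RCU, which are omitted here. The point it shows is that doubling the array leaves every existing index untouched, and new indexes come only from the newly allocated area.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable port table; names are illustrative, not TIPC's. */
struct port_table {
	void **slots;      /* slots[i] is the object registered at index i */
	uint32_t capacity; /* number of allocated slots */
	uint32_t used;     /* next never-used index */
};

/* Double the table (or start at 4 slots). Existing entries keep their index. */
static int port_table_grow(struct port_table *t)
{
	uint32_t new_cap = t->capacity ? t->capacity * 2 : 4;
	void **n = realloc(t->slots, new_cap * sizeof(*n));

	if (!n)
		return -1;
	/* Only the newly allocated area needs initialising. */
	for (uint32_t i = t->capacity; i < new_cap; i++)
		n[i] = NULL;
	t->slots = n;
	t->capacity = new_cap;
	return 0;
}

/* Register obj and return its index, or UINT32_MAX if growing failed. */
static uint32_t port_table_add(struct port_table *t, void *obj)
{
	if (t->used == t->capacity && port_table_grow(t) < 0)
		return UINT32_MAX;
	t->slots[t->used] = obj;
	return t->used++;
}

/* Return the object at idx, or NULL if idx was never handed out. */
static void *port_table_get(const struct port_table *t, uint32_t idx)
{
	return idx < t->used ? t->slots[idx] : NULL;
}
```

Starting from a zero-initialised `struct port_table`, an index returned by `port_table_add()` stays valid across any number of later additions, because the table is never shrunk or reordered.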

Erik Hugne May 31, 2013, 8:34 a.m. UTC | #4
On Fri, May 31, 2013 at 01:29:22AM -0700, David Miller wrote:
> From: Erik Hugne <erik.hugne@ericsson.com>
> Date: Fri, 31 May 2013 10:25:38 +0200
> 
> > On Thu, May 30, 2013 at 03:49:25PM -0700, David Miller wrote:
> >> View compile time constants and module parameters as artificial
> >> limits, they are terrible and unnecessary.
> >> 
> >> There is no reason you cannot restructure this table so that you
> >> can dynamically size it at run time.
> > 
> > The TIPC ref table index is used directly as the port identity in the 
> > TIPC publications. When a socket is bound, this ID is published to all 
> > other nodes in the cluster.
> > If we were to allow the table to be changed dynamically, we would need
> > to change the port identities for already bound sockets/ports, withdraw
> > the old identity and publish the new one.
> 
> No you do not, simply grow the table just like we dynamically grow
> hash tables in response to network/socket activity elsewhere in the
> kernel.  You'll only allocate new indexes from the newly allocated
> area, the existing indexes will remain the same.

And if someone tries to reduce the table size?
Should we simply disallow that?

//E
David Miller May 31, 2013, 8:40 a.m. UTC | #5
From: Erik Hugne <erik.hugne@ericsson.com>
Date: Fri, 31 May 2013 10:34:55 +0200

> On Fri, May 31, 2013 at 01:29:22AM -0700, David Miller wrote:
>> From: Erik Hugne <erik.hugne@ericsson.com>
>> Date: Fri, 31 May 2013 10:25:38 +0200
>> 
>> > On Thu, May 30, 2013 at 03:49:25PM -0700, David Miller wrote:
>> >> View compile time constants and module parameters as artificial
>> >> limits, they are terrible and unnecessary.
>> >> 
>> >> There is no reason you cannot restructure this table so that you
>> >> can dynamically size it at run time.
>> > 
>> > The TIPC ref table index is used directly as the port identity in the 
>> > TIPC publications. When a socket is bound, this ID is published to all 
>> > other nodes in the cluster.
>> > If we were to allow the table to be changed dynamically, we would need
>> > to change the port identities for already bound sockets/ports, withdraw
>> > the old identity and publish the new one.
>> 
>> No you do not, simply grow the table just like we dynamically grow
>> hash tables in response to network/socket activity elsewhere in the
>> kernel.  You'll only allocate new indexes from the newly allocated
>> area, the existing indexes will remain the same.
> 
> And if someone tries to reduce the table size?
> Should we simply disallow that?

We never shrink the hash tables once we've grown them.  That's a
reasonable way to behave.
David Laight May 31, 2013, 9:06 a.m. UTC | #6
> > View compile time constants and module parameters as artificial
> > limits, they are terrible and unnecessary.
> >
> > There is no reason you cannot restructure this table so that you
> > can dynamically size it at run time.
> 
> The TIPC ref table index is used directly as the port identity in the
> TIPC publications. When a socket is bound, this ID is published to all
> other nodes in the cluster.
> If we were to allow the table to be changed dynamically, we would need
> to change the port identities for already bound sockets/ports, withdraw
> the old identity and publish the new one.
> In the best case, this will lead to a temporary interruption for all
> TIPC services until the new port IDs have been propagated out to the cluster.

Eh?
Doubling the size of the array doesn't require that the old index
be invalidated. At most it requires an rcu before the old index
array is discarded.

If you use the low bits of the port identity (32 bits?) as the
table index and the higher bits as a seq value to identify stale
references then you need to distribute the old entries into the
correct places in the new table.
This isn't hard and there could be generic 'reference allocator'
that would do this for you.

	David
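
A rough userspace sketch of the kind of reference allocator described above, under stated assumptions: the table size is a power of two, a reference value of 0 marks an empty slot, and all names (`ref_table`, `ref_insert`, `ref_lookup`, `ref_table_double`) are hypothetical rather than taken from TIPC. The low bits of a reference select the slot, the full value is compared on lookup so stale references fail, and doubling moves each entry to the slot its reference selects under the wider mask.

```c
#include <stdint.h>
#include <stdlib.h>

struct ref_entry {
	uint32_t ref;   /* full reference; 0 marks an empty slot */
	void *obj;
};

struct ref_table {
	struct ref_entry *e;
	uint32_t size;  /* always a power of two */
};

static int ref_table_init(struct ref_table *t, uint32_t size)
{
	t->e = calloc(size, sizeof(*t->e));
	t->size = t->e ? size : 0;
	return t->e ? 0 : -1;
}

/* Store obj under ref; fails if ref is 0 or the selected slot is taken. */
static int ref_insert(struct ref_table *t, uint32_t ref, void *obj)
{
	struct ref_entry *s = &t->e[ref & (t->size - 1)];

	if (ref == 0 || s->ref != 0)
		return -1;
	s->ref = ref;
	s->obj = obj;
	return 0;
}

/* Low bits pick the slot; the full comparison rejects stale references. */
static void *ref_lookup(const struct ref_table *t, uint32_t ref)
{
	const struct ref_entry *s = &t->e[ref & (t->size - 1)];

	return (ref != 0 && s->ref == ref) ? s->obj : NULL;
}

/*
 * Double the table. Two entries from different old slots differ in their
 * low bits, so they cannot collide in the new array.
 * Overflow of size * 2 is not checked in this sketch.
 */
static int ref_table_double(struct ref_table *t)
{
	uint32_t new_size = t->size * 2;
	struct ref_entry *n = calloc(new_size, sizeof(*n));

	if (!n)
		return -1;
	for (uint32_t i = 0; i < t->size; i++)
		if (t->e[i].ref)
			n[t->e[i].ref & (new_size - 1)] = t->e[i];
	free(t->e);   /* a kernel version would wait for an RCU grace period first */
	t->e = n;
	t->size = new_size;
	return 0;
}
```

The reference values handed out earlier are never rewritten; only their position inside the array changes, so nothing has to be withdrawn or republished.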



Erik Hugne May 31, 2013, 9:23 a.m. UTC | #7
On Fri, May 31, 2013 at 01:40:28AM -0700, David Miller wrote:
> We never shrink the hash tables once we've grown them.  That's a
> reasonable way to behave.

Very well.
Currently, the refs/portIDs are built up of a random part and an index
part. The index is obtained by masking against the table size.
If the table is allowed to grow, we must remove the random part from the
portID. Otherwise we would index out a nonexistent or wrong port
from a received packet.

But removing the random part would be a violation of the protocol spec, and
could break interop with other implementations (e.g. link selection).

//E
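
The concern above can be made concrete with a little mask arithmetic. The sketch below assumes a power-of-two table size and uses hypothetical helper names (`ref_index`, `index_moves_on_doubling`); it is an illustration, not TIPC code. With 8192 slots the reference 0x12345 selects slot 0x345, but with 16384 slots the same reference selects slot 0x2345, so an entry left in its old slot would no longer be found.

```c
#include <stdint.h>

/* Index of a reference in a power-of-two table: the low bits of the ref. */
static uint32_t ref_index(uint32_t ref, uint32_t table_size)
{
	return ref & (table_size - 1);
}

/*
 * Nonzero if doubling the table changes which slot `ref` selects.
 * That happens exactly when the lowest "random" bit of ref is set,
 * because that bit becomes part of the index under the wider mask.
 */
static int index_moves_on_doubling(uint32_t ref, uint32_t old_size)
{
	return ref_index(ref, old_size) != ref_index(ref, old_size * 2);
}
```

When the slot does change, the new index is always the old index plus the old table size, which is what makes redistributing entries on growth straightforward.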
David Laight May 31, 2013, 9:25 a.m. UTC | #8
> > We never shrink the hash tables once we've grown them.  That's a
> > reasonable way to behave.
> 
> Very well.
> Currently, the refs/portIDs are built up of a random part and an index
> part. The index is obtained by masking against the table size.
> If the table is allowed to grow, we must remove the random part from the
> portID. Otherwise we would index out a nonexistent or wrong port
> from a received packet.

No - you just copy the entry into the correct location in the
new array.
One of the 'random' bits becomes an 'index' bit.

	David



David Miller May 31, 2013, 9:26 a.m. UTC | #9
From: Erik Hugne <erik.hugne@ericsson.com>
Date: Fri, 31 May 2013 11:23:02 +0200

> On Fri, May 31, 2013 at 01:40:28AM -0700, David Miller wrote:
>> We never shrink the hash tables once we've grown them.  That's a
>> reasonable way to behave.
> 
> Very well.
> Currently, the refs/portIDs are built up of a random part and an index
> part. The index is obtained by masking against the table size.
> If the table is allowed to grow, we must remove the random part from the
> portID. Otherwise we would index out a nonexistent or wrong port
> from a received packet.
> 
> But removing the random part would be a violation of the protocol spec, and
> potentially break interop between other implementations (like link selection..)

We allocate ports randomly for IPv4/IPv6 sockets; you can do so just fine too.
Paul Gortmaker May 31, 2013, 5:48 p.m. UTC | #10
On 13-05-30 06:49 PM, David Miller wrote:
> From: Paul Gortmaker <paul.gortmaker@windriver.com>
> Date: Thu, 30 May 2013 15:36:07 -0400
> 
>> From: Erik Hugne <erik.hugne@ericsson.com>
>>
>> Introduce the "max_ports" module parameter, which allows the maximum
>> number of ports supported by TIPC to be changed from the default value
>> at boot, or at module load time. Because of the way the port reference
>> table is structured and initiated, this value must be known at module
>> start time, and can not be changed later.
>>
>> Until now this value has been set via a macro, and hence things
>> have to be recompiled if the value is to be changed. The Kconfig
>> knob and the dead code intended to change this parameter at runtime
>> are dropped.
>>
>> Considering TIPC node addresses are unique on the entire node, the
>> 64k port limit has proven to be a little too strict.  We increase the
>> allowed max to 128k. This is safe since the protocol headers allow
>> for up to 2^32 -1 ports.
>>
>> Usage for module: "insmod tipc.ko max_ports=<value>" ; at boot, append
>> "tipc.max_ports=<value>" to the kernel command line.
>>
>> Signed-off-by: Erik Hugne <erik.hugne@ericsson.com>
>> Reviewed-by: Jon Maloy <jon.maloy@ericsson.com>
>> Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
> 
> View compile time constants and module parameters as artificial
> limits, they are terrible and unnecessary.

I can't argue with that; I was thinking that the module param was
better than a recompile, but as you say, not having it at all is
yet better again.  I'll drop this patch, and if the reimplementation
isn't ready before 3.10-rc6, I'll just resend the series without it.

Thanks,
Paul.
--

> 
> There is no reason you cannot restructure this table so that you
> can dynamically size it at run time.
> 
> Please reimplement it in that way.
> 
> Thanks.
> 
Patch

diff --git a/net/tipc/Kconfig b/net/tipc/Kconfig
index c890848..91c8a8e 100644
--- a/net/tipc/Kconfig
+++ b/net/tipc/Kconfig
@@ -20,18 +20,6 @@  menuconfig TIPC
 
 	  If in doubt, say N.
 
-config TIPC_PORTS
-	int "Maximum number of ports in a node"
-	depends on TIPC
-	range 127 65535
-	default "8191"
-	help
-	  Specifies how many ports can be supported by a node.
-	  Can range from 127 to 65535 ports; default is 8191.
-
-	  Setting this to a smaller value saves some memory,
-	  setting it to higher allows for more ports.
-
 config TIPC_MEDIA_IB
 	bool "InfiniBand media type support"
 	depends on TIPC && INFINIBAND_IPOIB
diff --git a/net/tipc/config.c b/net/tipc/config.c
index f67866c..79cada1 100644
--- a/net/tipc/config.c
+++ b/net/tipc/config.c
@@ -208,22 +208,6 @@  static struct sk_buff *cfg_set_remote_mng(void)
 	return tipc_cfg_reply_none();
 }
 
-static struct sk_buff *cfg_set_max_ports(void)
-{
-	u32 value;
-
-	if (!TLV_CHECK(req_tlv_area, req_tlv_space, TIPC_TLV_UNSIGNED))
-		return tipc_cfg_reply_error_string(TIPC_CFG_TLV_ERROR);
-	value = ntohl(*(__be32 *)TLV_DATA(req_tlv_area));
-	if (value == tipc_max_ports)
-		return tipc_cfg_reply_none();
-	if (value < 127 || value > 65535)
-		return tipc_cfg_reply_error_string(TIPC_CFG_INVALID_VALUE
-						   " (max ports must be 127-65535)");
-	return tipc_cfg_reply_error_string(TIPC_CFG_NOT_SUPPORTED
-		" (cannot change max ports while TIPC is active)");
-}
-
 static struct sk_buff *cfg_set_netid(void)
 {
 	u32 value;
@@ -324,9 +308,6 @@  struct sk_buff *tipc_cfg_do_cmd(u32 orig_node, u16 cmd, const void *request_area
 	case TIPC_CMD_SET_REMOTE_MNG:
 		rep_tlv_buf = cfg_set_remote_mng();
 		break;
-	case TIPC_CMD_SET_MAX_PORTS:
-		rep_tlv_buf = cfg_set_max_ports();
-		break;
 	case TIPC_CMD_SET_NETID:
 		rep_tlv_buf = cfg_set_netid();
 		break;
@@ -356,6 +337,7 @@  struct sk_buff *tipc_cfg_do_cmd(u32 orig_node, u16 cmd, const void *request_area
 	case TIPC_CMD_SET_MAX_PUBL:
 	case TIPC_CMD_GET_MAX_PUBL:
 	case TIPC_CMD_SET_LOG_SIZE:
+	case TIPC_CMD_SET_MAX_PORTS:
 	case TIPC_CMD_DUMP_LOG:
 		rep_tlv_buf = tipc_cfg_reply_error_string(TIPC_CFG_NOT_SUPPORTED
 							  " (obsolete command)");
diff --git a/net/tipc/core.c b/net/tipc/core.c
index 7ec2c1e..f8abe8e 100644
--- a/net/tipc/core.c
+++ b/net/tipc/core.c
@@ -47,10 +47,11 @@  int tipc_random __read_mostly;
 
 /* configurable TIPC parameters */
 u32 tipc_own_addr __read_mostly;
-int tipc_max_ports __read_mostly;
+unsigned int tipc_max_ports __read_mostly;
 int tipc_net_id __read_mostly;
 int tipc_remote_management __read_mostly;
 
+static unsigned int max_ports = TIPC_DEFAULT_PORTS;
 
 /**
  * tipc_buf_acquire - creates a TIPC message buffer
@@ -157,7 +158,8 @@  static int __init tipc_init(void)
 
 	tipc_own_addr = 0;
 	tipc_remote_management = 1;
-	tipc_max_ports = CONFIG_TIPC_PORTS;
+	tipc_max_ports = clamp_t(unsigned int, max_ports,
+				 TIPC_MIN_PORTS, TIPC_MAX_PORTS);
 	tipc_net_id = 4711;
 
 	res = tipc_core_start();
@@ -181,3 +183,6 @@  module_exit(tipc_exit);
 MODULE_DESCRIPTION("TIPC: Transparent Inter Process Communication");
 MODULE_LICENSE("Dual BSD/GPL");
 MODULE_VERSION(TIPC_MOD_VER);
+
+module_param(max_ports, uint, S_IRUGO);
+MODULE_PARM_DESC(max_ports, "Maximum number of ports (127 - 128K)");
diff --git a/net/tipc/core.h b/net/tipc/core.h
index 0207db0..9d6a47e 100644
--- a/net/tipc/core.h
+++ b/net/tipc/core.h
@@ -63,6 +63,9 @@ 
 #define ULTRA_STRING_MAX_LEN	32768
 #define TIPC_MAX_SUBSCRIPTIONS	65535
 #define TIPC_MAX_PUBLICATIONS	65535
+#define TIPC_DEFAULT_PORTS	8192
+#define TIPC_MIN_PORTS		127
+#define TIPC_MAX_PORTS		131072
 
 struct tipc_msg;	/* msg.h */
 
@@ -77,7 +80,7 @@  int tipc_snprintf(char *buf, int len, const char *fmt, ...);
  * Global configuration variables
  */
 extern u32 tipc_own_addr __read_mostly;
-extern int tipc_max_ports __read_mostly;
+extern unsigned int tipc_max_ports __read_mostly;
 extern int tipc_net_id __read_mostly;
 extern int tipc_remote_management __read_mostly;
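
For readers unfamiliar with `clamp_t`, the effect of the `tipc_init()` hunk on an out-of-range `max_ports` value can be sketched in plain userspace C. The limits are copied from the `core.h` hunk above; `clamp_ports` is a hypothetical stand-in written for illustration, not a kernel function. The sketch suggests that a value outside the range is silently pulled to the nearest bound rather than rejected.

```c
/* Limits as defined in the core.h hunk of this patch. */
#define TIPC_DEFAULT_PORTS 8192u
#define TIPC_MIN_PORTS     127u
#define TIPC_MAX_PORTS     131072u

/* Userspace stand-in for clamp_t(unsigned int, v, lo, hi). */
static unsigned int clamp_ports(unsigned int requested)
{
	if (requested < TIPC_MIN_PORTS)
		return TIPC_MIN_PORTS;
	if (requested > TIPC_MAX_PORTS)
		return TIPC_MAX_PORTS;
	return requested;
}
```

Under this reading, loading the module with `max_ports=0` would yield 127 ports and `max_ports=1000000` would yield 131072, while the default of 8192 passes through unchanged.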