
netconsole: implement ipv4 tos support

Message ID 1322702330-31325-1-git-send-email-zenczykowski@gmail.com
State Rejected, archived
Delegated to: David Miller

Commit Message

Maciej Żenczykowski Dec. 1, 2011, 1:18 a.m. UTC
From: Maciej Żenczykowski <maze@google.com>

Signed-off-by: Maciej Żenczykowski <maze@google.com>
---
 Documentation/networking/netconsole.txt |    1 +
 drivers/net/netconsole.c                |   28 ++++++++++++++++++++++++++++
 include/linux/netpoll.h                 |    1 +
 net/core/netpoll.c                      |    4 +++-
 4 files changed, 33 insertions(+), 1 deletions(-)

Comments

stephen hemminger Dec. 1, 2011, 1:34 a.m. UTC | #1
On Wed, 30 Nov 2011 17:18:50 -0800
Maciej Żenczykowski <zenczykowski@gmail.com> wrote:

> From: Maciej Żenczykowski <maze@google.com>
> 
> Signed-off-by: Maciej Żenczykowski <maze@google.com>

Why make it an option? TOS should always be set to interactive
traffic (like telnet and slogin)

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Maciej Żenczykowski Dec. 1, 2011, 2:11 a.m. UTC | #2
> Why make it an option? TOS should always be set to interactive
> traffic (like telnet and slogin)

Because 'interactive' doesn't mean anything.  You don't know what
value of tos defines 'interactive' traffic on my network.
I may also consider network console/debug/dump/etc. traffic less
important than, say, serving web traffic, and thus not even want it to
be considered 'interactive' in the first place.

TOS values are relevant within a LAN/WAN/Organization/AS, but are not
relevant internet-wide.
Almost all AS-boundary gateways/routers will do TOS remarking to their
own internal specifications.

- Maciej
stephen hemminger Dec. 1, 2011, 2:27 a.m. UTC | #3
On Wed, 30 Nov 2011 18:11:32 -0800
Maciej Żenczykowski <zenczykowski@gmail.com> wrote:

> > Why make it an option? TOS should always be set to interactive
> > traffic (like telnet and slogin)
> 
> Because 'interactive' doesn't mean anything.  You don't know what
> value of tos defines 'interactive' traffic on my network.
> I may also consider network console/debug/dump/etc. traffic less
> important than, say, serving web traffic, and thus not even want it to
> be considered 'interactive' in the first place.
> 
> TOS values are relevant within a LAN/WAN/Organization/AS, but are not
> relevant internet-wide.
> Almost all AS-boundary gateways/routers will do TOS remarking to their
> own internal specifications.
> 
> - Maciej

Giving the user choice in general is good, but network configuration
is already confusing enough.

Although interpretation of TOS is per organization, in practice the
values are standardized in places like RFC4594. For example, routing protocols
all use IPTOS_PREC_INTERNETCONTROL = 0xc0
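
For reference, the 0xc0 value cited above can be unpacked with a few bit operations. The following is a minimal userspace sketch, not kernel code. The helper names (tos_precedence, tos_dscp, tos_ecn, tos_make) are made up for illustration, and the layout assumed is the RFC 2474 / RFC 3168 one: upper six bits DSCP, lower two bits ECN, with the older precedence field occupying the top three bits.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helpers (hypothetical names), not part of the patch. */

/* Legacy IP precedence: top three bits, left in place (0xc0 stays 0xc0). */
static inline uint8_t tos_precedence(uint8_t tos)
{
	return tos & 0xe0;
}

/* DSCP: upper six bits, shifted down to a 0..63 codepoint. */
static inline uint8_t tos_dscp(uint8_t tos)
{
	return tos >> 2;
}

/* ECN: lower two bits. */
static inline uint8_t tos_ecn(uint8_t tos)
{
	return tos & 0x03;
}

/* Rebuild a TOS byte from a DSCP codepoint and an ECN value. */
static inline uint8_t tos_make(uint8_t dscp, uint8_t ecn)
{
	assert(dscp <= 0x3f && ecn <= 0x03);
	return (uint8_t)((dscp << 2) | ecn);
}
```

With these helpers, 0xc0 decodes to precedence 0xc0, DSCP 48 (CS6) and ECN 0. Whether a given network actually honours that codepoint is exactly the point under dispute in this thread.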


David Miller Dec. 1, 2011, 3:37 a.m. UTC | #4
From: Stephen Hemminger <shemminger@vyatta.com>
Date: Wed, 30 Nov 2011 17:34:35 -0800

> On Wed, 30 Nov 2011 17:18:50 -0800
> Maciej Żenczykowski <zenczykowski@gmail.com> wrote:
> 
>> From: Maciej Żenczykowski <maze@google.com>
>> 
>> Signed-off-by: Maciej Żenczykowski <maze@google.com>
> 
> Why make it an option? TOS should always be set to interactive
> traffic (like telnet and slogin)

Next they will ask for TTL as well.

Nope, sorry, we'll not be turning netconsole into a cluster-f*ck of
every configuration option anyone can come up with.
Eric Dumazet Dec. 1, 2011, 3:50 a.m. UTC | #5
On Wednesday, 30 November 2011 at 18:11 -0800, Maciej Żenczykowski
wrote:

> TOS values are relevant within a LAN/WAN/Organization/AS, but are not
> relevant internet-wide.
> Almost all AS-boundary gateways/routers will do TOS remarking to their
> own internal specifications.

That's a shame, especially if ECN is screwed up.


Maciej Żenczykowski Dec. 1, 2011, 8:59 a.m. UTC | #6
> Although interpretation of TOS is per organization, in practice the

I think you meant 'in theory' :-)

> values are standardized in places like RFC4594. For example, routing protocols
> all use IPTOS_PREC_INTERNETCONTROL = 0xc0

Heh, ran into a bug based on this assumption just earlier today.
Maciej Żenczykowski Dec. 1, 2011, 9:04 a.m. UTC | #7
> Next they will ask for TTL as well.

Interesting suggestion, could certainly see the use, but not coming
from me... :-)

> Nope, sorry, we'll not be turning netconsole into a cluster-f*ck of
> every configuration option anyone can come up with.

More configurability is in general not a bad thing, as long as the
defaults are sane.
Especially when the cost is small (which I'd say is the case here).

--

Side note:

The Linux kernel is in general very inconsistent in how it handles tos.
There are still tons of places where 'TOS & 0x1E' is used, which has
very little meaning (it comes from the long-obsoleted RFC 1349).

IPv4 TOS and IPv6 TCLASS are also not handled similarly - for example
setsockopt(SOL_IP, IP_TOS) will also update sk->sk_priority (according
to an obsolete mapping), but setsockopt(SOL_IPV6, IPV6_TCLASS) doesn't
touch sk->sk_priority.
Maciej Żenczykowski Dec. 1, 2011, 9:07 a.m. UTC | #8
> That's a shame, especially if ECN is screwed up.

The ECN bits will in general not be touched (or will even be correctly
updated as required for ECN to function).
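
A remarking step that leaves ECN alone can be sketched in a couple of lines. This is an illustration only, assuming the RFC 3168 layout (DSCP in the upper six bits, ECN in the lower two); remark_dscp is a hypothetical name, not a kernel or router API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: replace the DSCP part of a TOS byte with a new
 * codepoint while carrying the two ECN bits over unchanged. */
static inline uint8_t remark_dscp(uint8_t tos, uint8_t new_dscp)
{
	assert(new_dscp <= 0x3f);
	return (uint8_t)((new_dscp << 2) | (tos & 0x03));
}
```

A gateway that overwrote the whole byte instead would clear any ECN marking, which is the failure mode raised in the quoted comment.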

However, standard ECN isn't particularly usable in practice anyway
(see Microsoft's DCTCP paper, I believe, for a better alternative).

(There are of course still places in the internet - ran into one at a
motel just recently - where ECN is just plain outright blackholed)
Maciej Żenczykowski Dec. 1, 2011, 9:25 a.m. UTC | #9
>> Although interpretation of TOS is per organization, in practice the
>> values are standardized in places like RFC4594. For example, routing protocols
>> all use IPTOS_PREC_INTERNETCONTROL = 0xc0

One more thing should probably be mentioned here.
TOS interpretation is directly influenced by low level hardware
aspects of an organization's network.
Things like the number of hw priority queues and their scheduling/drop
types supported by all hosts, switches and routers across the TOS
domain.  What sort of prioritization can be done, how to map these
levels onto older hardware which supports only a subset of features
(or a smaller number of queues).  Etc.

Furthermore, as we well know, TOS isn't just a linear priority from
lowest to highest.  There are very different aspects one may care about.
Latency (web pages), jitter (VoIP), packet loss (kernel crash dumps),
monetary cost (data copies), bandwidth (video), failure domains (i.e.
presence of single points of failure), separation of loss-tolerant
from loss-intolerant traffic, etc., are all still very much desirable
in certain circumstances.

Furthermore, it may be desirable for tos to have some unusual
behaviours - for example, a tos priority increase after traversing
expensive links (for example cross-oceanic links) - since it would be
a pity to drop traffic close to its destination in favour of traffic
which is still close to its source.

Different people/orgs are bound to care about different aspects (axes).

Anyway... enough beating a dead horse.

- Maciej

Patch

diff --git a/Documentation/networking/netconsole.txt b/Documentation/networking/netconsole.txt
index 8d02207..a02374f 100644
--- a/Documentation/networking/netconsole.txt
+++ b/Documentation/networking/netconsole.txt
@@ -94,6 +94,7 @@  The interface exposes these parameters of a netconsole target to userspace:
 	remote_ip	Remote agent's IP address		(read-write)
 	local_mac	Local interface's MAC address		(read-only)
 	remote_mac	Remote agent's MAC address		(read-write)
+	tos		TOS byte value to utilize		(read-write)
 
 The "enabled" attribute is also used to control whether the parameters of
 a target can be updated or not -- you can modify the parameters of only
diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index e888202..d24d5df 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -90,6 +90,7 @@  static DEFINE_SPINLOCK(target_list_lock);
  *		remote_ip	(read-write)
  *		local_mac	(read-only)
  *		remote_mac	(read-write)
+ *		tos		(read-write)
  */
 struct netconsole_target {
 	struct list_head	list;
@@ -221,6 +222,7 @@  static void free_param_target(struct netconsole_target *nt)
  *				|	remote_ip
  *				|	local_mac
  *				|	remote_mac
+ *				|	tos
  *				|
  *				<target>/...
  */
@@ -288,6 +290,11 @@  static ssize_t show_remote_mac(struct netconsole_target *nt, char *buf)
 	return snprintf(buf, PAGE_SIZE, "%pM\n", nt->np.remote_mac);
 }
 
+static ssize_t show_tos(struct netconsole_target *nt, char *buf)
+{
+	return snprintf(buf, PAGE_SIZE, "%d\n", nt->np.tos);
+}
+
 /*
  * This one is special -- targets created through the configfs interface
  * are not enabled (and the corresponding netpoll activated) by default.
@@ -451,6 +458,25 @@  static ssize_t store_remote_mac(struct netconsole_target *nt,
 	return strnlen(buf, count);
 }
 
+static ssize_t store_tos(struct netconsole_target *nt,
+			 const char *buf,
+			 size_t count)
+{
+	int rv;
+
+	if (nt->enabled) {
+		printk(KERN_ERR "netconsole: target (%s) is enabled, "
+				"disable to update parameters\n",
+				config_item_name(&nt->item));
+		return -EINVAL;
+	}
+
+	rv = kstrtou8(buf, 10, &nt->np.tos);
+	if (rv < 0)
+		return rv;
+	return strnlen(buf, count);
+}
+
 /*
  * Attribute definitions for netconsole_target.
  */
@@ -471,6 +497,7 @@  NETCONSOLE_TARGET_ATTR_RW(local_ip);
 NETCONSOLE_TARGET_ATTR_RW(remote_ip);
 NETCONSOLE_TARGET_ATTR_RO(local_mac);
 NETCONSOLE_TARGET_ATTR_RW(remote_mac);
+NETCONSOLE_TARGET_ATTR_RW(tos);
 
 static struct configfs_attribute *netconsole_target_attrs[] = {
 	&netconsole_target_enabled.attr,
@@ -481,6 +508,7 @@  static struct configfs_attribute *netconsole_target_attrs[] = {
 	&netconsole_target_remote_ip.attr,
 	&netconsole_target_local_mac.attr,
 	&netconsole_target_remote_mac.attr,
+	&netconsole_target_tos.attr,
 	NULL,
 };
 
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 5dfa091..d659c8b 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -21,6 +21,7 @@  struct netpoll {
 	__be32 local_ip, remote_ip;
 	u16 local_port, remote_port;
 	u8 remote_mac[ETH_ALEN];
+	u8 tos;
 
 	struct list_head rx; /* rx_np list element */
 };
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 0d38808..b01ce07 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -388,7 +388,7 @@  void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
 
 	/* iph->version = 4; iph->ihl = 5; */
 	put_unaligned(0x45, (unsigned char *)iph);
-	iph->tos      = 0;
+	iph->tos      = np->tos;
 	put_unaligned(htons(ip_len), &(iph->tot_len));
 	iph->id       = 0;
 	iph->frag_off = 0;
@@ -639,6 +639,8 @@  void netpoll_print_options(struct netpoll *np)
 			 np->name, &np->remote_ip);
 	printk(KERN_INFO "%s: remote ethernet address %pM\n",
 	                 np->name, np->remote_mac);
+	printk(KERN_INFO "%s: tos value %d (0x%02x)\n",
+	                 np->name, np->tos, np->tos);
 }
 EXPORT_SYMBOL(netpoll_print_options);
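
The store_tos() handler above relies on kstrtou8(buf, 10, ...) to accept only a decimal value that fits in a u8. As a rough userspace approximation of that validation (a sketch under assumptions, not the kernel implementation), the hypothetical helper below accepts an optional single trailing newline, as a write from "echo" would carry, and rejects everything else. It is slightly stricter than the kernel helper in that it also refuses a leading '+'.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace helper: parse a decimal TOS value in 0..255.
 * Returns 0 on success, -EINVAL on malformed input, -ERANGE if the
 * number does not fit in a byte. */
static int parse_tos(const char *s, uint8_t *out)
{
	char *end;
	unsigned long v;

	assert(s != NULL && out != NULL);

	/* strtoul() would skip whitespace and accept a sign; refuse both. */
	if (!isdigit((unsigned char)s[0]))
		return -EINVAL;

	errno = 0;
	v = strtoul(s, &end, 10);

	/* Allow one trailing newline, then require end of string. */
	if (end[0] == '\n')
		end++;
	if (end[0] != '\0')
		return -EINVAL;

	if (errno == ERANGE || v > UINT8_MAX)
		return -ERANGE;

	*out = (uint8_t)v;
	return 0;
}
```

With this helper, "184\n" parses to 184 (0xb8), while "256", "-1" and "0x10" are all rejected, so a target's tos attribute as proposed would only ever hold a value that fits the IPv4 header field.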