Patchwork net: fix /proc/net/ip_mr_cache display

login
register
mail settings
Submitter Benjamin Thery
Date Dec. 1, 2008, 4:02 p.m.
Message ID <20081201160214.965805516@localhost.localdomain>
Download mbox | patch
Permalink /patch/11614/
State Superseded
Delegated to: David Miller
Headers show

Comments

Benjamin Thery - Dec. 1, 2008, 4:02 p.m.
/proc/net/ip6_mr_cache seems to display garbage when showing  unresolved
mfc6_cache entries.

$ cat /proc/net/ip6_mr_cache
Group        Origin    Iif    Pkts  Bytes     Wrong  Oifs
ff05::1     2003::1      1        4   4132        2  2:1    3:1  
ff05::3     2003::1  65535      514      2 -559067475
(addresses modified to increase  readability)

The first line is correct. It is a resolved cache entry, 4 packets used it, etc
The second line represents an unresolved entry, and the columns Pkts(4th),
Bytes(5th) and Wrong(6th) just show garbage.

In struct mfc6_cache, there's an union to store data for resolved and
unresolved cases. And what ipmr_mfc_seq_show() is printing in these 
columns for the unresolved entries is some bytes from mfc6_cache.mfc_un.res.
Bad.
(eg. In our case -559067475 is in fact 0xdead4ead which is the spinlock
magic from mfc6_cache.mfc_un.unres.unresolved.lock.magic).

This patch only prints pkt, bytes and wrong_if when its relevant, ie. 
when showing a resolved cache entry.

Also, mfc->mfc_un.res.pkt, mfc->mfc_un.res.bytes, mfc->mfc_un.res.wrong_if
are unsigned.

The patch also modifies IPv4 ipmr.c that contains similar code.

This patch applies on top of net-next-2.6.

The patch for net-2.6 is slightly different because of the NIP6_FMT to
%pI6 conversion that was made in the seq_printf.

Regards,
Benjamin

Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
---
 net/ipv4/ipmr.c  |   11 ++++++-----
 net/ipv6/ip6mr.c |   11 ++++++-----
 2 files changed, 12 insertions(+), 10 deletions(-)



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Thery - Dec. 1, 2008, 4:49 p.m.
Argh, my patch breaks iproute2 command: ip mroute show

iproute2 uses sscanf() to read /proc/net/ip_mr_cache and expects 6
fields to be read. For the unresolved entries my patch only displays
the first three fields. As a consequence, 'ip mroute show' skips the
unresolved entries. :(

So we can
- either forget this patch and keep displaying garbage information
   in /proc/net/ip[6]_mr_cache for the unresolved entries. (No one
   complained before)
- or may be we can just print '-' or 0 for the fields that have no
   data associated in the unresolved case.

Tell me what you prefer.

Benjamin

Benjamin Thery wrote:
> /proc/net/ip6_mr_cache seems to display garbage when showing  unresolved
> mfc6_cache entries.
> 
> $ cat /proc/net/ip6_mr_cache
> Group        Origin    Iif    Pkts  Bytes     Wrong  Oifs
> ff05::1     2003::1      1        4   4132        2  2:1    3:1  
> ff05::3     2003::1  65535      514      2 -559067475
> (addresses modified to increase  readability)
> 
> The first line is correct. It is a resolved cache entry, 4 packets used it, etc
> The second line represents an unresolved entry, and the columns Pkts(4th),
> Bytes(5th) and Wrong(6th) just show garbage.
> 
> In struct mfc6_cache, there's an union to store data for resolved and
> unresolved cases. And what ipmr_mfc_seq_show() is printing in these 
> columns for the unresolved entries is some bytes from mfc6_cache.mfc_un.res.
> Bad.
> (eg. In our case -559067475 is in fact 0xdead4ead which is the spinlock
> magic from mfc6_cache.mfc_un.unres.unresolved.lock.magic).
> 
> This patch only prints pkt, bytes and wrong_if when its relevant, ie. 
> when showing a resolved cache entry.
> 
> Also, mfc->mfc_un.res.pkt, mfc->mfc_un.res.bytes, mfc->mfc_un.res.wrong_if
> are unsigned.
> 
> The patch also modifies IPv4 ipmr.c that contains similar code.
> 
> This patch applies on top of net-next-2.6.
> 
> The patch for net-2.6 is slightly different because of the NIP6_FMT to
> %pI6 conversion that was made in the seq_printf.
> 
> Regards,
> Benjamin
> 
> Signed-off-by: Benjamin Thery <benjamin.thery@bull.net>
> ---
>  net/ipv4/ipmr.c  |   11 ++++++-----
>  net/ipv6/ip6mr.c |   11 ++++++-----
>  2 files changed, 12 insertions(+), 10 deletions(-)
> 
> Index: net-next-2.6/net/ipv4/ipmr.c
> ===================================================================
> --- net-next-2.6.orig/net/ipv4/ipmr.c
> +++ net-next-2.6/net/ipv4/ipmr.c
> @@ -1879,15 +1879,16 @@ static int ipmr_mfc_seq_show(struct seq_
>  		const struct mfc_cache *mfc = v;
>  		const struct ipmr_mfc_iter *it = seq->private;
>  
> -		seq_printf(seq, "%08lX %08lX %-3d %8ld %8ld %8ld",
> +		seq_printf(seq, "%08lX %08lX %-3d",
>  			   (unsigned long) mfc->mfc_mcastgrp,
>  			   (unsigned long) mfc->mfc_origin,
> -			   mfc->mfc_parent,
> -			   mfc->mfc_un.res.pkt,
> -			   mfc->mfc_un.res.bytes,
> -			   mfc->mfc_un.res.wrong_if);
> +			   mfc->mfc_parent);
>  
>  		if (it->cache != &mfc_unres_queue) {
> +			seq_printf(seq, " %8lu %8lu %8lu",
> +				   mfc->mfc_un.res.pkt,
> +				   mfc->mfc_un.res.bytes,
> +				   mfc->mfc_un.res.wrong_if);
>  			for (n = mfc->mfc_un.res.minvif;
>  			     n < mfc->mfc_un.res.maxvif; n++ ) {
>  				if (VIF_EXISTS(n)
> Index: net-next-2.6/net/ipv6/ip6mr.c
> ===================================================================
> --- net-next-2.6.orig/net/ipv6/ip6mr.c
> +++ net-next-2.6/net/ipv6/ip6mr.c
> @@ -297,14 +297,15 @@ static int ipmr_mfc_seq_show(struct seq_
>  		const struct mfc6_cache *mfc = v;
>  		const struct ipmr_mfc_iter *it = seq->private;
>  
> -		seq_printf(seq, "%pI6 %pI6 %-3d %8ld %8ld %8ld",
> +		seq_printf(seq, "%pI6 %pI6 %-3d",
>  			   &mfc->mf6c_mcastgrp, &mfc->mf6c_origin,
> -			   mfc->mf6c_parent,
> -			   mfc->mfc_un.res.pkt,
> -			   mfc->mfc_un.res.bytes,
> -			   mfc->mfc_un.res.wrong_if);
> +			   mfc->mf6c_parent);
>  
>  		if (it->cache != &mfc_unres_queue) {
> +			seq_printf(seq, " %8lu %8lu %8lu",
> +				   mfc->mfc_un.res.pkt,
> +				   mfc->mfc_un.res.bytes,
> +				   mfc->mfc_un.res.wrong_if);
>  			for (n = mfc->mfc_un.res.minvif;
>  			     n < mfc->mfc_un.res.maxvif; n++) {
>  				if (MIF_EXISTS(n) &&
> 
> 
> 
>
stephen hemminger - Dec. 1, 2008, 5:33 p.m.
On Mon, 01 Dec 2008 17:49:56 +0100
Benjamin Thery <benjamin.thery@bull.net> wrote:

> Argh, my patch breaks iproute2 command: ip mroute show
> 
> iproute2 uses sscanf() to read /proc/net/ip_mr_cache and expects 6
> fields to be read. For the unresolved entries my patch only displays
> the first three fields. As a consequence, 'ip mroute show' skips the
> unresolved entries. :(
> 
> So we can
> - either forget this patch and keep displaying garbage information
>    in /proc/net/ip[6]_mr_cache for the unresolved entries. (No one
>    complained before)
> - or may be we can just print '-' or 0 for the fields that have no
>    data associated in the unresolved case.
> 
> Tell me what you prefer.
> 
> Benjamin

/proc file formats are part of the Linux ABI and can't change!
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Thery - Dec. 1, 2008, 8:17 p.m.
Stephen Hemminger <shemminger@vyatta.com> a écrit :

> On Mon, 01 Dec 2008 17:49:56 +0100
> Benjamin Thery <benjamin.thery@bull.net> wrote:
>
>> Argh, my patch breaks iproute2 command: ip mroute show
>>
>> iproute2 uses sscanf() to read /proc/net/ip_mr_cache and expects 6
>> fields to be read. For the unresolved entries my patch only displays
>> the first three fields. As a consequence, 'ip mroute show' skips the
>> unresolved entries. :(
>>
>> So we can
>> - either forget this patch and keep displaying garbage information
>>    in /proc/net/ip[6]_mr_cache for the unresolved entries. (No one
>>    complained before)
>> - or may be we can just print '-' or 0 for the fields that have no
>>    data associated in the unresolved case.
>>
>> Tell me what you prefer.
>>
>> Benjamin
>
> /proc file formats are part of the Linux ABI and can't change!

Yes, of course I know that. But as in this case part of the data
displayed in /proc/net/ip_mr_cache and /proc/net/ip6_mr_cache for
unresolved cache entries are complete rubbish (see the patch
description), I thought it can be improved.

The right way to fix it, IMHO, is to print 0 (zero) in the columns
that have no meaning for the unresolved entries. That way we don't
break the ABI: the userspace expects to get at least 6 numbers for
each entries, it gets 6 numbers. It's easy to figure what zeros
represent and this prevent people from wasting time trying to figure
what to do with these "random" numbers on the unresolved entries, no?


Regards,
Benjamin




>
>



----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller - Dec. 2, 2008, 11:03 p.m.
From: "Benjamin Thery " <Benjamin.Thery@bull.net>
Date: Mon, 01 Dec 2008 21:17:02 +0100

> The right way to fix it, IMHO, is to print 0 (zero) in the columns
> that have no meaning for the unresolved entries. That way we don't
> break the ABI: the userspace expects to get at least 6 numbers for
> each entries, it gets 6 numbers. It's easy to figure what zeros
> represent and this prevent people from wasting time trying to figure
> what to do with these "random" numbers on the unresolved entries, no?

Probably, this is correct.

However, we could run into problems if userland parsers expect
6 entries and then expect an immediate newline.  We'd break that.

The only thing that really works for extending files like this
is if they are already exporting a "key: value" interface, then
you can add new lines safely.

Doing this horizontal expansion as you are proposing here is,
on the other hand, very risky and dangerous.

I really don't think it's worth it.

Fix the garbage values, or flush them to zero if we can't
represent them properly.  But don't add new stuff horizontally
to "fill in the gaps", as I don't think it can be done %100
safely.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Benjamin Thery - Dec. 3, 2008, 1:48 p.m.
David Miller wrote:
> From: "Benjamin Thery " <Benjamin.Thery@bull.net>
> Date: Mon, 01 Dec 2008 21:17:02 +0100
> 
>> The right way to fix it, IMHO, is to print 0 (zero) in the columns
>> that have no meaning for the unresolved entries. That way we don't
>> break the ABI: the userspace expects to get at least 6 numbers for
>> each entries, it gets 6 numbers. It's easy to figure what zeros
>> represent and this prevent people from wasting time trying to figure
>> what to do with these "random" numbers on the unresolved entries, no?
> 
> Probably, this is correct.
> 
> However, we could run into problems if userland parsers expect
> 6 entries and then expect an immediate newline.  We'd break that.
> 
> The only thing that really works for extending files like this
> is if they are already exporting a "key: value" interface, then
> you can add new lines safely.
> 
> Doing this horizontal expansion as you are proposing here is,
> on the other hand, very risky and dangerous.
 >
 > I really don't think it's worth it.
 >
 > Fix the garbage values, or flush them to zero if we can't
 > represent them properly.  But don't add new stuff horizontally
 > to "fill in the gaps", as I don't think it can be done %100
 > safely.

I won't do any horizontal expansion.
Maybe my explanations where not very clear (due to my bad english I
guess). Let's try to reformulate :)

Today, /proc/net/ip_mr_cache output looks like this:

[root@qemu tests]# cat /proc/net/ip_mr_cache
Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
014C00EF 010014AC 1         10    10050        0  2:1    3:1
024C00EF 010014AC 65535      514        2 -559067475

It displays resolved and unresolved mfc_caches.
Both type of entries have at least 6 fields printed (resolved have
more).
1st line is a resolved cache entry.
2nd line is an unresolved cache entry.

In the case of unresolved entries, columns Pkts, Bytes and Wrong
displays garbage data (due to the union in struct mfc_cache).

My first patch was completely wrong as it suppressed these fields
for the unresolved lines, but my proposal now is _only_ to replace the
garbage values with zeros. Like this:

[root@qemu tests]# cat /proc/net/ip_mr_cache
Group    Origin   Iif     Pkts    Bytes    Wrong Oifs
014C00EF 010014AC 1          8     8040        0  2:1    3:1
024C00EF 010014AC 65535      0        0        0

It shouldn't break the ABI.
I will send an updated patch for this.



Also, there is another bug.
Today, iproute2 already fails to show unresolved entries because it
expects to see -1 in 'Iif' column to identify unresolved entries but the
kernel output 65535. It's a signed/unsigned issue:

'Iif', the source interface, is retrieved from member mfc_parent in
struct mfc_cache. mfc_parent is a vifi_t: unsigned short, but is
displayed in ipmr_mfc_seq_show() as "%-3d", signed integer.

In unresolevd entries, the 65535 value (0xFFFF) comes from this define:
#define ALL_VIFS	((vifi_t)(-1))

That may explains why the guy who added support for this in iproute2
thought a -1 should be expected.

I don't know if this must be fixed in kernel or in iproute2. Who is
right?

I will send another patch for this one, which fix 'Iif' display
and let you or Stephen decide if it should be merged in kernel
or fixed in iproute2.

Regards,
Benjamin

Patch

Index: net-next-2.6/net/ipv4/ipmr.c
===================================================================
--- net-next-2.6.orig/net/ipv4/ipmr.c
+++ net-next-2.6/net/ipv4/ipmr.c
@@ -1879,15 +1879,16 @@  static int ipmr_mfc_seq_show(struct seq_
 		const struct mfc_cache *mfc = v;
 		const struct ipmr_mfc_iter *it = seq->private;
 
-		seq_printf(seq, "%08lX %08lX %-3d %8ld %8ld %8ld",
+		seq_printf(seq, "%08lX %08lX %-3d",
 			   (unsigned long) mfc->mfc_mcastgrp,
 			   (unsigned long) mfc->mfc_origin,
-			   mfc->mfc_parent,
-			   mfc->mfc_un.res.pkt,
-			   mfc->mfc_un.res.bytes,
-			   mfc->mfc_un.res.wrong_if);
+			   mfc->mfc_parent);
 
 		if (it->cache != &mfc_unres_queue) {
+			seq_printf(seq, " %8lu %8lu %8lu",
+				   mfc->mfc_un.res.pkt,
+				   mfc->mfc_un.res.bytes,
+				   mfc->mfc_un.res.wrong_if);
 			for (n = mfc->mfc_un.res.minvif;
 			     n < mfc->mfc_un.res.maxvif; n++ ) {
 				if (VIF_EXISTS(n)
Index: net-next-2.6/net/ipv6/ip6mr.c
===================================================================
--- net-next-2.6.orig/net/ipv6/ip6mr.c
+++ net-next-2.6/net/ipv6/ip6mr.c
@@ -297,14 +297,15 @@  static int ipmr_mfc_seq_show(struct seq_
 		const struct mfc6_cache *mfc = v;
 		const struct ipmr_mfc_iter *it = seq->private;
 
-		seq_printf(seq, "%pI6 %pI6 %-3d %8ld %8ld %8ld",
+		seq_printf(seq, "%pI6 %pI6 %-3d",
 			   &mfc->mf6c_mcastgrp, &mfc->mf6c_origin,
-			   mfc->mf6c_parent,
-			   mfc->mfc_un.res.pkt,
-			   mfc->mfc_un.res.bytes,
-			   mfc->mfc_un.res.wrong_if);
+			   mfc->mf6c_parent);
 
 		if (it->cache != &mfc_unres_queue) {
+			seq_printf(seq, " %8lu %8lu %8lu",
+				   mfc->mfc_un.res.pkt,
+				   mfc->mfc_un.res.bytes,
+				   mfc->mfc_un.res.wrong_if);
 			for (n = mfc->mfc_un.res.minvif;
 			     n < mfc->mfc_un.res.maxvif; n++) {
 				if (MIF_EXISTS(n) &&