diff mbox

net: ICMPv6 packets transmitted on wrong interface if nfmark is mangled

Message ID 1354538763-2678-1-git-send-email-dries.dewinter@gmail.com
State Deferred
Headers show

Commit Message

Dries De Winter Dec. 3, 2012, 12:46 p.m. UTC
The IPv6 mangle table may change the source/destination address and skb->mark
of a packet. Therefore it may be necessary to "reroute" a packet after it
traversed this table. But this should not happen for some special packets like
neighbour solicitations and MLD reports: they have an explicit destination, not
originating from the routing table. Rerouting these packets may cause them to
go out on the wrong interface or not to go out at all depending on the routing
table.

This patch allows to mark a dst_entry as "non-reroutable". icmp6_dst_alloc()
(used by ndisc and MLD implementation) will always mark the allocated dst_entry
as such. A check is added to netfilter (IPv6-only) so packets heading for a
non-reroutable destination are never rerouted.

Remarks:

(1) dst entries allocated by addrconf_dst_alloc() are added to the routing
table like normal routes and skbuffs get assigned such dst entries by normal
rule lookup / route lookup. Therefore it's not needed to mark those dst
entries as non-reroutable: if an skbuff got assigned such a dst entry by
normal routing in the first place, and the changes done by the mangle table
don't affect routing, rerouting the packet will get it there too.

(2) Similar logic exists in IPv4 so local multicast/broadcast messages are
potentially transmitted on the wrong interface. However, it's a less likely
corner case there because those packets are treated differently by local
output routing: multicast/broadcast messages are by default routed to the
interface with a matching source IP-address. But this logic is invalid because
it is allowed to (1) send messages with a source IP-address different from
your own and (2) to assign the same IP-address on multiple interfaces.
So ideally in IPv4 some dsts should be marked as non-reroutable as well.

Signed-off-by: Dries De Winter <dries.dewinter@gmail.com>
---
 include/net/dst.h    |    1 +
 net/ipv6/netfilter.c |    4 ++++
 net/ipv6/route.c     |    2 +-
 3 files changed, 6 insertions(+), 1 deletion(-)

Comments

David Miller Dec. 3, 2012, 7:11 p.m. UTC | #1
From: Dries De Winter <dries.dewinter@gmail.com>
Date: Mon,  3 Dec 2012 13:46:03 +0100

> The IPv6 mangle table may change the source/destination address and skb->mark
> of a packet. Therefore it may be necessary to "reroute" a packet after it
> traversed this table. But this should not happen for some special packets like
> neighbour solicitations and MLD reports: they have an explicit destination, not
> originating from the routing table. Rerouting these packets may cause them to
> go out on the wrong interface or not to go out at all depending on the routing
> table.
> 
> This patch allows to mark a dst_entry as "non-reroutable". icmp6_dst_alloc()
> (used by ndisc and MLD implementation) will always mark the allocated dst_entry
> as such. A check is added to netfilter (IPv6-only) so packets heading for a
> non-reroutable destination are never rerouted.
> 
> Remarks:
> 
> (1) dst entries allocated by addrconf_dst_alloc() are added to the routing
> table like normal routes and skbuffs get assigned such dst entries by normal
> rule lookup / route lookup. Therefore it's not needed to mark those dst
> entries as non-reroutable: if an skbuff got assigned such a dst entry by
> normal routing in the first place, and the changes done by the mangle table
> don't affect routing, rerouting the packet will get it there too.
> 
> (2) Similar logic exists in IPv4 so local multicast/broadcast messages are
> potentially transmitted on the wrong interface. However, it's a less likely
> corner case there because those packets are treated differently by local
> output routing: multicast/broadcast messages are by default routed to the
> interface with a matching source IP-address. But this logic is invalid because
> it is allowed to (1) send messages with a source IP-address different from
> your own and (2) to assign the same IP-address on multiple interfaces.
> So ideally in IPv4 some dsts should be marked as non-reroutable as well.
> 
> Signed-off-by: Dries De Winter <dries.dewinter@gmail.com>

Thinking about this some more I can't see how this is correct.

What if netfilter modified one of the keys that go into the route
lookup such as the source or destination address?

That's the whole point of this reroute call.

I'm not applying this, sorry.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dries De Winter Dec. 3, 2012, 9:31 p.m. UTC | #2
2012/12/3 David Miller <davem@davemloft.net>:
> Thinking about this some more I can't see how this is correct.
>
> What if netfilter modified one of the keys that go into the route
> lookup such as the source or destination address?
That is a question I have as well. What if the destination address of
a neighbour solicitation is rewritten to some random unicast address
for example? You could say that in that case indeed the routing table
should be followed. But you could also say that ICMPv6 is a
fundamental part of IPv6 and sending out a neighbour solicitation for
instance on a different interface than the one it is intended for is
wrong. Or you could even say that it is a total non-issue because
rewriting the destination address of ICMPv6 is already wrong in the
first place.

Anyway, what if the source address is modified while there is no
source based routing or skb->mark is modified while there is no policy
based routing? In that case routing is not affected but still the
ICMPv6 packet will go out on a different interface than the one you
would expect. This is because the dst of such packet is special in the
sense that it is not referred to by the routing table, so when
rerouting the packet it is impossible to find back the original
destination.

Not fixing this means that skb->mark is unavailable for use on ICMPv6
packets because it will inevitably put those packets on the wrong
interface. I use skb->mark for QoS, not for routing so I don't expect
the outgoing interface to be affected by my markers. Now that I know
this issue, it is easy enough for me to work around, but I suspect
that I'm not the only one in the world that uses skb->mark for other
purposes than routing. Moreover, the road from seeing a neighbour
solicitation or MLD report going out on the wrong interface to finding
this limitation can be quite painful. Anyway, in the end it's up to
you to decide of course.

Kind regards,

Dries.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Dec. 3, 2012, 10:06 p.m. UTC | #3
From: Dries De Winter <dries.dewinter@gmail.com>
Date: Mon, 3 Dec 2012 22:31:51 +0100

> Not fixing this means that skb->mark is unavailable for use on ICMPv6
> packets because it will inevitably put those packets on the wrong
> interface.

Maybe this suggests that a better fix is to simply explicitly check
for protocol ICMPV6 in ip6_route_me_harder().
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jan Engelhardt Dec. 3, 2012, 11:38 p.m. UTC | #4
On Monday 2012-12-03 22:31, Dries De Winter wrote:
>
>Not fixing this means that skb->mark is unavailable for use on ICMPv6
>packets because it will inevitably put those packets on the wrong
>interface. [...]
>I use skb->mark for QoS, not for routing so I don't expect
>the outgoing interface to be affected by my markers.

Why would it do that, if one has no routes joined to a fwmark NNN
routing rule?
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Pablo Neira Ayuso Dec. 3, 2012, 11:52 p.m. UTC | #5
On Tue, Dec 04, 2012 at 12:38:25AM +0100, Jan Engelhardt wrote:
> 
> On Monday 2012-12-03 22:31, Dries De Winter wrote:
> >
> >Not fixing this means that skb->mark is unavailable for use on ICMPv6
> >packets because it will inevitably put those packets on the wrong
> >interface. [...]
> >
> >I use skb->mark for QoS, not for routing so I don't expect
> >the outgoing interface to be affected by my markers.
> 
> Why would it do that, if one has no routes joined to a fwmark NNN
> routing rule?

iptables_mangle assumes that ip_route_me_harder needs to be called if
the mark has changed.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dries De Winter Dec. 5, 2012, 1:41 p.m. UTC | #6
2012/12/3 David Miller <davem@davemloft.net>:
> From: Dries De Winter <dries.dewinter@gmail.com>
> Date: Mon, 3 Dec 2012 22:31:51 +0100
>
>> Not fixing this means that skb->mark is unavailable for use on ICMPv6
>> packets because it will inevitably put those packets on the wrong
>> interface.
>
> Maybe this suggests that a better fix is to simply explicitly check
> for protocol ICMPV6 in ip6_route_me_harder().

Hmmm, maybe my subject line isn't very well chosen ... I don't really
mean all ICMPv6 traffic. ICMPv6 also includes Destination Unreachable,
Echo request/reply ... and lots of other types that are just regular
unicast packets that should follow the routing table like normal
packets. My concern is mainly about neighbour discovery and MLD. And
after having thought this over and over again, this discussion could
be extended to all link-local traffic in general, not to ICMPv6 in
general.

Routing doesn't make much sense for link-local traffic and typically
the sender specifies on what interface its data has to go out. In the
example of MLD and neighbour discovery this is done using those
special dst entries. But for userspace this can be done by specifying
a non-zero sin6_scope_id or by SO_BINDTODEVICE which sets
sk_bound_dev_if. When sending a message with a link-local destination,
these parameters are taken into account by routing.
ip6_route_me_harder() also takes sk_bound_dev_if into account, but it
has no access to the user supplied sin6_scope_id, so depending on the
method used, you may also hit this issue from user space.

My "noreroute" patch will not fix this. Therefore it's indeed maybe
better to add a simple check to ip6_route_me_harder(): not a check for
ICMPv6, but a check for (ipv6_addr_type(&iph->daddr) &
IPV6_ADDR_LINKLOCAL) instead. What do you think?

Dries.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Dec. 5, 2012, 5:57 p.m. UTC | #7
From: Dries De Winter <dries.dewinter@gmail.com>
Date: Wed, 5 Dec 2012 14:41:59 +0100

> My "noreroute" patch will not fix this. Therefore it's indeed maybe
> better to add a simple check to ip6_route_me_harder(): not a check for
> ICMPv6, but a check for (ipv6_addr_type(&iph->daddr) &
> IPV6_ADDR_LINKLOCAL) instead. What do you think?

What if a packet is rewritten from a non-link-local destination address
into a link-local one?  Or vice versa?

Your test will fail in those cases.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Dries De Winter Dec. 6, 2012, 9:11 a.m. UTC | #8
2012/12/5 David Miller <davem@davemloft.net>:
> From: Dries De Winter <dries.dewinter@gmail.com>
> Date: Wed, 5 Dec 2012 14:41:59 +0100
>
>> My "noreroute" patch will not fix this. Therefore it's indeed maybe
>> better to add a simple check to ip6_route_me_harder(): not a check for
>> ICMPv6, but a check for (ipv6_addr_type(&iph->daddr) &
>> IPV6_ADDR_LINKLOCAL) instead. What do you think?
>
> What if a packet is rewritten from a non-link-local destination address
> into a link-local one?  Or vice versa?
>
> Your test will fail in those cases.

You are saying that the decision should be based on the original
destination address rather the modified one? I would say the opposite:

- If a non-link-local destination is changed into a link-local one, it
should certainly not be rerouted because routing doesn't make much
sense for link-local destinations.

- If a link-local destination is changed into a non-link-local one,
why not reroute it according to the new destination?

If you do not agree, we can also put the check in
ip6t_local_out_hook() where the original destination is still
available.

Dries.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/net/dst.h b/include/net/dst.h
index 9a78810..cb6ae51 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -61,6 +61,7 @@  struct dst_entry {
 #define DST_NOPEER		0x0040
 #define DST_FAKE_RTABLE		0x0080
 #define DST_XFRM_TUNNEL		0x0100
+#define DST_NOREROUTE		0x0200
 
 	unsigned short		pending_confirm;
 
diff --git a/net/ipv6/netfilter.c b/net/ipv6/netfilter.c
index 429089c..cf9e871 100644
--- a/net/ipv6/netfilter.c
+++ b/net/ipv6/netfilter.c
@@ -24,6 +24,10 @@  int ip6_route_me_harder(struct sk_buff *skb)
 		.saddr = iph->saddr,
 	};
 
+	dst = skb_dst(skb);
+	if (dst && (dst->flags & DST_NOREROUTE))
+		return 0;
+
 	dst = ip6_route_output(net, skb->sk, &fl6);
 	if (dst->error) {
 		IP6_INC_STATS(net, ip6_dst_idev(dst), IPSTATS_MIB_OUTNOROUTES);
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index b1e6cf0..8fa7db5 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -1225,7 +1225,7 @@  struct dst_entry *icmp6_dst_alloc(struct net_device *dev,
 		}
 	}
 
-	rt->dst.flags |= DST_HOST;
+	rt->dst.flags |= DST_HOST | DST_NOREROUTE;
 	rt->dst.output  = ip6_output;
 	rt->n = neigh;
 	atomic_set(&rt->dst.__refcnt, 1);