Message ID | alpine.LFD.2.00.1106182036001.1425@ja.ssi.bg |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
From: Julian Anastasov <ja@ssi.bg> Date: Sat, 18 Jun 2011 20:53:59 +0300 (EEST) > Hm, if it happens "sometimes", can it be some > problem with tproxy and TIME_WAIT sockets? I see that > tproxy_sk_is_transparent has special treatment for TW > sockets while ip_route_me_harder is different. As result, > may be input route is assigned for TW packets. > > May be inet_sk_flowi_flags() needs fixing, not > sure. But following patch is first step to fix this > problem. I don't have setup to test this patch. TPROXY has special code to make sure that time-wait sockets are not assigned to skb->sk, as explained in commit d503b30bd648b3cb4e5f50b65d27e389960cc6d9, that would cause all kinds of crashes in nfnetlink_log etc. Therefore we would see skb->sk==NULL at ip_route_me_harder() in that case. > =========================================================== > > Avoid creating input routes with ip_route_me_harder. > It does not work for locally generated packets. Instead, > restrict sockets to provide valid saddr for output route (or > unicast saddr for transparent proxy). For other traffic > allow saddr to be unicast or local but if callers forget > to check saddr type use 0 for the output route. > > The resulting handling should be: > > - REJECT TCP: > - in INPUT we can provide addr_type = RTN_LOCAL but > better allow rejecting traffic delivered with > local route (no IP address => use RTN_UNSPEC to > allow also RTN_UNICAST). > - FORWARD: RTN_UNSPEC => allow RTN_LOCAL/RTN_UNICAST > saddr, add fix to ignore RTN_BROADCAST and RTN_MULTICAST > - OUTPUT: RTN_UNSPEC > > - NAT, mangle, ip_queue, nf_ip_reroute: RTN_UNSPEC in LOCAL_OUT > > - IPVS: > - use RTN_LOCAL in LOCAL_OUT and FORWARD after SNAT > to restrict saddr to be local > > Signed-off-by: Julian Anastasov <ja@ssi.bg> Unless someone gives some negative feedback soon I'm going to apply this. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Mon, 27 Jun 2011, David Miller wrote: > From: Julian Anastasov <ja@ssi.bg> > Date: Sat, 18 Jun 2011 20:53:59 +0300 (EEST) > > > Hm, if it happens "sometimes", can it be some > > problem with tproxy and TIME_WAIT sockets? I see that > > tproxy_sk_is_transparent has special treatment for TW > > sockets while ip_route_me_harder is different. As result, > > may be input route is assigned for TW packets. > > > > May be inet_sk_flowi_flags() needs fixing, not > > sure. But following patch is first step to fix this > > problem. I don't have setup to test this patch. > > TPROXY has special code to make sure that time-wait sockets > are not assigned to skb->sk, as explained in commit > d503b30bd648b3cb4e5f50b65d27e389960cc6d9, that would cause > all kinds of crashes in nfnetlink_log etc. > > Therefore we would see skb->sk==NULL at ip_route_me_harder() > in that case. Aha, after this clarification other changes should not be needed. If saddr is translated, now we will use FLOWI_FLAG_ANYSRC. As result, if SNAT happens one day in LOCAL_OUT, the new saddr can be unicast because RTN_UNSPEC is provided for addr_type. If saddr is not changed, it should be already validated when the first route for skb is performed, so TPROXY should work. > > =========================================================== > > > > Avoid creating input routes with ip_route_me_harder. > > It does not work for locally generated packets. Instead, > > restrict sockets to provide valid saddr for output route (or > > unicast saddr for transparent proxy). For other traffic > > allow saddr to be unicast or local but if callers forget > > to check saddr type use 0 for the output route. > > > > The resulting handling should be: > > > > - REJECT TCP: > > - in INPUT we can provide addr_type = RTN_LOCAL but > > better allow rejecting traffic delivered with > > local route (no IP address => use RTN_UNSPEC to > > allow also RTN_UNICAST). > > - FORWARD: RTN_UNSPEC => allow RTN_LOCAL/RTN_UNICAST > > saddr, add fix to ignore RTN_BROADCAST and RTN_MULTICAST > > - OUTPUT: RTN_UNSPEC > > > > - NAT, mangle, ip_queue, nf_ip_reroute: RTN_UNSPEC in LOCAL_OUT > > > > - IPVS: > > - use RTN_LOCAL in LOCAL_OUT and FORWARD after SNAT > > to restrict saddr to be local > > > > Signed-off-by: Julian Anastasov <ja@ssi.bg> > > Unless someone gives some negative feedback soon I'm going to > apply this. Regards -- Julian Anastasov <ja@ssi.bg> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 28.06.2011 05:55, David Miller wrote: >> The resulting handling should be: >> >> - REJECT TCP: >> - in INPUT we can provide addr_type = RTN_LOCAL but >> better allow rejecting traffic delivered with >> local route (no IP address => use RTN_UNSPEC to >> allow also RTN_UNICAST). >> - FORWARD: RTN_UNSPEC => allow RTN_LOCAL/RTN_UNICAST >> saddr, add fix to ignore RTN_BROADCAST and RTN_MULTICAST >> - OUTPUT: RTN_UNSPEC >> >> - NAT, mangle, ip_queue, nf_ip_reroute: RTN_UNSPEC in LOCAL_OUT >> >> - IPVS: >> - use RTN_LOCAL in LOCAL_OUT and FORWARD after SNAT >> to restrict saddr to be local >> >> Signed-off-by: Julian Anastasov<ja@ssi.bg> > > Unless someone gives some negative feedback soon I'm going to > apply this. Can you tell me where it will be pushed? I.e. 3.x kernels only, or does it have a chance to go into 2.6.39.x?
From: Tomasz Chmielewski <mangoo@wpkg.org> Date: Tue, 28 Jun 2011 10:30:11 +0200 > Can you tell me where it will be pushed? > > I.e. 3.x kernels only, or does it have a chance to go into 2.6.39.x? I'll apply it for 3.0.0 and also queue it up for -stable. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Julian Anastasov <ja@ssi.bg> Date: Tue, 28 Jun 2011 11:13:25 +0300 (EEST) > On Mon, 27 Jun 2011, David Miller wrote: > >> TPROXY has special code to make sure that time-wait sockets >> are not assigned to skb->sk, as explained in commit >> d503b30bd648b3cb4e5f50b65d27e389960cc6d9, that would cause >> all kinds of crashes in nfnetlink_log etc. >> >> Therefore we would see skb->sk==NULL at ip_route_me_harder() >> in that case. > > Aha, after this clarification other changes should not > be needed. By this do you mean that you think your patch in this thread is completely sufficient? -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Tue, 28 Jun 2011, David Miller wrote: > From: Julian Anastasov <ja@ssi.bg> > Date: Tue, 28 Jun 2011 11:13:25 +0300 (EEST) > > > On Mon, 27 Jun 2011, David Miller wrote: > > > >> TPROXY has special code to make sure that time-wait sockets > >> are not assigned to skb->sk, as explained in commit > >> d503b30bd648b3cb4e5f50b65d27e389960cc6d9, that would cause > >> all kinds of crashes in nfnetlink_log etc. > >> > >> Therefore we would see skb->sk==NULL at ip_route_me_harder() > >> in that case. > > > > Aha, after this clarification other changes should not > > be needed. > > By this do you mean that you think your patch in this thread > is completely sufficient? Yes. My worry was for the skb->sk != NULL not being handled by inet_sk_flowi_flags for TW sockets. But it seems it is not needed, so the patch in this form should be ok. Regards -- Julian Anastasov <ja@ssi.bg> -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
=========================================================== Avoid creating input routes with ip_route_me_harder. It does not work for locally generated packets. Instead, restrict sockets to provide valid saddr for output route (or unicast saddr for transparent proxy). For other traffic allow saddr to be unicast or local but if callers forget to check saddr type use 0 for the output route. The resulting handling should be: - REJECT TCP: - in INPUT we can provide addr_type = RTN_LOCAL but better allow rejecting traffic delivered with local route (no IP address => use RTN_UNSPEC to allow also RTN_UNICAST). - FORWARD: RTN_UNSPEC => allow RTN_LOCAL/RTN_UNICAST saddr, add fix to ignore RTN_BROADCAST and RTN_MULTICAST - OUTPUT: RTN_UNSPEC - NAT, mangle, ip_queue, nf_ip_reroute: RTN_UNSPEC in LOCAL_OUT - IPVS: - use RTN_LOCAL in LOCAL_OUT and FORWARD after SNAT to restrict saddr to be local Signed-off-by: Julian Anastasov <ja@ssi.bg> --- diff -urp v2.6.39/linux/net/ipv4/netfilter/ipt_REJECT.c linux/net/ipv4/netfilter/ipt_REJECT.c --- v2.6.39/linux/net/ipv4/netfilter/ipt_REJECT.c 2011-03-20 10:55:56.000000000 +0200 +++ linux/net/ipv4/netfilter/ipt_REJECT.c 2011-06-18 18:22:40.713189957 +0300 @@ -40,7 +40,6 @@ static void send_reset(struct sk_buff *o struct iphdr *niph; const struct tcphdr *oth; struct tcphdr _otcph, *tcph; - unsigned int addr_type; /* IP header checks: fragment. */ if (ip_hdr(oldskb)->frag_off & htons(IP_OFFSET)) @@ -55,6 +54,9 @@ static void send_reset(struct sk_buff *o if (oth->rst) return; + if (skb_rtable(oldskb)->rt_flags & (RTCF_BROADCAST | RTCF_MULTICAST)) + return; + /* Check checksum */ if (nf_ip_checksum(oldskb, hook, ip_hdrlen(oldskb), IPPROTO_TCP)) return; @@ -101,19 +103,11 @@ static void send_reset(struct sk_buff *o nskb->csum_start = (unsigned char *)tcph - nskb->head; nskb->csum_offset = offsetof(struct tcphdr, check); - addr_type = RTN_UNSPEC; - if (hook != NF_INET_FORWARD -#ifdef CONFIG_BRIDGE_NETFILTER - || (nskb->nf_bridge && nskb->nf_bridge->mask & BRNF_BRIDGED) -#endif - ) - addr_type = RTN_LOCAL; - /* ip_route_me_harder expects skb->dst to be set */ skb_dst_set_noref(nskb, skb_dst(oldskb)); nskb->protocol = htons(ETH_P_IP); - if (ip_route_me_harder(nskb, addr_type)) + if (ip_route_me_harder(nskb, RTN_UNSPEC)) goto free_nskb; niph->ttl = ip4_dst_hoplimit(skb_dst(nskb)); diff -urp v2.6.39/linux/net/ipv4/netfilter.c linux/net/ipv4/netfilter.c --- v2.6.39/linux/net/ipv4/netfilter.c 2011-05-20 10:38:08.000000000 +0300 +++ linux/net/ipv4/netfilter.c 2011-06-18 19:13:39.299189310 +0300 @@ -17,51 +17,35 @@ int ip_route_me_harder(struct sk_buff *s const struct iphdr *iph = ip_hdr(skb); struct rtable *rt; struct flowi4 fl4 = {}; - unsigned long orefdst; + __be32 saddr = iph->saddr; + __u8 flags = 0; unsigned int hh_len; - unsigned int type; - type = inet_addr_type(net, iph->saddr); - if (skb->sk && inet_sk(skb->sk)->transparent) - type = RTN_LOCAL; - if (addr_type == RTN_UNSPEC) - addr_type = type; + if (!skb->sk && addr_type != RTN_LOCAL) { + if (addr_type == RTN_UNSPEC) + addr_type = inet_addr_type(net, saddr); + if (addr_type == RTN_LOCAL || addr_type == RTN_UNICAST) + flags |= FLOWI_FLAG_ANYSRC; + else + saddr = 0; + } /* some non-standard hacks like ipt_REJECT.c:send_reset() can cause * packets with foreign saddr to appear on the NF_INET_LOCAL_OUT hook. */ - if (addr_type == RTN_LOCAL) { - fl4.daddr = iph->daddr; - if (type == RTN_LOCAL) - fl4.saddr = iph->saddr; - fl4.flowi4_tos = RT_TOS(iph->tos); - fl4.flowi4_oif = skb->sk ? skb->sk->sk_bound_dev_if : 0; - fl4.flowi4_mark = skb->mark; - fl4.flowi4_flags = skb->sk ? inet_sk_flowi_flags(skb->sk) : 0; - rt = ip_route_output_key(net, &fl4); - if (IS_ERR(rt)) - return -1; - - /* Drop old route. */ - skb_dst_drop(skb); - skb_dst_set(skb, &rt->dst); - } else { - /* non-local src, find valid iif to satisfy - * rp-filter when calling ip_route_input. */ - fl4.daddr = iph->saddr; - rt = ip_route_output_key(net, &fl4); - if (IS_ERR(rt)) - return -1; + fl4.daddr = iph->daddr; + fl4.saddr = saddr; + fl4.flowi4_tos = RT_TOS(iph->tos); + fl4.flowi4_oif = skb->sk ? skb->sk->sk_bound_dev_if : 0; + fl4.flowi4_mark = skb->mark; + fl4.flowi4_flags = skb->sk ? inet_sk_flowi_flags(skb->sk) : flags; + rt = ip_route_output_key(net, &fl4); + if (IS_ERR(rt)) + return -1; - orefdst = skb->_skb_refdst; - if (ip_route_input(skb, iph->daddr, iph->saddr, - RT_TOS(iph->tos), rt->dst.dev) != 0) { - dst_release(&rt->dst); - return -1; - } - dst_release(&rt->dst); - refdst_drop(orefdst); - } + /* Drop old route. */ + skb_dst_drop(skb); + skb_dst_set(skb, &rt->dst); if (skb_dst(skb)->error) return -1;