Message ID | 1492565022-110676-1-git-send-email-gfree.wind@foxmail.com |
---|---|
State | Accepted |
Delegated to: | Pablo Neira |
On Wed, Apr 19, 2017 at 09:23:42AM +0800, gfree.wind@foxmail.com wrote:
> From: Gao Feng <fgao@ikuai8.com>
>
> The window scale may be enlarged from 14 to 15 according to the IETF
> draft https://tools.ietf.org/html/draft-nishida-tcpm-maxwin-03.
>
> Use the macro TCP_MAX_WSCALE to support it easily with TCP stack in
> the future.

Applied, thanks.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2017-04-19 at 17:58 +0200, Pablo Neira Ayuso wrote:
> Applied, thanks.

Note that the Linux kernel is not yet ready for TCP_MAX_WSCALE being
changed to 15.

Signed 32-bit sk counters can already be abused with 1GB TCP windows:
malicious peers sending SACKs can force Linux to increase its memory usage
above 2GB, and overflows are pretty bad.
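The arithmetic behind the 1GB and 2GB figures can be sketched as follows. This is only an illustration of the window-size bound, not the kernel's actual socket accounting; the function names `max_scaled_window` and `s32_headroom` are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Largest window expressible with a given scale factor: the 16-bit
 * window field (at most 65535) shifted left by the scale. */
uint64_t max_scaled_window(unsigned int wscale)
{
	return (uint64_t)UINT16_MAX << wscale;
}

/* Hypothetical helper: bytes left below INT32_MAX once a single
 * maximal window is counted in a signed 32-bit counter. */
int64_t s32_headroom(unsigned int wscale)
{
	return (int64_t)INT32_MAX - (int64_t)max_scaled_window(wscale);
}
```

With a scale of 14 the largest window is 1,073,725,440 bytes (just under 1 GiB). With a scale of 15 it is 2,147,450,880 bytes, which leaves only 32,767 bytes of headroom in a signed 32-bit counter before any per-packet overhead is added.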
On Wed, Apr 19, 2017 at 09:22:08AM -0700, Eric Dumazet wrote:
> Note that linux kernel is not ready yet for a TCP_MAX_WSCALE being
> changed to 15.
>
> Signed 32bit sk counters can already be abused with 1GB TCP windows, for
> malicious peers sending SACK forcing linux to increase its memory usage
> above 2GB and overflows are pretty bad.

So far we have tended to use our own definitions for TCP connection
tracking. I checked this one, and it refers to RFC1323 too.

If these semantics may change in a way that could break conntracking,
please let me know and I can toss it here.

Thanks Eric!
On Wed, Apr 19, 2017 at 09:57:55PM +0200, Pablo Neira Ayuso wrote:
> If this semantics may change from one way to another in a way that may
> break conntracking, please let me know, I can toss it here.

Or I can just amend the commit here to remove the "enlarged from 14 to 15"
comment. I was going to push this out now, but I'll wait a bit.
> On Wed, Apr 19, 2017 at 09:57:55PM +0200, Pablo Neira Ayuso wrote:
> > If this semantics may change from one way to another in a way that may
> > break conntracking, please let me know, I can toss it here.
>
> Or I can just amend the commit here to remove the "enlarged from 14 to 15"
> comment, I was going to push out this now, but I'll wait a bit.

Thanks Eric & Pablo,

If the wscale limit really is raised to 15 one day, this Netfilter code
will need to be modified, because it resets the wscale value to the
maximum that Netfilter supports:

    if (state->td_scale > 14)
        state->td_scale = 14;

That would make the receiver see a smaller window size than the sender
actually announced.

Best Regards
Feng
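The mismatch described above can be illustrated with a small userspace sketch. It assumes a tracker that clamps the scale to 14 while an endpoint uses 15; `DEMO_MAX_WSCALE`, `clamp_wscale` and `scaled_window` are hypothetical names chosen for this example, not kernel symbols.

```c
#include <stdint.h>

/* Stand-in for the limit of 14 discussed in the thread (assumption:
 * a local constant for this demo, not the kernel macro itself). */
#define DEMO_MAX_WSCALE 14u

/* Clamp an advertised window scale the way the quoted check does. */
uint8_t clamp_wscale(uint8_t advertised)
{
	return advertised > DEMO_MAX_WSCALE ? DEMO_MAX_WSCALE : advertised;
}

/* Effective window in bytes for a raw 16-bit window field.
 * Callers must pass wscale < 48 so the shift stays within 64 bits. */
uint64_t scaled_window(uint16_t raw_window, uint8_t wscale)
{
	return (uint64_t)raw_window << wscale;
}
```

If an endpoint announced a scale of 15, `scaled_window(65535, 15)` is exactly twice `scaled_window(65535, clamp_wscale(15))`, so a tracker using the clamped value would assume a window half as large as the one the endpoints actually use.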
On Thu, 2017-04-20 at 08:44 +0800, Gao Feng wrote:
> When the wscale is really enlarged to 15 one day, these Netfilter codes may
> be modified.
> Because it would reset the wscale value to the max value which Netfilter
> support it.
> if (state->td_scale > 14)
> state->td_scale = 14;
> It would cause the receive see a less window size than sender announced
> actually.

Simply because some middleboxes enforce the limit of 14, a change of TCP
stacks on peers might not be possible without causing serious
interoperability issues.

This IETF draft assumes TCP peers can freely decide what they can do, but
experience shows that they cannot.

TCP Fast Open, for example, hit a bug in Linux TCP conntracking, and some
Linux middleboxes still have this bug. (SYN messages with a data payload
were not really considered in the past.)
> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
>
> Simply because some middle boxes are enforcing the limit of 14, a change of
> TCP stacks on peers might be simply not possible without causing serious
> interoperability issues.
>
> This IETF draft assumes TCP peers can freely decide of what they can do, but
> experience shows that they can not.
>
> TCP FastOpen for example hit a bug in linux TCP conntracking, and some linux
> middle boxes are still having this bug.
>
> ( SYN messages with data payload was not really considered in the past )

Thanks Eric, I learned a lot.

Best Regards
Feng
diff --git a/net/netfilter/nf_conntrack_proto_tcp.c b/net/netfilter/nf_conntrack_proto_tcp.c
index b122e9d..741acdc 100644
--- a/net/netfilter/nf_conntrack_proto_tcp.c
+++ b/net/netfilter/nf_conntrack_proto_tcp.c
@@ -419,10 +419,9 @@ static void tcp_options(const struct sk_buff *skb,
 				 && opsize == TCPOLEN_WINDOW) {
 				state->td_scale = *(u_int8_t *)ptr;
 
-				if (state->td_scale > 14) {
-					/* See RFC1323 */
-					state->td_scale = 14;
-				}
+				if (state->td_scale > TCP_MAX_WSCALE)
+					state->td_scale = TCP_MAX_WSCALE;
+
 				state->flags |=
 					IP_CT_TCP_FLAG_WINDOW_SCALE;
 			}
diff --git a/net/netfilter/nf_synproxy_core.c b/net/netfilter/nf_synproxy_core.c
index abe03e8..a504e87 100644
--- a/net/netfilter/nf_synproxy_core.c
+++ b/net/netfilter/nf_synproxy_core.c
@@ -66,8 +66,8 @@
 		case TCPOPT_WINDOW:
 			if (opsize == TCPOLEN_WINDOW) {
 				opts->wscale = *ptr;
-				if (opts->wscale > 14)
-					opts->wscale = 14;
+				if (opts->wscale > TCP_MAX_WSCALE)
+					opts->wscale = TCP_MAX_WSCALE;
 				opts->options |= XT_SYNPROXY_OPT_WSCALE;
 			}
 			break;
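For readers who want to try the clamping logic outside the kernel, here is a rough userspace sketch of walking a TCP option block and clamping the window scale. It is not the kernel code from the hunks above: `parse_wscale`, the `OPT_*` constants and `DEMO_MAX_WSCALE` are hypothetical names for this example, and unlike the kernel loops it returns at the first window scale option it finds.

```c
#include <stddef.h>
#include <stdint.h>

/* TCP option kinds and lengths used by this sketch. */
#define OPT_EOL         0
#define OPT_NOP         1
#define OPT_WINDOW      3
#define OPTLEN_WINDOW   3
/* Assumption: local stand-in for the limit of 14 used in the thread. */
#define DEMO_MAX_WSCALE 14

/* Return the clamped window scale found in opts[0..len), or -1 if the
 * block carries no well-formed window scale option. */
int parse_wscale(const uint8_t *opts, size_t len)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];

		if (kind == OPT_EOL)
			break;			/* end of option list */
		if (kind == OPT_NOP) {
			i++;			/* one-byte padding */
			continue;
		}
		if (i + 1 >= len)
			break;			/* no room for a length byte */

		uint8_t size = opts[i + 1];

		if (size < 2 || size > len - i)
			break;			/* malformed or truncated option */

		if (kind == OPT_WINDOW && size == OPTLEN_WINDOW) {
			uint8_t ws = opts[i + 2];

			return ws > DEMO_MAX_WSCALE ? DEMO_MAX_WSCALE : ws;
		}
		i += size;			/* skip any other option */
	}
	return -1;
}
```

Keeping the limit in one named constant mirrors the intent of the patch: if the limit ever changes, only the definition needs to be touched rather than each parser.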