Message ID | 1338534026.2760.1451.camel@edumazet-glaptop |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Fri, 01 Jun 2012 09:00:26 +0200

> From: Eric Dumazet <edumazet@google.com>
>
> Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
> larger and larger, using lots of memory and cpu time.
>
> tcp_v4_send_synack()
> ->inet_csk_route_req()
> ->ip_route_output_flow()
> ->rt_set_nexthop()
> ->rt_init_metrics()
> ->inet_getpeer( create = true)
>
> This is a side effect of commit a4daad6b09230 (net: Pre-COW metrics for
> TCP) added in 2.6.39
>
> Possible solution :
>
> Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS
 ...
> Signed-off-by: Eric Dumazet <edumazet@google.com>

This is definitely the right thing to do.

Applied, thanks Eric.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Eric

>Another problem on SYNFLOOD/DDOS attack is the inetpeer cache getting
>larger and larger, using lots of memory and cpu time.
>
>tcp_v4_send_synack()
>->inet_csk_route_req()
> ->ip_route_output_flow()
> ->rt_set_nexthop()
> ->rt_init_metrics()
> ->inet_getpeer( create = true)
>
>This is a side effect of commit a4daad6b09230 (net: Pre-COW metrics for
>TCP) added in 2.6.39
>
>Possible solution :
>
>Instruct inet_csk_route_req() to remove FLOWI_FLAG_PRECOW_METRICS
>

I think we are on the right track now.

Some results from one of our testers,
before applying "reflect SYN queue_mapping into SYNACK":

"(The latest one from Eric is not included. I am building with
that one right now.)
Results were that with the same number of SYN/s, load went down
30% on each of the three Cpus that were handling the SYNs.
Great !!!"

I'm looking forward to seeing the results of the latest patch.

Then I think conntrack needs a little shape-up, like a "mini-conntrack":
it is way too expensive to allocate a full conntrack for every SYN.

I have a bunch of patches and ideas for that...

Thanks Eric for a great job
/Hans
On Fri, 2012-06-01 at 23:34 +0200, Hans Schillström wrote:

> I think we are on the right track now.
>
> Some results from one of our testers,
> before applying "reflect SYN queue_mapping into SYNACK":
>
> "(The latest one from Eric is not included. I am building with
> that one right now.)
> Results were that with the same number of SYN/s, load went down
> 30% on each of the three Cpus that were handling the SYNs.
> Great !!!"
>

I am not sure reflecting queue_mapping will help your workload, since
you specifically asked your NIC to queue all SYN packets on one
single queue.

Maybe do not rely on skb->queue_mapping but on skb->rxhash to choose
an outgoing queue for the SYNACKs, so as not to overload a single tx queue?

Then again, it might not be needed if the queue is dedicated to SYN and
SYNACK packets, since net_rx_action/net_tx_action should both dequeue 64
packets each round, in a round-robin fashion.

(I had problems in a standard setup, where you can have a single cpu
(CPU0 in my case) servicing all NAPI interrupts, so with 16 queues, the
rx_action/tx_action ratio is 16/1 if all SYNACKs go to a single queue,
while SYNs are distributed to all 16 rx queues)

> I'm looking forward to seeing the results of the latest patch.
>
> Then I think conntrack needs a little shape-up, like a "mini-conntrack":
> it is way too expensive to allocate a full conntrack for every SYN.
>
> I have a bunch of patches and ideas for that...
>

Cool ! The conntrack issue is a real one for sure.

Given the current conntrack requirement (being protected by a central
lock), I guess your best bet would be the following setup :

One single CPU to handle all SYN packets.

Maybe do not rely on skb->queue_mapping but on skb->rxhash to choose
an outgoing queue for the SYNACKs, so as not to overload a single tx queue.

> Thanks Eric for a great job
>

Thanks for giving testing results and ideas !
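The idea of picking the outgoing queue from skb->rxhash instead of skb->queue_mapping can be illustrated with a small userspace sketch. The helper name pick_tx_queue() and the multiply-and-shift scaling are assumptions made for illustration; this is not code from the patch or from the kernel tree.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel function): map a 32-bit flow hash
 * onto a queue index in [0, num_queues) using multiply-and-shift, so
 * that SYNACKs for different flows spread over all tx queues instead
 * of all landing on the queue the SYN arrived on.
 */
static inline uint16_t pick_tx_queue(uint32_t rxhash, uint16_t num_queues)
{
	assert(num_queues > 0);
	/* (hash * n) / 2^32 scales the hash range down to [0, n). */
	return (uint16_t)(((uint64_t)rxhash * num_queues) >> 32);
}
```

With 16 tx queues, a hash of 0 maps to queue 0, 0x80000000 to queue 8, and 0xFFFFFFFF to queue 15, so flows with well-distributed hashes spread roughly evenly across the queues.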
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 95e61596..f9ee741 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -377,7 +377,8 @@ struct dst_entry *inet_csk_route_req(struct sock *sk,
 	flowi4_init_output(fl4, sk->sk_bound_dev_if, sk->sk_mark,
 			   RT_CONN_FLAGS(sk), RT_SCOPE_UNIVERSE,
-			   sk->sk_protocol, inet_sk_flowi_flags(sk),
+			   sk->sk_protocol,
+			   inet_sk_flowi_flags(sk) & ~FLOWI_FLAG_PRECOW_METRICS,
 			   (opt && opt->opt.srr) ? opt->opt.faddr : ireq->rmt_addr,
 			   ireq->loc_addr, ireq->rmt_port, inet_sk(sk)->inet_sport);
 	security_req_classify_flow(req, flowi4_to_flowi(fl4));
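
The one-line change above clears a single bit in the flow flags before the route lookup. A minimal userspace sketch of that masking follows. The EX_FLAG_* names, their values, and strip_precow() are placeholders chosen for illustration; the real FLOWI_FLAG_* definitions live in the kernel headers and are not reproduced here.

```c
#include <stdint.h>

/* Illustrative flag bits only; not the kernel's actual definitions. */
#define EX_FLAG_ANYSRC         0x01u
#define EX_FLAG_PRECOW_METRICS 0x02u
#define EX_FLAG_CAN_SLEEP      0x04u

/*
 * Hypothetical helper: return the flags with the "pre-COW metrics" bit
 * cleared and every other bit left untouched, mirroring the
 * `flags & ~FLOWI_FLAG_PRECOW_METRICS` expression in the patch.
 */
static inline uint8_t strip_precow(uint8_t flags)
{
	return (uint8_t)(flags & ~EX_FLAG_PRECOW_METRICS);
}
```

Masking with the complement is idempotent: a flag set that never had the bit comes back unchanged, so the call site does not need to test for the bit first.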