Message ID | 1406189276.3363.63.camel@edumazet-glaptop2.roam.corp.google.com |
---|---|
State | Superseded, archived |
Delegated to: | David Miller |
Headers | show |
On Thu, Jul 24, 2014 at 1:07 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > This patch adds a random amount of perturbation so that IP identifiers > for a given destination [1] are no longer monotonically increasing after > an idle period. This certainly looks good to me. It would be good to have actual testing by Jeffrey &al, but this seems simple and complete. Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/24/2014 12:21 PM, Linus Torvalds wrote: > On Thu, Jul 24, 2014 at 1:07 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: >> >> This patch adds a random amount of perturbation so that IP identifiers >> for a given destination [1] are no longer monotonically increasing after >> an idle period. > > This certainly looks good to me. It would be good to have actual > testing by Jeffrey &al, but this seems simple and complete. I've just tested it, and this easily defeats our implementation of the side-channel attack, as our implementation assumes we're trying to infer the value of a per-destination counter that isn't moving on its own. I've let my thoughts on this problem percolate some more overnight. Commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count") really does change the problem of inferring the existence of traffic between machines, both for better and for worse. It helps the problem in one way in that the IP id counter corresponding to any destination is now noisier, particularly for high traffic servers, although for servers with a small population of users this isn't so helpful. It actually hurts the problem though in that we don't even need to use our side-channel if we control enough addresses (or at least an address that hashes to the same counter as an address whose traffic to the server we want to measure). So for an attacker controlling a large number of addresses trying to measure which of a server's small population of users are accessing it, the problem actually seems worse since this commit. So now about how this proposed patch changes things. It breaks our use of a side-channel to infer the value of counters, since our binary-search-like approach which we rely on to efficiently find the value of the counter just doesn't handle a randomly moving counter. Moreover, it would seem very difficult to make it robust to that. I suspect that the best someone could ever do is infer a rough idea of the counter at any one time. And then trying to use our side-channel to meaningfully infer anything but the largest of differences to the counter over time would be even more difficult. So I believe that this patch adequately addresses the side-channel attack in our paper. On the other hand, in the post-73f156a6e8c1 world, we also have this new problem of where an attacker doesn't even need to use our side-channel if he controls a large number of addresses and can measure the values of the counters directly. The proposed patch actually helps with this case too. However, the existence of traffic between the server and some other address could probably still be inferred by a sufficiently determined attacker. Certainly it could be if the server is sending the address a very large number packets, as the signal would just overcome the noise. But even if the number of packets being sent is very small, surprisingly, this may still be inferable under this patch, as, e.g., randint(25) + randint(25) + randint(25) + randint(25) has a much different probability distribution than randint(100), and a determined attacker may be able to determine if he is sampling from randint(100) or not with enough samples. So my thoughts in a nutshell: I believe that this proposed patch solves the original problem of attackers being able to infer the value of counters via a side-channel. However, in the post-73f156a6e8c1 world, there's also a new problem which we might care about too where we have an attacker who controls a large number of addresses. This patch also happens to help with this problem too, but maybe not enough, depending on how sophisticated of an attacker we want to protect against. Jeff -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2014-07-25 at 09:55 -0600, Jeffrey Knockel wrote: > So my thoughts in a nutshell: I believe that this proposed patch solves > the original problem of attackers being able to infer the value of > counters via a side-channel. However, in the post-73f156a6e8c1 world, > there's also a new problem which we might care about too where we have > an attacker who controls a large number of addresses. This patch also > happens to help with this problem too, but maybe not enough, depending > on how sophisticated of an attacker we want to protect against. What do you mean by "an attacker who controls a large number of addresses" ? The hash(daddr) -> slot function is not known, as we use a Jenkin hash with a secret ( ip_idents_hashrnd & ip6_idents_hashrnd ) We might change the hash to use both daddr & saddr to increase protection. pre-73f156a6e8c1 was horrible, because it was easy to fill inetpeer table to force a garbage collection. Then when an ID was needed for a peer that had been evicted, we started again from a fixed base ID ( secure_ip_id() ) Thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 25, 2014 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > We might change the hash to use both daddr & saddr to increase > protection. .. and maybe protocol too, so that you can't easily use icmp echo packets to do it for udp packets etc. The underlying jhash is jhash_3words(), so that would actually be fairly natural for at least ipv4 (the ipv6 case I didn't look at). Linus -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2014-07-25 at 11:35 -0700, Linus Torvalds wrote: > On Fri, Jul 25, 2014 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > We might change the hash to use both daddr & saddr to increase > > protection. > > .. and maybe protocol too, so that you can't easily use icmp echo > packets to do it for udp packets etc. The underlying jhash is > jhash_3words(), so that would actually be fairly natural for at least > ipv4 (the ipv6 case I didn't look at). Right, in fact saddr is probably not worth it. Its not like servers have dozen of IPv4 addresses anyway... -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jul 25, 2014 at 08:38:17PM +0200, Eric Dumazet wrote: > On Fri, 2014-07-25 at 11:35 -0700, Linus Torvalds wrote: > > On Fri, Jul 25, 2014 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > > > We might change the hash to use both daddr & saddr to increase > > > protection. > > > > .. and maybe protocol too, so that you can't easily use icmp echo > > packets to do it for udp packets etc. The underlying jhash is > > jhash_3words(), so that would actually be fairly natural for at least > > ipv4 (the ipv6 case I didn't look at). > > Right, in fact saddr is probably not worth it. Yes it is, at least to isolate public and private networks. > Its not like servers have dozen of IPv4 addresses anyway... Actually some have many more, even hundreds sometimes (until people realize they can bind networks to the loopback or do transparent proxy, where the principle is still true). It's especially true with front equipments such as reverse proxies and load balancers. SSL deployed all over the web has made that much worse despite the introduction of SNI which is not supported by all clients, because while hosting providers used to assign just a few IPs on which they bound their servers using virtual hosting, with SSL they tend to offer one IP address per customer. Regards, Willy -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/25/2014 12:09 PM, Eric Dumazet wrote: > What do you mean by "an attacker who controls a large number of > addresses" ? In general, I mean any attacker who can read the packets sent to a large number of different Internet addresses. Even an attacker who has one address but can cycle through different assignments may be a problem. > The hash(daddr) -> slot function is not known, as we use a Jenkin hash > with a secret ( ip_idents_hashrnd & ip6_idents_hashrnd ) That's true, but the secret never changes, right? I may not be able to identify the slot number that any address is hashed to, but I can identify when some victim address hashes to the same slot as one of my addresses whose packets I can read. For instance, if I in short succession 1. Probe value of the IP id counter for each of my addresses 2. Spoof a large number of (e.g.) echo requests from victim address (or something else to the distribution that I can measure) 3. Again probe value of the IP id counter for each of my addresses Then I can tell which of my addresses hash to the same slot as the victim address by whose value of the IP id counter has jumped as a result of the linux machine sending echo replies to the victim. Jeff -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi, On Fr, 2014-07-25 at 20:38 +0200, Eric Dumazet wrote: > On Fri, 2014-07-25 at 11:35 -0700, Linus Torvalds wrote: > > On Fri, Jul 25, 2014 at 11:09 AM, Eric Dumazet <eric.dumazet@gmail.com> wrote: > > > > > > We might change the hash to use both daddr & saddr to increase > > > protection. > > > > .. and maybe protocol too, so that you can't easily use icmp echo > > packets to do it for udp packets etc. The underlying jhash is > > jhash_3words(), so that would actually be fairly natural for at least > > ipv4 (the ipv6 case I didn't look at). > > Right, in fact saddr is probably not worth it. > > Its not like servers have dozen of IPv4 addresses anyway... Another idea, maybe worth looking at: Since commit 703133de331a7a ("ip: generate unique IP identificator if local fragmentation is allowed") we started to generate fragmentation ids in the output path for every packet that has ignore_df set, which nearly is every packet. We could try to push that down the stack and only insert the fragmentation id in ip_fragment. We still need to generate the frag_id directly from the socket layer, but we can reuse the ip6_frag_id field in skb_shinfo for IPv4, too. Then we actually only need to generate fragmentation ids for the VJ compression workaround, generated from the socket->inet_id. Do we still need this (I guess, yes)? Does this sound worth a try or are there any unseen protocol specific consequences I am not yet aware of? We would stop leaking too many fragmentation ids with this change. Thanks, Hannes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/net/ip.h b/include/net/ip.h index 0e795df05ec9..7596eb22e1ce 100644 --- a/include/net/ip.h +++ b/include/net/ip.h @@ -309,16 +309,7 @@ static inline unsigned int ip_skb_dst_mtu(const struct sk_buff *skb) } } -#define IP_IDENTS_SZ 2048u -extern atomic_t *ip_idents; - -static inline u32 ip_idents_reserve(u32 hash, int segs) -{ - atomic_t *id_ptr = ip_idents + hash % IP_IDENTS_SZ; - - return atomic_add_return(segs, id_ptr) - segs; -} - +u32 ip_idents_reserve(u32 hash, int segs); void __ip_select_ident(struct iphdr *iph, int segs); static inline void ip_select_ident_segs(struct sk_buff *skb, struct sock *sk, int segs) diff --git a/net/ipv4/route.c b/net/ipv4/route.c index 3162ea923ded..2e9713a8966f 100644 --- a/net/ipv4/route.c +++ b/net/ipv4/route.c @@ -457,8 +457,31 @@ static struct neighbour *ipv4_neigh_lookup(const struct dst_entry *dst, return neigh_create(&arp_tbl, pkey, dev); } -atomic_t *ip_idents __read_mostly; -EXPORT_SYMBOL(ip_idents); +#define IP_IDENTS_SZ 2048u +struct ip_ident_bucket { + atomic_t id; + u32 stamp32; +}; + +static struct ip_ident_bucket *ip_idents __read_mostly; + +/* In order to protect privacy, we add a perturbation to identifiers + * if one generator is seldom used. This makes hard for an attacker + * to infer how many packets were sent between two hosts. + */ +u32 ip_idents_reserve(u32 hash, int segs) +{ + struct ip_ident_bucket *bucket = ip_idents + hash % IP_IDENTS_SZ; + u32 old = ACCESS_ONCE(bucket->stamp32); + u32 now = (u32)jiffies; + u32 delta = 0; + + if (old != now && cmpxchg(&bucket->stamp32, old, now) == old) + delta = prandom_u32_max(now - old); + + return atomic_add_return(segs + delta, &bucket->id) - segs; +} +EXPORT_SYMBOL(ip_idents_reserve); void __ip_select_ident(struct iphdr *iph, int segs) {