diff mbox series

fragment: Improved handling of incorrect IP fragments

Message ID u0QFePiYSfxBeUsNVFRhPjsGViwg-pXLIApJaVLdUICuvLTQg5y5-rdNhh9lPcDsyO24c7wXxy5m6b6dK0aB6kqR0ypk8X9ekiLe3NQ3ICY=@protonmail.com
State Changes Requested
Delegated to: David Miller
Headers show
Series fragment: Improved handling of incorrect IP fragments | expand

Commit Message

Ttttabcd Dec. 31, 2019, 1:19 a.m. UTC
In both ip6_frag_queue and ip6_frag_reasm, it is checked whether it is an
Oversized IPv6 packet, which is duplicated. The original code logic will
only be processed in ip6_frag_queue. The code of ip6_frag_reasm will not
be executed. Now change it to only process in ip6_frag_queue and output
the prompt information.

In ip6_frag_queue, set the prob_offset pointer to notify ipv6_frag_rcv
to send the ICMPv6 parameter problem message. The logic is not concise,
I merged the part that sent ICMPv6 messages into ip6_frag_queue.

In the original logic, receiving oversized IPv6 fragments and receiving
IPv6 fragments whose end is not an integral multiple of 8 bytes both
returns -1. This is inconsistent with other incorrect IPv6 fragmentation
processing. For example, in other logic, end == offset will goto discard_fq
directly. I think that receiving any incorrect IPv6 fragment means that the
fragment processing has failed as a whole, and the data carried in the
fragments is likely to be wrong. I think we should also execute goto
discard_fq to release this fragments queue.

Goto discard_fq at the end of the label send_param_prob, release the
fragment queue. Since icmpv6_param_prob will call kfree_skb, it will also
be kfree_skb in the label err, which will cause repeated free,
so I added skb_get.

I also made similar changes in IPv4 fragmentation processing.

It is not good to use 65535 values directly,
I added the IPV4_MAX_TOT_LEN macro.

The oversized check in IPv4 fragment processing is in the ip_frag_reasm
of the reassembly fragment. This is too late. The incorrect IP fragment
has been inserted into the fragment queue. I modified it in ip_frag_queue.
I changed the original net_info_ratelimited to net_dbg_ratelimited to make
the debugging information more controllable.

I also modified goto discard_qp directly
when the end is not an integer multiple of 8 bytes.

Signed-off-by: AK Deng <ttttabcd@protonmail.com>
---
 include/net/ip.h       |  2 ++
 net/ipv4/ip_fragment.c | 20 +++++++++----------
 net/ipv6/reassembly.c  | 45 ++++++++++++++++++++----------------------
 3 files changed, 33 insertions(+), 34 deletions(-)

--
2.24.0

Comments

Stephen Hemminger Jan. 2, 2020, 7:27 p.m. UTC | #1
On Tue, 31 Dec 2019 01:19:27 +0000
Ttttabcd <ttttabcd@protonmail.com> wrote:

> @@ -267,8 +278,6 @@ static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb,
>  	payload_len = ((skb->data - skb_network_header(skb)) -
>  		       sizeof(struct ipv6hdr) + fq->q.len -
>  		       sizeof(struct frag_hdr));
> -	if (payload_len > IPV6_MAXPLEN)
> -		goto out_oversize;

You can not safely drop this check.
With recursive fragmentation it is possible that the initial payload ends
up exceeding the maximum packet length.
Ttttabcd Jan. 3, 2020, 12:44 a.m. UTC | #2
> You can not safely drop this check.
> With recursive fragmentation it is possible that the initial payload ends
> up exceeding the maximum packet length.

Can you give an example? What is "recursive fragmentation"?

In my previous tests, all fragment packets with a payload length exceeding 65535 will be in the ip6_frag_queue

if ((unsigned int) end> IPV6_MAXPLEN)

Was discarded.
Stephen Hemminger Jan. 7, 2020, 12:06 a.m. UTC | #3
On Fri, 03 Jan 2020 00:44:30 +0000
Ttttabcd <ttttabcd@protonmail.com> wrote:

> > You can not safely drop this check.
> > With recursive fragmentation it is possible that the initial payload ends
> > up exceeding the maximum packet length.  
> 
> Can you give an example? What is "recursive fragmentation"?
> 
> In my previous tests, all fragment packets with a payload length exceeding 65535 will be in the ip6_frag_queue
> 
> if ((unsigned int) end> IPV6_MAXPLEN)
> 
> Was discarded.
> 
> 

I get wary of any changes to fragmentation code. It has a long history
of bugs and is complex. See recent FragSmack for some backstory.

You need to split IPv4 and IPv6 parts into two different patches.
In the IPv4 part, you dropped the test for oversize IPv4 packet.

With raw packet tools it is possible to generate a packet that reassembles into
a packet larger than 64K.  An example is:

$ tshark -r oversize-ipv4.pcap 
    1   0.000000    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=0, ID=9b39)
    2   0.001615    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=1440, ID=9b39)
    3   0.004115    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=2920, ID=9b39)
    4   0.006502    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=4400, ID=9b39)
    5   0.008819    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=5880, ID=9b39)
    6   0.011178    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=7360, ID=9b39)
    7   0.013465    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=8840, ID=9b39)
    8   0.016040    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=10320, ID=9b39)
    9   0.018369    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=11800, ID=9b39)
   10   0.020679    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=13280, ID=9b39)
   11   0.022965    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=14760, ID=9b39)
   12   0.025186    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=16240, ID=9b39)
   13   0.027277    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=17720, ID=9b39)
   14   0.028917    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=19200, ID=9b39)
   15   0.030832    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=20680, ID=9b39)
   16   0.032232    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=22160, ID=9b39)
   17   0.033742    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=23640, ID=9b39)
   18   0.035106    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=25120, ID=9b39)
   19   0.036736    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=26600, ID=9b39)
   20   0.037728    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=28080, ID=9b39)
   21   0.038983    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=29560, ID=9b39)
   22   0.040007    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=31040, ID=9b39)
   23   0.041459    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=32520, ID=9b39)
   24   0.042833    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=34000, ID=9b39)
   25   0.044030    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=35480, ID=9b39)
   26   0.044909    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=36960, ID=9b39)
   27   0.045921    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=38440, ID=9b39)
   28   0.046767    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=39920, ID=9b39)
   29   0.047581    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=41400, ID=9b39)
   30   0.048610    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=42880, ID=9b39)
   31   0.049323    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=44360, ID=9b39)
   32   0.050102    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=45840, ID=9b39)
   33   0.051014    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=47320, ID=9b39)
   34   0.051787    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=48800, ID=9b39)
   35   0.052576    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=50280, ID=9b39)
   36   0.053448    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=51760, ID=9b39)
   37   0.054260    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=53240, ID=9b39)
   38   0.055036    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=54720, ID=9b39)
   39   0.055823    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=56200, ID=9b39)
   40   0.056614    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=57680, ID=9b39)
   41   0.057512    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=59160, ID=9b39)
   42   0.058313    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=60640, ID=9b39)
   43   0.059073    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=62120, ID=9b39)
   44   0.059945    127.0.0.1 → 127.0.0.1    IPv4 1514 Fragmented IP protocol (proto=TCP 6, off=63600, ID=9b39)
   45   0.060705    127.0.0.1 → 127.0.0.1    TCP 469 16705 → 16705 [FIN, ECN, NS] Seq=1 Win=16705, bogus TCP header length (16, must be at least 20)

With current (correct) Linux kernel code this gets reassembled and dropped.
As seen in dmesg log and statistics.

With your Ipv4 patch the oversize packet gets passed on up the stack.

Testing this stuff is hard, it requires packet hacker tools.
Ttttabcd Jan. 7, 2020, 12:37 a.m. UTC | #4
> With current (correct) Linux kernel code this gets reassembled and dropped.
> As seen in dmesg log and statistics.
>
> With your Ipv4 patch the oversize packet gets passed on up the stack.
>
> Testing this stuff is hard, it requires packet hacker tools.

I know what you mean. What you are talking about is a "ping of death" attack. The use of fragments to construct packets longer than 65535 made the system buffer overflow and crash.

This situation has been considered in my code. In the original logic of IPv6, the judgment of data packets exceeding 65535 is duplicated, and the judgment in IPv4 is too late.

I have improved this situation, you can see my explanation of the patch at the beginning.

> In both ip6_frag_queue and ip6_frag_reasm, it is checked whether it is an
> Oversized IPv6 packet, which is duplicated. The original code logic will
> only be processed in ip6_frag_queue. The code of ip6_frag_reasm will not
> be executed. Now change it to only process in ip6_frag_queue and output
> the prompt information.

> I also made similar changes in IPv4 fragmentation processing.
>
> It is not good to use 65535 values directly,
> I added the IPV4_MAX_TOT_LEN macro.
>
> The oversized check in IPv4 fragment processing is in the ip_frag_reasm
> of the reassembly fragment. This is too late. The incorrect IP fragment
> has been inserted into the fragment queue. I modified it in ip_frag_queue.
> I changed the original net_info_ratelimited to net_dbg_ratelimited to make
> the debugging information more controllable.


@@ -300,6 +300,12 @@ static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	end = offset + skb->len - skb_network_offset(skb) - ihl;
 	err = -EINVAL;

+	if ((unsigned int)end + ihl > IPV4_MAX_TOT_LEN) {
+		net_dbg_ratelimited("ip_frag_queue: Oversized IP packet from %pI4, end = %d\n",
+				    &qp->q.key.v4.saddr, end);
+		goto discard_qp;
+	}
+

@@ -121,11 +121,10 @@ static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 			((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));

 	if ((unsigned int)end > IPV6_MAXPLEN) {
-		*prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb);
-		/* note that if prob_offset is set, the skb is freed elsewhere,
-		 * we do not free it here.
-		 */
-		return -1;
+		prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb);
+		net_dbg_ratelimited("ip6_frag_queue: Oversized IPv6 packet from %pI6c, end = %d\n",
+				    &fq->q.key.v6.saddr, end);
+		goto send_param_prob;
 	}

As long as the IP fragment length exceeds 65535, I will discard the entire fragment queue.
Ttttabcd Jan. 12, 2020, 3:14 p.m. UTC | #5
> You need to split IPv4 and IPv6 parts into two different patches.

Forgot to ask, is it necessary to divide this patch into two?
diff mbox series

Patch

diff --git a/include/net/ip.h b/include/net/ip.h
index 5b317c9f4470..43e9dc51852b 100644
--- a/include/net/ip.h
+++ b/include/net/ip.h
@@ -34,6 +34,8 @@ 
 #define IPV4_MAX_PMTU		65535U		/* RFC 2675, Section 5.1 */
 #define IPV4_MIN_MTU		68			/* RFC 791 */

+#define IPV4_MAX_TOT_LEN		65535
+
 extern unsigned int sysctl_fib_sync_mem;
 extern unsigned int sysctl_fib_sync_mem_min;
 extern unsigned int sysctl_fib_sync_mem_max;
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index cfeb8890f94e..baee3383d0ac 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -300,6 +300,12 @@  static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 	end = offset + skb->len - skb_network_offset(skb) - ihl;
 	err = -EINVAL;

+	if ((unsigned int)end + ihl > IPV4_MAX_TOT_LEN) {
+		net_dbg_ratelimited("ip_frag_queue: Oversized IP packet from %pI4, end = %d\n",
+				    &qp->q.key.v4.saddr, end);
+		goto discard_qp;
+	}
+
 	/* Is this the final fragment? */
 	if ((flags & IP_MF) == 0) {
 		/* If we already have some bits beyond end
@@ -311,11 +317,10 @@  static int ip_frag_queue(struct ipq *qp, struct sk_buff *skb)
 		qp->q.flags |= INET_FRAG_LAST_IN;
 		qp->q.len = end;
 	} else {
-		if (end&7) {
-			end &= ~7;
-			if (skb->ip_summed != CHECKSUM_UNNECESSARY)
-				skb->ip_summed = CHECKSUM_NONE;
-		}
+		/* Check if the fragment is rounded to 8 bytes. */
+		if (end & 0x7)
+			goto discard_qp;
+
 		if (end > qp->q.len) {
 			/* Some bits beyond end -> corruption. */
 			if (qp->q.flags & INET_FRAG_LAST_IN)
@@ -423,8 +428,6 @@  static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,

 	len = ip_hdrlen(skb) + qp->q.len;
 	err = -E2BIG;
-	if (len > 65535)
-		goto out_oversize;

 	inet_frag_reasm_finish(&qp->q, skb, reasm_data,
 			       ip_frag_coalesce_ok(qp));
@@ -462,9 +465,6 @@  static int ip_frag_reasm(struct ipq *qp, struct sk_buff *skb,
 out_nomem:
 	net_dbg_ratelimited("queue_glue: no memory for gluing queue %p\n", qp);
 	err = -ENOMEM;
-	goto out_fail;
-out_oversize:
-	net_info_ratelimited("Oversized IP packet from %pI4\n", &qp->q.key.v4.saddr);
 out_fail:
 	__IP_INC_STATS(net, IPSTATS_MIB_REASMFAILS);
 	return err;
diff --git a/net/ipv6/reassembly.c b/net/ipv6/reassembly.c
index 1f5d4d196dcc..302bf6c26c45 100644
--- a/net/ipv6/reassembly.c
+++ b/net/ipv6/reassembly.c
@@ -102,13 +102,13 @@  fq_find(struct net *net, __be32 id, const struct ipv6hdr *hdr, int iif)
 }

 static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
-			  struct frag_hdr *fhdr, int nhoff,
-			  u32 *prob_offset)
+			  struct frag_hdr *fhdr, int nhoff)
 {
 	struct net *net = dev_net(skb_dst(skb)->dev);
 	int offset, end, fragsize;
 	struct sk_buff *prev_tail;
 	struct net_device *dev;
+	u32 prob_offset = 0;
 	int err = -ENOENT;
 	u8 ecn;

@@ -121,11 +121,10 @@  static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 			((u8 *)(fhdr + 1) - (u8 *)(ipv6_hdr(skb) + 1)));

 	if ((unsigned int)end > IPV6_MAXPLEN) {
-		*prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb);
-		/* note that if prob_offset is set, the skb is freed elsewhere,
-		 * we do not free it here.
-		 */
-		return -1;
+		prob_offset = (u8 *)&fhdr->frag_off - skb_network_header(skb);
+		net_dbg_ratelimited("ip6_frag_queue: Oversized IPv6 packet from %pI6c, end = %d\n",
+				    &fq->q.key.v6.saddr, end);
+		goto send_param_prob;
 	}

 	ecn = ip6_frag_ecn(ipv6_hdr(skb));
@@ -155,8 +154,8 @@  static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 			/* RFC2460 says always send parameter problem in
 			 * this case. -DaveM
 			 */
-			*prob_offset = offsetof(struct ipv6hdr, payload_len);
-			return -1;
+			prob_offset = offsetof(struct ipv6hdr, payload_len);
+			goto send_param_prob;
 		}
 		if (end > fq->q.len) {
 			/* Some bits beyond end -> corruption. */
@@ -236,6 +235,18 @@  static int ip6_frag_queue(struct frag_queue *fq, struct sk_buff *skb,
 err:
 	kfree_skb(skb);
 	return err;
+send_param_prob:
+		__IP6_INC_STATS(net, __in6_dev_get_safely(skb->dev),
+				IPSTATS_MIB_INHDRERRORS);
+		/* icmpv6_param_prob() calls kfree_skb(skb),
+		 * and the label err also calls kfree_skb(skb),
+		 * so skb_get(skb) here increases the reference count
+		 * to avoid duplicate release
+		 */
+		skb_get(skb);
+		icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, prob_offset);
+		err = -1;
+		goto discard_fq;
 }

 /*
@@ -267,8 +278,6 @@  static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb,
 	payload_len = ((skb->data - skb_network_header(skb)) -
 		       sizeof(struct ipv6hdr) + fq->q.len -
 		       sizeof(struct frag_hdr));
-	if (payload_len > IPV6_MAXPLEN)
-		goto out_oversize;

 	/* We have to remove fragment header from datagram and to relocate
 	 * header in order to calculate ICV correctly. */
@@ -303,9 +312,6 @@  static int ip6_frag_reasm(struct frag_queue *fq, struct sk_buff *skb,
 	fq->q.last_run_head = NULL;
 	return 1;

-out_oversize:
-	net_dbg_ratelimited("ip6_frag_reasm: payload len = %d\n", payload_len);
-	goto out_fail;
 out_oom:
 	net_dbg_ratelimited("ip6_frag_reasm: no memory for reassembly\n");
 out_fail:
@@ -354,23 +360,14 @@  static int ipv6_frag_rcv(struct sk_buff *skb)
 	iif = skb->dev ? skb->dev->ifindex : 0;
 	fq = fq_find(net, fhdr->identification, hdr, iif);
 	if (fq) {
-		u32 prob_offset = 0;
 		int ret;
-
 		spin_lock(&fq->q.lock);

 		fq->iif = iif;
-		ret = ip6_frag_queue(fq, skb, fhdr, IP6CB(skb)->nhoff,
-				     &prob_offset);
+		ret = ip6_frag_queue(fq, skb, fhdr, IP6CB(skb)->nhoff);

 		spin_unlock(&fq->q.lock);
 		inet_frag_put(&fq->q);
-		if (prob_offset) {
-			__IP6_INC_STATS(net, __in6_dev_get_safely(skb->dev),
-					IPSTATS_MIB_INHDRERRORS);
-			/* icmpv6_param_prob() calls kfree_skb(skb) */
-			icmpv6_param_prob(skb, ICMPV6_HDR_FIELD, prob_offset);
-		}
 		return ret;
 	}