| Message ID | 423b9ce3b45782c09a2fd9c65ad6674a9abb7c72.1778614451.git.michael.bommarito@gmail.com |
|---|---|
| State | Handled Elsewhere, archived |
| Headers | show |
| Series | ipv4: harden against ihl < 5 IP_HDRINCL packets | expand |
On Tue, May 12, 2026 at 04:51:15PM -0400, Michael Bommarito wrote: > > diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c > index 4366cbac3f06..8fa31bdf9792 100644 > --- a/net/ipv4/ah4.c > +++ b/net/ipv4/ah4.c > @@ -137,7 +137,7 @@ static void ah_output_done(void *data, int err) > top_iph->tos = iph->tos; > top_iph->ttl = iph->ttl; > top_iph->frag_off = iph->frag_off; > - if (top_iph->ihl != 5) { > + if (top_iph->ihl > 5) { As I have said before, if ihl is less than 5, then it's invalid to access any fields from the IP header (in fact you can't even access ihl itself if it's that short). So if these packets are getting this far into our stack, then things are very wrong indeed. Now I understand that this is already happening so we have to accept it. But we should try to fix each and one of these issues as other places in our IP stack can very much break if you bombard them with these bogus packets. To further that end, I suggest that you add a WARN_ON_ONCE for the case (top_iph->ihl < 5) and put that at the very start of the AH input function so that i can bail out straight away. Thanks,
diff --git a/net/ipv4/ah4.c b/net/ipv4/ah4.c index 4366cbac3f06..8fa31bdf9792 100644 --- a/net/ipv4/ah4.c +++ b/net/ipv4/ah4.c @@ -137,7 +137,7 @@ static void ah_output_done(void *data, int err) top_iph->tos = iph->tos; top_iph->ttl = iph->ttl; top_iph->frag_off = iph->frag_off; - if (top_iph->ihl != 5) { + if (top_iph->ihl > 5) { top_iph->daddr = iph->daddr; memcpy(top_iph+1, iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); } @@ -197,7 +197,7 @@ static int ah_output(struct xfrm_state *x, struct sk_buff *skb) iph->ttl = top_iph->ttl; iph->frag_off = top_iph->frag_off; - if (top_iph->ihl != 5) { + if (top_iph->ihl > 5) { iph->daddr = top_iph->daddr; memcpy(iph+1, top_iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); err = ip_clear_mutable_options(top_iph, &top_iph->daddr); @@ -253,7 +253,7 @@ static int ah_output(struct xfrm_state *x, struct sk_buff *skb) top_iph->tos = iph->tos; top_iph->ttl = iph->ttl; top_iph->frag_off = iph->frag_off; - if (top_iph->ihl != 5) { + if (top_iph->ihl > 5) { top_iph->daddr = iph->daddr; memcpy(top_iph+1, iph+1, top_iph->ihl*4 - sizeof(struct iphdr)); }
ah_output() and ah_output_done() copy the IPv4 options area with if (top_iph->ihl != 5) { memcpy(dst, src, top_iph->ihl * 4 - sizeof(struct iphdr)); } The "!= 5" guard correctly excludes the no-options case (ihl == 5) and allows ihl > 5 where options are present. It does NOT exclude ihl < 5. For ihl in [0, 4], top_iph->ihl * 4 is less than sizeof(struct iphdr) (20); the subtraction is computed as int, becomes negative, and is then implicitly converted to size_t at the memcpy() call. The resulting length is close to SIZE_MAX and memcpy walks off the slab allocation backing the skb's network header. With the preceding patch ("ipv4: raw: reject IP_HDRINCL packets with ihl < 5") in place, an ihl < 5 packet from a raw IP_HDRINCL socket is rejected before it reaches the local-output path. However, post-LOCAL_OUT hook mangling (nftables payload-set, NFQUEUE reinject) can still rewrite the IPv4 header after the raw_send_hdrinc validation has run and deliver an ihl < 5 packet to ah_output(). Reachability of this path requires CAP_NET_ADMIN in the relevant netns; it is a smaller class than the original CAP_NET_RAW path but it is not zero. Independently of the post-LOCAL_OUT mangling question, the AH consumer should not contain a memcpy whose size is derived from an attacker-influenced field without a floor. Change the guard to "top_iph->ihl > 5" at all three sites: - ah_output_done() (the .complete callback path) - ah_output() (the synchronous options-copy site) - ah_output() (the post-hash restore site) Behavior for valid packets (ihl in {5, 6, ..., 15}) is unchanged. For malformed packets with ihl < 5, the options copy is cleanly skipped; the malformed field no longer becomes a huge memcpy length. This is the defense-in-depth half of the series; the upstream sanity check in the preceding patch is the primary fix. A mirror-pattern audit found no analogous bug in ah_input(), ip_clear_mutable_options(), or net/ipv6/ah6.c (IPv6 has a fixed-length header and no IP_HDRINCL equivalent for crafting an ihl < 5 ipv6hdr). Reproduced on UML + KASAN: kernel-mode fault at addr 0x0 with memcpy_orig at the crash site on a pre-fix kernel. The AH guard was verified by forcing the same packets through xfrm: the xfrm state counter incremented and no KASAN splat or panic occurred. With the preceding patch in this series, the original raw IP_HDRINCL path is rejected before AH. Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Michael Bommarito <michael.bommarito@gmail.com> --- net/ipv4/ah4.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)