diff mbox series

[ovs-dev,v3,1/2] skbuff: Add skb_network_trim

Message ID 1516675339-962-1-git-send-email-eswierk@skyportsystems.com
State Not Applicable
Headers show
Series [ovs-dev,v3,1/2] skbuff: Add skb_network_trim | expand

Commit Message

Kumar, Rohit via dev Jan. 23, 2018, 2:42 a.m. UTC
IPv4 and IPv6 packets may arrive with lower-layer padding that is not
included in the L3 length. For example, a short IPv4 packet may have
up to 6 bytes of padding following the IP payload when received on an
Ethernet device with a minimum packet length of 64 bytes.

Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
and help() in nf_conntrack_ftp) assume skb->len reflects the length of
the L3 header and payload, rather than referring back to
ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
lower-layer padding.

In the normal IPv4 receive path, ip_rcv() trims the packet to
ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
in the br_netfilter receive path, br_validate_ipv4() and
br_validate_ipv6() trim the packet to the L3 length before invoking
netfilter hooks.

Currently the openvswitch receive path does not perform such trimming,
which will be fixed by the next patch. In preparation, this patch adds
a generic skb_network_trim() function.

Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
---
 include/linux/skbuff.h |  2 ++
 net/core/skbuff.c      | 44 ++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

Comments

Pravin Shelar Jan. 23, 2018, 5:40 a.m. UTC | #1
On Mon, Jan 22, 2018 at 6:42 PM, Ed Swierk <eswierk@skyportsystems.com> wrote:
> IPv4 and IPv6 packets may arrive with lower-layer padding that is not
> included in the L3 length. For example, a short IPv4 packet may have
> up to 6 bytes of padding following the IP payload when received on an
> Ethernet device with a minimum packet length of 64 bytes.
>
> Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
> and help() in nf_conntrack_ftp) assume skb->len reflects the length of
> the L3 header and payload, rather than referring back to
> ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
> lower-layer padding.
>
> In the normal IPv4 receive path, ip_rcv() trims the packet to
> ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
> path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
> in the br_netfilter receive path, br_validate_ipv4() and
> br_validate_ipv6() trim the packet to the L3 length before invoking
> netfilter hooks.
>
> Currently the openvswitch receive path does not perform such trimming,
> which will be fixed by the next patch. In preparation, this patch adds
> a generic skb_network_trim() function.
>
> Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>

Acked-by: Pravin B Shelar <pshelar@ovn.org>

Thanks.
David Miller Jan. 24, 2018, 1:05 a.m. UTC | #2
From: Ed Swierk <eswierk@skyportsystems.com>
Date: Mon, 22 Jan 2018 18:42:18 -0800

> IPv4 and IPv6 packets may arrive with lower-layer padding that is not
> included in the L3 length. For example, a short IPv4 packet may have
> up to 6 bytes of padding following the IP payload when received on an
> Ethernet device with a minimum packet length of 64 bytes.
> 
> Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
> and help() in nf_conntrack_ftp) assume skb->len reflects the length of
> the L3 header and payload, rather than referring back to
> ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
> lower-layer padding.
> 
> In the normal IPv4 receive path, ip_rcv() trims the packet to
> ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
> path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
> in the br_netfilter receive path, br_validate_ipv4() and
> br_validate_ipv6() trim the packet to the L3 length before invoking
> netfilter hooks.
> 
> Currently the openvswitch receive path does not perform such trimming,
> which will be fixed by the next patch. In preparation, this patch adds
> a generic skb_network_trim() function.
> 
> Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>

Unfortunately this helper needs some more work.

It really doesn't belong in generic skbuff.c code, as it adds a whole
bunch of protocol specific header accesses and interpretation.

Also, there is only a very specific context in which it can be used
(receive) and this is not clear at all from the name.

To be quite honest, since there will be no other users of this helper
right now, just put it into the OVS conntrack.c code.

Thank you.
Pravin Shelar Jan. 24, 2018, 4:34 a.m. UTC | #3
On Mon, Jan 22, 2018 at 6:42 PM, Ed Swierk <eswierk@skyportsystems.com> wrote:
> IPv4 and IPv6 packets may arrive with lower-layer padding that is not
> included in the L3 length. For example, a short IPv4 packet may have
> up to 6 bytes of padding following the IP payload when received on an
> Ethernet device with a minimum packet length of 64 bytes.
>
> Higher-layer processing functions in netfilter (e.g. nf_ip_checksum(),
> and help() in nf_conntrack_ftp) assume skb->len reflects the length of
> the L3 header and payload, rather than referring back to
> ip_hdr->tot_len or ipv6_hdr->payload_len, and get confused by
> lower-layer padding.
>
> In the normal IPv4 receive path, ip_rcv() trims the packet to
> ip_hdr->tot_len before invoking netfilter hooks. In the IPv6 receive
> path, ip6_rcv() does the same using ipv6_hdr->payload_len. Similarly
> in the br_netfilter receive path, br_validate_ipv4() and
> br_validate_ipv6() trim the packet to the L3 length before invoking
> netfilter hooks.
>
> Currently the openvswitch receive path does not perform such trimming,
> which will be fixed by the next patch. In preparation, this patch adds
> a generic skb_network_trim() function.
>
> Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
> ---
>  include/linux/skbuff.h |  2 ++
>  net/core/skbuff.c      | 44 ++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 46 insertions(+)
>
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index 051e093..0478645 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -4018,6 +4018,8 @@ struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb,
>                                      unsigned int transport_len,
>                                      __sum16(*skb_chkf)(struct sk_buff *skb));
>
> +int skb_network_trim(struct sk_buff *skb);
> +
>  /**
>   * skb_head_is_locked - Determine if the skb->head is locked down
>   * @skb: skb to check
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 15fa5ba..cef3d1e 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4743,6 +4743,50 @@ struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb,
>  }
>  EXPORT_SYMBOL(skb_checksum_trimmed);
>
> +/**
> + * skb_network_trim - trim skb to length specified by the network header
> + * @skb: the skb to trim
> + *
> + * Trims the skb to the length specified by the IP/IPv6 header,
> + * removing any trailing lower-layer padding. This prepares the skb
> + * for higher-layer processing that assumes skb->len excludes padding.
> + *
> + * Leaves the skb alone if the protocol is not IP or IPv6. Frees the
> + * skb on error.
> + *
> + * Caller needs to pull the skb to the network header.
> + */
> +int skb_network_trim(struct sk_buff *skb)
> +{
> +       unsigned int len;
> +       int err = -ENOMEM;
> +
> +       switch (skb->protocol) {
> +       case htons(ETH_P_IP):
> +               if (!pskb_may_pull(skb, sizeof(struct iphdr)))
> +                       goto out;
Since you are going to move this to OVS specific code, can you remove
this skb-pull, which is not required in OVS code path.

> +               len = ntohs(ip_hdr(skb)->tot_len);
> +               break;
> +       case htons(ETH_P_IPV6):
> +               if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
> +                       goto out;
> +               len = sizeof(struct ipv6hdr)
> +                       + ntohs(ipv6_hdr(skb)->payload_len);
diff mbox series

Patch

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 051e093..0478645 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4018,6 +4018,8 @@  struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb,
 				     unsigned int transport_len,
 				     __sum16(*skb_chkf)(struct sk_buff *skb));
 
+int skb_network_trim(struct sk_buff *skb);
+
 /**
  * skb_head_is_locked - Determine if the skb->head is locked down
  * @skb: skb to check
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 15fa5ba..cef3d1e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4743,6 +4743,50 @@  struct sk_buff *skb_checksum_trimmed(struct sk_buff *skb,
 }
 EXPORT_SYMBOL(skb_checksum_trimmed);
 
+/**
+ * skb_network_trim - trim skb to length specified by the network header
+ * @skb: the skb to trim
+ *
+ * Trims the skb to the length specified by the IP/IPv6 header,
+ * removing any trailing lower-layer padding. This prepares the skb
+ * for higher-layer processing that assumes skb->len excludes padding.
+ *
+ * Leaves the skb alone if the protocol is not IP or IPv6. Frees the
+ * skb on error.
+ *
+ * Caller needs to pull the skb to the network header.
+ */
+int skb_network_trim(struct sk_buff *skb)
+{
+	unsigned int len;
+	int err = -ENOMEM;
+
+	switch (skb->protocol) {
+	case htons(ETH_P_IP):
+		if (!pskb_may_pull(skb, sizeof(struct iphdr)))
+			goto out;
+		len = ntohs(ip_hdr(skb)->tot_len);
+		break;
+	case htons(ETH_P_IPV6):
+		if (!pskb_may_pull(skb, sizeof(struct ipv6hdr)))
+			goto out;
+		len = sizeof(struct ipv6hdr)
+			+ ntohs(ipv6_hdr(skb)->payload_len);
+		break;
+	default:
+		len = skb->len;
+	}
+
+	err = pskb_trim_rcsum(skb, len);
+
+out:
+	if (err)
+		kfree_skb(skb);
+
+	return err;
+}
+EXPORT_SYMBOL(skb_network_trim);
+
 void __skb_warn_lro_forwarding(const struct sk_buff *skb)
 {
 	net_warn_ratelimited("%s: received packets cannot be forwarded while LRO is enabled\n",