[net] tun: correct header offsets in napi frags mode

Message ID 20200528170532.215352-1-willemdebruijn.kernel@gmail.com
State Changes Requested
Delegated to: David Miller

Commit Message

Willem de Bruijn May 28, 2020, 5:05 p.m. UTC
From: Willem de Bruijn <willemb@google.com>

Tun in IFF_NAPI_FRAGS mode calls napi_gro_frags. Unlike netif_rx and
netif_gro_receive, this expects skb->data to point to the mac layer.

But skb_probe_transport_header, __skb_get_hash_symmetric, and
xdp_do_generic in tun_get_user need skb->data to point to the network
header. Flow dissection also needs skb->protocol set, so
eth_type_trans has to be called.

Temporarily pull ETH_HLEN to make control flow the same for frags and
not frags. Then push the header just before calling napi_gro_frags.

Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
Signed-off-by: Willem de Bruijn <willemb@google.com>
---
 drivers/net/tun.c | 14 ++++++++++----
 1 file changed, 10 insertions(+), 4 deletions(-)
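
The offset bookkeeping described in the commit message can be pictured with a small user-space model. This is only a sketch under stated assumptions: `toy_skb`, `toy_pull`, `toy_push`, and `toy_roundtrip` are hypothetical stand-ins, not kernel APIs, and only the value of `ETH_HLEN` (14) matches the real constant. It shows the data offset moving from the MAC header to the network header and back again.

```c
#include <assert.h>
#include <stddef.h>

#define ETH_HLEN 14 /* Ethernet header length, same value as in the kernel */

/* Hypothetical stand-in for the two sk_buff fields that matter here. */
struct toy_skb {
	size_t data; /* offset of the current header within the buffer */
	size_t len;  /* bytes from data to the end of the packet */
};

/* Advance data past n bytes; returns 0 on success, -1 if too short. */
static int toy_pull(struct toy_skb *skb, size_t n)
{
	if (n > skb->len)
		return -1;
	skb->data += n;
	skb->len -= n;
	return 0;
}

/* Move data back by n bytes; returns 0 on success, -1 if no headroom. */
static int toy_push(struct toy_skb *skb, size_t n)
{
	if (n > skb->data)
		return -1;
	skb->data -= n;
	skb->len += n;
	return 0;
}

/* Walk through the sequence: pull for dissection, push before GRO. */
static size_t toy_roundtrip(void)
{
	struct toy_skb skb = { .data = 0, .len = 60 };

	toy_pull(&skb, ETH_HLEN);     /* data now at the network header */
	assert(skb.data == ETH_HLEN);
	toy_push(&skb, ETH_HLEN);     /* data back at the MAC header */
	return skb.data;
}
```

In this model the pull corresponds to what eth_type_trans does to skb->data, and the push corresponds to the skb_push(skb, ETH_HLEN) added just before the napi_gro_frags path.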

Comments

Petar Penkov May 28, 2020, 8:34 p.m. UTC | #1
On Thu, May 28, 2020 at 10:07 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> From: Willem de Bruijn <willemb@google.com>
>
> Tun in IFF_NAPI_FRAGS mode calls napi_gro_frags. Unlike netif_rx and
> netif_gro_receive, this expects skb->data to point to the mac layer.
>
> But skb_probe_transport_header, __skb_get_hash_symmetric, and
> xdp_do_generic in tun_get_user need skb->data to point to the network
> header. Flow dissection also needs skb->protocol set, so
> eth_type_trans has to be called.
>
> Temporarily pull ETH_HLEN to make control flow the same for frags and
> not frags. Then push the header just before calling napi_gro_frags.
>
> Fixes: 90e33d459407 ("tun: enable napi_gro_frags() for TUN/TAP driver")
> Signed-off-by: Willem de Bruijn <willemb@google.com>

Acked-by: Petar Penkov <ppenkov@google.com>

> ---
>  drivers/net/tun.c | 14 ++++++++++----
>  1 file changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 44889eba1dbc..b984733c6c31 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -1871,8 +1871,11 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>                 skb->dev = tun->dev;
>                 break;
>         case IFF_TAP:
> -               if (!frags)
> -                       skb->protocol = eth_type_trans(skb, tun->dev);
> +               if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
> +                       err = -ENOMEM;
> +                       goto drop;
> +               }
> +               skb->protocol = eth_type_trans(skb, tun->dev);
>                 break;
>         }
>
> @@ -1929,9 +1932,12 @@ static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
>         }
>
>         if (frags) {
> +               u32 headlen;
> +
>                 /* Exercise flow dissector code path. */
> -               u32 headlen = eth_get_headlen(tun->dev, skb->data,
> -                                             skb_headlen(skb));
> +               skb_push(skb, ETH_HLEN);
> +               headlen = eth_get_headlen(tun->dev, skb->data,
> +                                         skb_headlen(skb));
>
>                 if (unlikely(headlen > skb_headlen(skb))) {
>                         this_cpu_inc(tun->pcpu_stats->rx_dropped);
> --
> 2.27.0.rc0.183.gde8f92d652-goog
>

David Miller May 30, 2020, 12:27 a.m. UTC | #2
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Thu, 28 May 2020 13:05:32 -0400

> Temporarily pull ETH_HLEN to make control flow the same for frags and
> not frags. Then push the header just before calling napi_gro_frags.
 ...
>  	case IFF_TAP:
> -		if (!frags)
> -			skb->protocol = eth_type_trans(skb, tun->dev);
> +		if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
> +			err = -ENOMEM;
> +			goto drop;
> +		}
> +		skb->protocol = eth_type_trans(skb, tun->dev);
 ...
>  		/* Exercise flow dissector code path. */
> -		u32 headlen = eth_get_headlen(tun->dev, skb->data,
> -					      skb_headlen(skb));
> +		skb_push(skb, ETH_HLEN);
> +		headlen = eth_get_headlen(tun->dev, skb->data,
> +					  skb_headlen(skb));

I hate to be a stickler on wording in the commit message, but the
change is not really "pulling" the ethernet header from the SKB.

Instead it is invoking pskb_may_pull() which just makes sure the
header is there in the linear SKB data area.

Can you please refine this description and resubmit?

Thank you.

Willem de Bruijn May 30, 2020, 3:14 a.m. UTC | #3
On Fri, May 29, 2020 at 8:27 PM David Miller <davem@davemloft.net> wrote:
>
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Thu, 28 May 2020 13:05:32 -0400
>
> > Temporarily pull ETH_HLEN to make control flow the same for frags and
> > not frags. Then push the header just before calling napi_gro_frags.
>  ...
> >       case IFF_TAP:
> > -             if (!frags)
> > -                     skb->protocol = eth_type_trans(skb, tun->dev);
> > +             if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
> > +                     err = -ENOMEM;
> > +                     goto drop;
> > +             }
> > +             skb->protocol = eth_type_trans(skb, tun->dev);
>  ...
> >               /* Exercise flow dissector code path. */
> > -             u32 headlen = eth_get_headlen(tun->dev, skb->data,
> > -                                           skb_headlen(skb));
> > +             skb_push(skb, ETH_HLEN);
> > +             headlen = eth_get_headlen(tun->dev, skb->data,
> > +                                       skb_headlen(skb));
>
> I hate to be a stickler on wording in the commit message, but the
> change is not really "pulling" the ethernet header from the SKB.
>
> Instead it is invoking pskb_may_pull() which just makes sure the
> header is there in the linear SKB data area.
>
> Can you please refine this description and resubmit?

Of course. How is this

"
    Ensure the link layer header lies in linear as eth_type_trans pulls
    ETH_HLEN. Then take the same code paths for frags as for not frags.
    Push the link layer header back just before calling napi_gro_frags.

    By pulling up to ETH_HLEN from frag0 into linear, this disables the
    frag0 optimization in the special case when IFF_NAPI_FRAGS is
    called with zero length iov[0] (and thus empty skb->linear).
"

Seemed good to add the extra clarification. I don't see a reasonable way
to avoid that consequence, especially as I cannot restore the first skb frag
(iov[1]) if it was exactly ETH_HLEN bytes and thus freed by __skb_pull_tail.
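
The distinction raised in the review, and the consequence described in the proposed text, can be sketched with a toy user-space model. Everything here is an assumption made for illustration: `toy_pkt` and `toy_may_pull` are hypothetical names, the packet has a single fragment, and the real pskb_may_pull path also deals with multiple frags, shared data, and reallocation, none of which is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define ETH_HLEN 14 /* Ethernet header length, same value as in the kernel */

/* Hypothetical packet: a linear area plus a single fragment. */
struct toy_pkt {
	unsigned char linear[64];
	size_t data;        /* offset of the current header in linear[] */
	size_t linear_len;  /* valid bytes in linear[] starting at data */
	unsigned char frag[64];
	size_t frag_len;    /* valid bytes in frag[]; 0 means frag is gone */
};

/*
 * Make sure at least n bytes are in the linear area, copying from the
 * fragment if needed. The data offset is never changed.
 */
static bool toy_may_pull(struct toy_pkt *p, size_t n)
{
	size_t need;

	if (n <= p->linear_len)
		return true;            /* already linear, nothing to do */
	need = n - p->linear_len;
	if (need > p->frag_len || p->data + n > sizeof(p->linear))
		return false;           /* packet too short or no room */
	memcpy(p->linear + p->data + p->linear_len, p->frag, need);
	memmove(p->frag, p->frag + need, p->frag_len - need);
	p->linear_len += need;
	p->frag_len -= need;
	return true;
}
```

With an empty linear area and a fragment of exactly ETH_HLEN bytes, toy_may_pull(&p, ETH_HLEN) returns true, leaves p.data at 0, and drops p.frag_len to 0. That is the case in which the first fragment is consumed entirely and cannot be restored afterwards.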

Willem de Bruijn May 30, 2020, 7:47 p.m. UTC | #4
On Fri, May 29, 2020 at 11:14 PM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> On Fri, May 29, 2020 at 8:27 PM David Miller <davem@davemloft.net> wrote:
> >
> > From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> > Date: Thu, 28 May 2020 13:05:32 -0400
> >
> > > Temporarily pull ETH_HLEN to make control flow the same for frags and
> > > not frags. Then push the header just before calling napi_gro_frags.
> >  ...
> > >       case IFF_TAP:
> > > -             if (!frags)
> > > -                     skb->protocol = eth_type_trans(skb, tun->dev);
> > > +             if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
> > > +                     err = -ENOMEM;
> > > +                     goto drop;
> > > +             }
> > > +             skb->protocol = eth_type_trans(skb, tun->dev);
> >  ...
> > >               /* Exercise flow dissector code path. */
> > > -             u32 headlen = eth_get_headlen(tun->dev, skb->data,
> > > -                                           skb_headlen(skb));
> > > +             skb_push(skb, ETH_HLEN);
> > > +             headlen = eth_get_headlen(tun->dev, skb->data,
> > > +                                       skb_headlen(skb));
> >
> > I hate to be a stickler on wording in the commit message, but the
> > change is not really "pulling" the ethernet header from the SKB.
> >
> > Instead it is invoking pskb_may_pull() which just makes sure the
> > header is there in the linear SKB data area.
> >
> > Can you please refine this description and resubmit?
>
> Of course. How is this
>
> "
>     Ensure the link layer header lies in linear as eth_type_trans pulls
>     ETH_HLEN. Then take the same code paths for frags as for not frags.
>     Push the link layer header back just before calling napi_gro_frags.
>
>     By pulling up to ETH_HLEN from frag0 into linear, this disables the
>     frag0 optimization in the special case when IFF_NAPI_FRAGS is
>     called with zero length iov[0] (and thus empty skb->linear).
> "
>
> Seemed good to add the extra clarification. I don't see a reasonable way
> to avoid that consequence, especially as I cannot restore the first skb frag
> (iov[1]) if it was exactly ETH_HLEN bytes and thus freed by __skb_pull_tail.

Sent. Probably faster that way. Do let me know if still too fast and
loose with wording. I can always do a v3.

Or I could add some frags gymnastics to try to maintain the frag0
optimization when iov[1] > ETH_HLEN and frag0 thus can be restored. That
just makes for a more complicated fix.

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 44889eba1dbc..b984733c6c31 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -1871,8 +1871,11 @@  static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 		skb->dev = tun->dev;
 		break;
 	case IFF_TAP:
-		if (!frags)
-			skb->protocol = eth_type_trans(skb, tun->dev);
+		if (frags && !pskb_may_pull(skb, ETH_HLEN)) {
+			err = -ENOMEM;
+			goto drop;
+		}
+		skb->protocol = eth_type_trans(skb, tun->dev);
 		break;
 	}
 
@@ -1929,9 +1932,12 @@  static ssize_t tun_get_user(struct tun_struct *tun, struct tun_file *tfile,
 	}
 
 	if (frags) {
+		u32 headlen;
+
 		/* Exercise flow dissector code path. */
-		u32 headlen = eth_get_headlen(tun->dev, skb->data,
-					      skb_headlen(skb));
+		skb_push(skb, ETH_HLEN);
+		headlen = eth_get_headlen(tun->dev, skb->data,
+					  skb_headlen(skb));
 
 		if (unlikely(headlen > skb_headlen(skb))) {
 			this_cpu_inc(tun->pcpu_stats->rx_dropped);