Message ID | 20200718091732.8761-1-srirakr2@cisco.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | [v2] AF_PACKET doesnt strip VLAN information | expand |
On Sat, Jul 18, 2020 at 5:25 AM Sriram Krishnan <srirakr2@cisco.com> wrote: > > When an application sends with AF_PACKET and places a vlan header on > the raw packet; then the AF_PACKET needs to move the tag into the skb > so that it gets processed normally through the rest of the transmit > path. > > This is particularly a problem on Hyper-V where the host only allows > vlan in the offload info. Shouldn't this be resolved in that driver, then? Packet sockets are not the only way to introduce packets in the kernel. Tuntap would be another.
Hi Sriram, Thank you for the patch! Perhaps something to improve: [auto build test WARNING on hnaz-linux-mm/master] [also build test WARNING on sparc-next/master linus/master v5.8-rc5 next-20200717] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch] url: https://github.com/0day-ci/linux/commits/Sriram-Krishnan/AF_PACKET-doesnt-strip-VLAN-information/20200718-172707 base: https://github.com/hnaz/linux-mm master config: s390-randconfig-s031-20200719 (attached as .config) compiler: s390-linux-gcc (GCC) 9.3.0 reproduce: wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.2-49-g707c5017-dirty # save the attached .config to linux build tree COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' ARCH=s390 If you fix the issue, kindly add following tag as appropriate Reported-by: kernel test robot <lkp@intel.com> sparse warnings: (new ones prefixed by >>) >> net/packet/af_packet.c:1877:52: sparse: sparse: incorrect type in argument 2 (different base types) @@ expected unsigned short [usertype] *vlan_tci @@ got restricted __be16 * @@ >> net/packet/af_packet.c:1877:52: sparse: expected unsigned short [usertype] *vlan_tci >> net/packet/af_packet.c:1877:52: sparse: got restricted __be16 * >> net/packet/af_packet.c:1881:64: sparse: sparse: incorrect type in argument 3 (different base types) @@ expected unsigned short [usertype] vlan_tci @@ got restricted __be16 [addressable] [usertype] vlan_tci @@ net/packet/af_packet.c:1881:64: sparse: expected unsigned short [usertype] vlan_tci >> net/packet/af_packet.c:1881:64: sparse: got restricted __be16 [addressable] [usertype] vlan_tci vim +1877 net/packet/af_packet.c 1859 1860 static int packet_parse_headers(struct sk_buff *skb, struct socket *sock) 1861 { 1862 if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) && 1863 sock->type == SOCK_RAW) { 1864 __be16 ethertype; 1865 1866 skb_reset_mac_header(skb); 1867 1868 ethertype = eth_hdr(skb)->h_proto; 1869 /* 1870 * If Vlan tag is present in the packet 1871 * move it to skb 1872 */ 1873 if (eth_type_vlan(ethertype)) { 1874 int err; 1875 __be16 vlan_tci; 1876 > 1877 err = __skb_vlan_pop(skb, &vlan_tci); 1878 if (unlikely(err)) 1879 return err; 1880 > 1881 __vlan_hwaccel_put_tag(skb, ethertype, vlan_tci); 1882 } 1883 1884 skb->protocol = dev_parse_header_protocol(skb); 1885 } 1886 1887 skb_probe_transport_header(skb); 1888 return 0; 1889 } 1890 --- 0-DAY CI Kernel Test Service, Intel Corporation https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2) <srirakr2@cisco.com> wrote: > > +Stephen Hemminger > > Hi Willem, > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses. > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts Please use plain text to respond. HTML replies do not reach the list. Can you be more precise in which other code besides the hyper-v driver is affected? Do you have an example? This is a resubmit of the original patch. My previous questions/concerns remain valid: - if the function can now fail, all callers must be updated to detect and handle that - any solution should probably address all inputs into the tx path: packet sockets, tuntap, virtio-net - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not sockets that set ETH_P_8021Q - which code in the transmit stack requires the tag to be in the skb, and does this problem after this patch still persist for Q-in-Q?
I have moved the code to the driver and pushed a new patch due to the below highlighted issues. Stephen H, Please let me know if you have any concerns localising the changes to the netvsc driver. Thanks, Sriram On 20/07/20, 7:23 PM, "Willem de Bruijn" <willemdebruijn.kernel@gmail.com> wrote: On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2) <srirakr2@cisco.com> wrote: > > +Stephen Hemminger > > Hi Willem, > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses. > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts Please use plain text to respond. HTML replies do not reach the list. Can you be more precise in which other code besides the hyper-v driver is affected? Do you have an example? This is a resubmit of the original patch. My previous questions/concerns remain valid: - if the function can now fail, all callers must be updated to detect and handle that - any solution should probably address all inputs into the tx path: packet sockets, tuntap, virtio-net - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not sockets that set ETH_P_8021Q - which code in the transmit stack requires the tag to be in the skb, and does this problem after this patch still persist for Q-in-Q?
On Mon, 20 Jul 2020 09:52:27 -0400 Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2) > <srirakr2@cisco.com> wrote: > > > > +Stephen Hemminger > > > > Hi Willem, > > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses. > > > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts > > Please use plain text to respond. HTML replies do not reach the list. > > Can you be more precise in which other code besides the hyper-v driver > is affected? Do you have an example? > > This is a resubmit of the original patch. My previous > questions/concerns remain valid: > > - if the function can now fail, all callers must be updated to detect > and handle that > > - any solution should probably address all inputs into the tx path: > packet sockets, tuntap, virtio-net > > - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not > sockets that set ETH_P_8021Q > > - which code in the transmit stack requires the tag to be in the skb, > and does this problem after this patch still persist for Q-in-Q? It matters because the problem is generic, not just to the netvsc driver. For example, BPF programs and netfilter rules will see different packets when send is through AF_PACKET than they would see for sends from the kernel stack. Presenting uniform data to the lower layers makes sense.
On Mon, Jul 20, 2020 at 4:57 PM Stephen Hemminger <stephen@networkplumber.org> wrote: > > On Mon, 20 Jul 2020 09:52:27 -0400 > Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > > On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2) > > <srirakr2@cisco.com> wrote: > > > > > > +Stephen Hemminger > > > > > > Hi Willem, > > > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses. > > > > > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts > > > > Please use plain text to respond. HTML replies do not reach the list. > > > > Can you be more precise in which other code besides the hyper-v driver > > is affected? Do you have an example? > > > > This is a resubmit of the original patch. My previous > > questions/concerns remain valid: > > > > - if the function can now fail, all callers must be updated to detect > > and handle that > > > > - any solution should probably address all inputs into the tx path: > > packet sockets, tuntap, virtio-net > > > > - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not > > sockets that set ETH_P_8021Q > > > > - which code in the transmit stack requires the tag to be in the skb, > > and does this problem after this patch still persist for Q-in-Q? > > It matters because the problem is generic, not just to the netvsc driver. > For example, BPF programs and netfilter rules will see different packets > when send is through AF_PACKET than they would see for sends from the > kernel stack. > > Presenting uniform data to the lower layers makes sense. Are all forwarded and locally generated packets guaranteed to always have VLAN information in the tag (so that this is indeed only an issue with input from userspace, through tuntap, virtio and packet sockets)? I guess the first might be assured due to this in __netif_receive_skb_core: if (skb->protocol == cpu_to_be16(ETH_P_8021Q) || skb->protocol == cpu_to_be16(ETH_P_8021AD)) { skb = skb_vlan_untag(skb); if (unlikely(!skb)) goto out; } and the second by this in vlan_dev_hard_start_xmit: if (veth->h_vlan_proto != vlan->vlan_proto || vlan->flags & VLAN_FLAG_REORDER_HDR) { u16 vlan_tci; vlan_tci = vlan->vlan_id; vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb->priority); __vlan_hwaccel_put_tag(skb, vlan->vlan_proto, vlan_tci); } But I don't know this code very well, so that is based on a very cursory glance only. Might well be missing other paths. (update: I think pktgen is another example.) Netfilter and BPF still need to handle tags in the packet for Q-in-Q, right? So does this actually simplify their logic? If the above holds and Q-in-Q is not a problem, then doing the same on ingress from userspace may make sense. I don't know the kind of BPF or netfilter programs what would be affected, and how. Then it would be good to all those inputs at once to really plug the hole. See also virtio_net_hdr_to_skb for another example of code that applies to all of tuntap, virtio, pf_packet and uml.
On Mon, 20 Jul 2020 17:22:49 -0400 Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > On Mon, Jul 20, 2020 at 4:57 PM Stephen Hemminger > <stephen@networkplumber.org> wrote: > > > > On Mon, 20 Jul 2020 09:52:27 -0400 > > Willem de Bruijn <willemdebruijn.kernel@gmail.com> wrote: > > > > > On Mon, Jul 20, 2020 at 12:27 AM Sriram Krishnan (srirakr2) > > > <srirakr2@cisco.com> wrote: > > > > > > > > +Stephen Hemminger > > > > > > > > Hi Willem, > > > > Thanks for looking into the code, I understand that this is more of a generic problem wherein many of the filtering functions assume the vlan tag to be in the skb rather than in the packet. Hence we moved the fix from the driver to the common AF packet that our solution uses. > > > > > > > > I recall from the v1 of the patch you had mentioned other common areas where this fix might be relevant (such as tap/tun), but I'm afraid I cant comprehensively test those patches out. Please let me know your thoughts > > > > > > Please use plain text to respond. HTML replies do not reach the list. > > > > > > Can you be more precise in which other code besides the hyper-v driver > > > is affected? Do you have an example? > > > > > > This is a resubmit of the original patch. My previous > > > questions/concerns remain valid: > > > > > > - if the function can now fail, all callers must be updated to detect > > > and handle that > > > > > > - any solution should probably address all inputs into the tx path: > > > packet sockets, tuntap, virtio-net > > > > > > - this only addresses packet sockets with ETH_P_ALL/ETH_P_NONE. Not > > > sockets that set ETH_P_8021Q > > > > > > - which code in the transmit stack requires the tag to be in the skb, > > > and does this problem after this patch still persist for Q-in-Q? > > > > It matters because the problem is generic, not just to the netvsc driver. > > For example, BPF programs and netfilter rules will see different packets > > when send is through AF_PACKET than they would see for sends from the > > kernel stack. > > > > Presenting uniform data to the lower layers makes sense. > > Are all forwarded and locally generated packets guaranteed to always > have VLAN information in the tag (so that this is indeed only an issue > with input from userspace, through tuntap, virtio and packet sockets)? > > I guess the first might be assured due to this in __netif_receive_skb_core: > > if (skb->protocol == cpu_to_be16(ETH_P_8021Q) || > skb->protocol == cpu_to_be16(ETH_P_8021AD)) { > skb = skb_vlan_untag(skb); > if (unlikely(!skb)) > goto out; > } > > and the second by this in vlan_dev_hard_start_xmit: > > if (veth->h_vlan_proto != vlan->vlan_proto || > vlan->flags & VLAN_FLAG_REORDER_HDR) { > u16 vlan_tci; > vlan_tci = vlan->vlan_id; > vlan_tci |= vlan_dev_get_egress_qos_mask(dev, skb->priority); > __vlan_hwaccel_put_tag(skb, vlan->vlan_proto, vlan_tci); > } > > But I don't know this code very well, so that is based on a very > cursory glance only. Might well be missing other paths. (update: I > think pktgen is another example.) > > Netfilter and BPF still need to handle tags in the packet for Q-in-Q, > right? So does this actually simplify their logic? > > If the above holds and Q-in-Q is not a problem, then doing the same on > ingress from userspace may make sense. I don't know the kind of BPF > or netfilter programs what would be affected, and how. > > Then it would be good to all those inputs at once to really plug the hole. > See also virtio_net_hdr_to_skb for another example of code that > applies to all of tuntap, virtio, pf_packet and uml. Older versions of Linux used to handle outer VLAN differentl based on what the driver supported. It was a mess. Some drivers and code paths would strip and put in meta-data, some would leave it in skb data. But in recent (like 5 yrs), the kernel has tried to be more uniform and only have vlan as skb tag. It looks like AF_PACKET was overlooked at that time.
From: Stephen Hemminger <stephen@networkplumber.org> Date: Mon, 20 Jul 2020 13:56:50 -0700 > It matters because the problem is generic, not just to the netvsc driver. > For example, BPF programs and netfilter rules will see different packets > when send is through AF_PACKET than they would see for sends from the > kernel stack. > > Presenting uniform data to the lower layers makes sense. The issue here is that for hyperv, vlan offloading is not a "may" but a "must" and I've never understood it to have that meaning. And I still haven't heard what is going to happen in Q-in-Q situations even with the ugly hyperv driver hack that is currently under review.
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c index 29bd405adbbd..10988639561e 100644 --- a/net/packet/af_packet.c +++ b/net/packet/af_packet.c @@ -1857,15 +1857,35 @@ static int packet_rcv_spkt(struct sk_buff *skb, struct net_device *dev, return 0; } -static void packet_parse_headers(struct sk_buff *skb, struct socket *sock) +static int packet_parse_headers(struct sk_buff *skb, struct socket *sock) { if ((!skb->protocol || skb->protocol == htons(ETH_P_ALL)) && sock->type == SOCK_RAW) { + __be16 ethertype; + skb_reset_mac_header(skb); + + ethertype = eth_hdr(skb)->h_proto; + /* + * If Vlan tag is present in the packet + * move it to skb + */ + if (eth_type_vlan(ethertype)) { + int err; + __be16 vlan_tci; + + err = __skb_vlan_pop(skb, &vlan_tci); + if (unlikely(err)) + return err; + + __vlan_hwaccel_put_tag(skb, ethertype, vlan_tci); + } + skb->protocol = dev_parse_header_protocol(skb); } skb_probe_transport_header(skb); + return 0; } /* @@ -1987,7 +2007,9 @@ static int packet_sendmsg_spkt(struct socket *sock, struct msghdr *msg, if (unlikely(extra_len == 4)) skb->no_fcs = 1; - packet_parse_headers(skb, sock); + err = packet_parse_headers(skb, sock); + if (err) + goto out_unlock; dev_queue_xmit(skb); rcu_read_unlock();
When an application sends with AF_PACKET and places a vlan header on the raw packet; then the AF_PACKET needs to move the tag into the skb so that it gets processed normally through the rest of the transmit path. This is particularly a problem on Hyper-V where the host only allows vlan in the offload info. Cc: xe-linux-external@cisco.com Cc: Sriram Krishnan <srirakr2@cisco.com> Signed-off-by: Sriram Krishnan <srirakr2@cisco.com> --- net/packet/af_packet.c | 26 ++++++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-)