diff mbox series

[RFC,v2,1/9] net: Pad short frames to minimum size before send from SLiRP/TAP

Message ID 1614763306-18026-2-git-send-email-bmeng.cn@gmail.com
State Superseded
Headers show
Series net: Handle short frames for SLiRP/TAP interfaces | expand

Commit Message

Bin Meng March 3, 2021, 9:21 a.m. UTC
From: Bin Meng <bin.meng@windriver.com>

The minimum Ethernet frame length is 60 bytes. For short frames with
smaller length like ARP packets (only 42 bytes), on a real world NIC
it can choose either padding its length to the minimum required 60
bytes, or sending it out directly to the wire. Such behavior can be
hardcoded or controled by a register bit. Similarly on the receive
path, NICs can choose either dropping such short frames directly or
handing them over to software to handle.

On the other hand, for the network backends SLiRP/TAP, they don't
expose a way to control the short frame behavior. As of today they
just send/receive data from/to the other end connected to them,
which means any sized packet is acceptable. So they can send and
receive short frames without any problem. It is observed that ARP
packets sent from SLiRP/TAP are 42 bytes, and SLiRP/TAP just send
these ARP packets to the other end which might be a NIC model that
does not allow short frames to pass through.

To provide better compatibility, for packets sent from SLiRP/TAP, we
change to pad short frames before sending it out to the other end.
This ensures SLiRP/TAP as an Ethernet sender do not violate the spec.
But with this change, the behavior of dropping short frames in the
NIC model cannot be emulated because it always receives a packet that
is spec complaint. The capability of sending short frames from NIC
models are still supported and short frames can still pass through
SLiRP/TAP interfaces.

This commit should be able to fix the issue as reported with some
NIC models before, that ARP requests get dropped, preventing the
guest from becoming visible on the network. It was workarounded in
these NIC models on the receive path, that when a short frame is
received, it is padded up to 60 bytes.

The following 2 commits seem to be the one to workaround this issue
in e1000 and vmxenet3 before, and should probably be reverted.

  commit 78aeb23eded2 ("e1000: Pad short frames to minimum size (60 bytes)")
  commit 40a87c6c9b11 ("vmxnet3: Pad short frames to minimum size (60 bytes)")

Signed-off-by: Bin Meng <bin.meng@windriver.com>

---

Changes in v2:
- only pad short frames for SLiRP/TAP interfaces

 include/net/eth.h |  1 +
 net/net.c         | 12 ++++++++++++
 2 files changed, 13 insertions(+)

Comments

Philippe Mathieu-Daudé March 3, 2021, 10:15 a.m. UTC | #1
On 3/3/21 10:21 AM, Bin Meng wrote:
> From: Bin Meng <bin.meng@windriver.com>
> 
> The minimum Ethernet frame length is 60 bytes. For short frames with
> smaller length like ARP packets (only 42 bytes), on a real world NIC
> it can choose either padding its length to the minimum required 60
> bytes, or sending it out directly to the wire. Such behavior can be
> hardcoded or controled by a register bit. Similarly on the receive
> path, NICs can choose either dropping such short frames directly or
> handing them over to software to handle.
> 
> On the other hand, for the network backends SLiRP/TAP, they don't
> expose a way to control the short frame behavior. As of today they
> just send/receive data from/to the other end connected to them,
> which means any sized packet is acceptable. So they can send and
> receive short frames without any problem. It is observed that ARP
> packets sent from SLiRP/TAP are 42 bytes, and SLiRP/TAP just send
> these ARP packets to the other end which might be a NIC model that
> does not allow short frames to pass through.
> 
> To provide better compatibility, for packets sent from SLiRP/TAP, we
> change to pad short frames before sending it out to the other end.
> This ensures SLiRP/TAP as an Ethernet sender do not violate the spec.
> But with this change, the behavior of dropping short frames in the
> NIC model cannot be emulated because it always receives a packet that
> is spec complaint. The capability of sending short frames from NIC
> models are still supported and short frames can still pass through
> SLiRP/TAP interfaces.
> 
> This commit should be able to fix the issue as reported with some
> NIC models before, that ARP requests get dropped, preventing the
> guest from becoming visible on the network. It was workarounded in
> these NIC models on the receive path, that when a short frame is
> received, it is padded up to 60 bytes.
> 
> The following 2 commits seem to be the one to workaround this issue
> in e1000 and vmxenet3 before, and should probably be reverted.
> 
>   commit 78aeb23eded2 ("e1000: Pad short frames to minimum size (60 bytes)")
>   commit 40a87c6c9b11 ("vmxnet3: Pad short frames to minimum size (60 bytes)")
> 
> Signed-off-by: Bin Meng <bin.meng@windriver.com>
> 
> ---
> 
> Changes in v2:
> - only pad short frames for SLiRP/TAP interfaces
> 
>  include/net/eth.h |  1 +
>  net/net.c         | 12 ++++++++++++
>  2 files changed, 13 insertions(+)
> 
> diff --git a/include/net/eth.h b/include/net/eth.h
> index 0671be6..7c825ec 100644
> --- a/include/net/eth.h
> +++ b/include/net/eth.h
> @@ -31,6 +31,7 @@
>  
>  #define ETH_ALEN 6
>  #define ETH_HLEN 14
> +#define ETH_ZLEN 60     /* Min. octets in frame sans FCS */
>  
>  struct eth_header {
>      uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
> diff --git a/net/net.c b/net/net.c
> index 32d71c1..27c3b25 100644
> --- a/net/net.c
> +++ b/net/net.c
> @@ -638,6 +638,7 @@ static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
>                                                   NetPacketSent *sent_cb)
>  {
>      NetQueue *queue;
> +    uint8_t min_buf[ETH_ZLEN];
>      int ret;
>  
>  #ifdef DEBUG_NET
> @@ -649,6 +650,17 @@ static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
>          return size;
>      }
>  
> +    /* Pad to minimum Ethernet frame length for SLiRP and TAP */
> +    if (sender->info->type == NET_CLIENT_DRIVER_USER ||
> +        sender->info->type == NET_CLIENT_DRIVER_TAP) {
> +        if (size < ETH_ZLEN) {
> +            memcpy(min_buf, buf, size);
> +            memset(&min_buf[size], 0, ETH_ZLEN - size);
> +            buf = min_buf;
> +            size = ETH_ZLEN;
> +        }

We can have zero-copy by using a static zeroed buf and rewrite
this function to call the _iov() equivalents with a pair of
struct iovec.

> +    }
> +
>      /* Let filters handle the packet first */
>      ret = filter_receive(sender, NET_FILTER_DIRECTION_TX,
>                           sender, flags, buf, size, sent_cb);
>
Bin Meng March 3, 2021, 10:31 a.m. UTC | #2
Hi Philippe,

On Wed, Mar 3, 2021 at 6:15 PM Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> On 3/3/21 10:21 AM, Bin Meng wrote:
> > From: Bin Meng <bin.meng@windriver.com>
> >
> > The minimum Ethernet frame length is 60 bytes. For short frames with
> > smaller length like ARP packets (only 42 bytes), on a real world NIC
> > it can choose either padding its length to the minimum required 60
> > bytes, or sending it out directly to the wire. Such behavior can be
> > hardcoded or controled by a register bit. Similarly on the receive
> > path, NICs can choose either dropping such short frames directly or
> > handing them over to software to handle.
> >
> > On the other hand, for the network backends SLiRP/TAP, they don't
> > expose a way to control the short frame behavior. As of today they
> > just send/receive data from/to the other end connected to them,
> > which means any sized packet is acceptable. So they can send and
> > receive short frames without any problem. It is observed that ARP
> > packets sent from SLiRP/TAP are 42 bytes, and SLiRP/TAP just send
> > these ARP packets to the other end which might be a NIC model that
> > does not allow short frames to pass through.
> >
> > To provide better compatibility, for packets sent from SLiRP/TAP, we
> > change to pad short frames before sending it out to the other end.
> > This ensures SLiRP/TAP as an Ethernet sender do not violate the spec.
> > But with this change, the behavior of dropping short frames in the
> > NIC model cannot be emulated because it always receives a packet that
> > is spec complaint. The capability of sending short frames from NIC
> > models are still supported and short frames can still pass through
> > SLiRP/TAP interfaces.
> >
> > This commit should be able to fix the issue as reported with some
> > NIC models before, that ARP requests get dropped, preventing the
> > guest from becoming visible on the network. It was workarounded in
> > these NIC models on the receive path, that when a short frame is
> > received, it is padded up to 60 bytes.
> >
> > The following 2 commits seem to be the one to workaround this issue
> > in e1000 and vmxenet3 before, and should probably be reverted.
> >
> >   commit 78aeb23eded2 ("e1000: Pad short frames to minimum size (60 bytes)")
> >   commit 40a87c6c9b11 ("vmxnet3: Pad short frames to minimum size (60 bytes)")
> >
> > Signed-off-by: Bin Meng <bin.meng@windriver.com>
> >
> > ---
> >
> > Changes in v2:
> > - only pad short frames for SLiRP/TAP interfaces
> >
> >  include/net/eth.h |  1 +
> >  net/net.c         | 12 ++++++++++++
> >  2 files changed, 13 insertions(+)
> >
> > diff --git a/include/net/eth.h b/include/net/eth.h
> > index 0671be6..7c825ec 100644
> > --- a/include/net/eth.h
> > +++ b/include/net/eth.h
> > @@ -31,6 +31,7 @@
> >
> >  #define ETH_ALEN 6
> >  #define ETH_HLEN 14
> > +#define ETH_ZLEN 60     /* Min. octets in frame sans FCS */
> >
> >  struct eth_header {
> >      uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
> > diff --git a/net/net.c b/net/net.c
> > index 32d71c1..27c3b25 100644
> > --- a/net/net.c
> > +++ b/net/net.c
> > @@ -638,6 +638,7 @@ static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
> >                                                   NetPacketSent *sent_cb)
> >  {
> >      NetQueue *queue;
> > +    uint8_t min_buf[ETH_ZLEN];
> >      int ret;
> >
> >  #ifdef DEBUG_NET
> > @@ -649,6 +650,17 @@ static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
> >          return size;
> >      }
> >
> > +    /* Pad to minimum Ethernet frame length for SLiRP and TAP */
> > +    if (sender->info->type == NET_CLIENT_DRIVER_USER ||
> > +        sender->info->type == NET_CLIENT_DRIVER_TAP) {
> > +        if (size < ETH_ZLEN) {
> > +            memcpy(min_buf, buf, size);
> > +            memset(&min_buf[size], 0, ETH_ZLEN - size);
> > +            buf = min_buf;
> > +            size = ETH_ZLEN;
> > +        }
>
> We can have zero-copy by using a static zeroed buf and rewrite
> this function to call the _iov() equivalents with a pair of
> struct iovec.

Do you have an example of doing zero-copy using _iov() equivalents?

Also I am not sure whether it's worth doing zero-copy given this is
only a 60 bytes copy.

>
> > +    }
> > +
> >      /* Let filters handle the packet first */
> >      ret = filter_receive(sender, NET_FILTER_DIRECTION_TX,
> >                           sender, flags, buf, size, sent_cb);
> >
>

Regards,
Bin
diff mbox series

Patch

diff --git a/include/net/eth.h b/include/net/eth.h
index 0671be6..7c825ec 100644
--- a/include/net/eth.h
+++ b/include/net/eth.h
@@ -31,6 +31,7 @@ 
 
 #define ETH_ALEN 6
 #define ETH_HLEN 14
+#define ETH_ZLEN 60     /* Min. octets in frame sans FCS */
 
 struct eth_header {
     uint8_t  h_dest[ETH_ALEN];   /* destination eth addr */
diff --git a/net/net.c b/net/net.c
index 32d71c1..27c3b25 100644
--- a/net/net.c
+++ b/net/net.c
@@ -638,6 +638,7 @@  static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
                                                  NetPacketSent *sent_cb)
 {
     NetQueue *queue;
+    uint8_t min_buf[ETH_ZLEN];
     int ret;
 
 #ifdef DEBUG_NET
@@ -649,6 +650,17 @@  static ssize_t qemu_send_packet_async_with_flags(NetClientState *sender,
         return size;
     }
 
+    /* Pad to minimum Ethernet frame length for SLiRP and TAP */
+    if (sender->info->type == NET_CLIENT_DRIVER_USER ||
+        sender->info->type == NET_CLIENT_DRIVER_TAP) {
+        if (size < ETH_ZLEN) {
+            memcpy(min_buf, buf, size);
+            memset(&min_buf[size], 0, ETH_ZLEN - size);
+            buf = min_buf;
+            size = ETH_ZLEN;
+        }
+    }
+
     /* Let filters handle the packet first */
     ret = filter_receive(sender, NET_FILTER_DIRECTION_TX,
                          sender, flags, buf, size, sent_cb);