Patchwork Re: [PATCH][RFC] qemu:virtio-net: Use TUNSETTXFILTER for MAC filteringlogin register about

login
register
mail settings
Submitter Dragos Tatulea
Date Jan. 12, 2011, 5:26 p.m.
Message ID <AANLkTi=xKocgwzcUAQbRbJqM7LkSUi+KXr8kHck4-BZb@mail.gmail.com>
Download mbox | patch
Permalink /patch/78595/
State New
Headers show

Comments

Dragos Tatulea - Jan. 12, 2011, 5:26 p.m.
Hi,

Trying to stir up  a year old conversation [1] about mac filtering.
The patch below is Alex Williamson's work updated for the current qemu
taking into account some comments. An extra check for multiple nics on
the same vlan has been added as well. Now, I know it's not ideal but
I'm looking for feedback here.

The original thread contained some ideas that weren't discussed extensively:

* Alex Williamson's patch that makes the nic call the associated tap
vlan client to set the mac filters. This is possible only for 1 nix x
1 tap setup. It's a bit hackish (even more visible in the current
version) but it's straightforward.

* Paul Brook considered that the nic shouldn't know anything about
what else is connected to the vlan. He proposed a model that is more
generic and would allow us to filter stuff from device to vlan and the
other way around.

* Anthony Liguori proposed moving all the mac filtering to the vlan
side. This would be nice, but it would imply removing & rewriting all
the emulated nics. Don't know how acceptable this is since, I suppose,
the emulated behavior will change.

The bottom line here is that neither of these 3 approaches is ok for
what we want to achieve: basically tap & macvtap filtering for virtio.
What do we want? The simple change for virtio tun tx fiiltering or
filtering MACs in both directions taking into account how nics are
connected and if hw filtering is possible? The latter being quite a
change.

[1] - http://thread.gmane.org/gmane.comp.emulators.qemu/37714/focus=37719

-- Dragos Tatulea

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Dragos Tatulea <dragos.tatulea@gmail.com>
---
 {
@@ -333,6 +360,7 @@ static TAPState *net_tap_fd_init(VLANState *vlan,

     nc = qemu_new_net_client(&net_tap_info, vlan, NULL, model, name);

+    nc->info->set_receive_filter = tap_set_receive_filter;
     s = DO_UPCAST(TAPState, nc, nc);

     s->fd = fd;
Michael S. Tsirkin - Jan. 12, 2011, 5:35 p.m.
On Wed, Jan 12, 2011 at 06:26:55PM +0100, Dragos Tatulea wrote:
> Hi,
> 
> Trying to stir up  a year old conversation [1] about mac filtering.
> The patch below is Alex Williamson's work updated for the current qemu
> taking into account some comments. An extra check for multiple nics on
> the same vlan has been added as well. Now, I know it's not ideal but
> I'm looking for feedback here.
> 
> The original thread contained some ideas that weren't discussed extensively:
> 
> * Alex Williamson's patch that makes the nic call the associated tap
> vlan client to set the mac filters. This is possible only for 1 nix x
> 1 tap setup. It's a bit hackish (even more visible in the current
> version) but it's straightforward.
> 
> * Paul Brook considered that the nic shouldn't know anything about
> what else is connected to the vlan. He proposed a model that is more
> generic and would allow us to filter stuff from device to vlan and the
> other way around.
> 
> * Anthony Liguori proposed moving all the mac filtering to the vlan
> side. This would be nice, but it would imply removing & rewriting all
> the emulated nics. Don't know how acceptable this is since, I suppose,
> the emulated behavior will change.
> 
> The bottom line here is that neither of these 3 approaches is ok for
> what we want to achieve: basically tap & macvtap filtering for virtio.
> What do we want? The simple change for virtio tun tx fiiltering or
> filtering MACs in both directions taking into account how nics are
> connected and if hw filtering is possible? The latter being quite a
> change.
> 
> [1] - http://thread.gmane.org/gmane.comp.emulators.qemu/37714/focus=37719
> 
> -- Dragos Tatulea
> 
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> Signed-off-by: Dragos Tatulea <dragos.tatulea@gmail.com>

I tested a similar patch a while ago.  At least with tap connected to a
bridge, I observed the following:
- with recent kernels bridge already learns mac addresses
  and performs mac filtering for unicast and multicast packets
- mac filtering is also not necessary in many (most?) other situations
  e.g. userspace, ip routing in host ...
- the overhead of adding filtering in kernel is not negligeable
  (don't have numbers, but it was noticeable)
- vlan filtering is not supported in kernel - should it be?

So I would suggest a flag to enable/disable mac and vlan
filtering, and disable by default.

	Look up this discussion for suggestions:

	Date: Tue, 9 Mar 2010 15:15:44 +0200                                                                              
	From: "Michael S. Tsirkin" <mst@redhat.com>                                                                       
	To: qemu-devel@nongnu.org, Alex Williamson <alex.williamson@hp.com>,
	Andreas Plesner Jacobsen <apj@mutt.dk>       
	Subject: [PATCH RFC] net: add a flag to disable mac/vlan filtering                                                
	Message-ID: <20100309131544.GA15319@redhat.com>                                                                   


> ---
> diff --git a/hw/virtio-net.c b/hw/virtio-net.c
> index ec1bf8d..0276f3a 100644
> --- a/hw/virtio-net.c
> +++ b/hw/virtio-net.c
> @@ -21,6 +21,8 @@
>  #include "virtio-net.h"
>  #include "vhost_net.h"
> 
> +#include <net/if.h>
> +
>  #define VIRTIO_NET_VM_VERSION    11
> 
>  #define MAC_TABLE_ENTRIES    64
> @@ -49,6 +51,7 @@ typedef struct VirtIONet
>      int mergeable_rx_bufs;
>      uint8_t promisc;
>      uint8_t allmulti;
> +    uint8_t hw_mac_filter;
>      uint8_t alluni;
>      uint8_t nomulti;
>      uint8_t nouni;
> @@ -176,6 +179,30 @@ static void virtio_net_set_link_status(VLANClientState *nc)
>      virtio_net_set_status(&n->vdev, n->vdev.status);
>  }
> 
> +static void virtio_net_set_hw_rx_filter(VirtIONet *n)
> +{
> +    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
> +    uint8_t *buf;
> +    int flags = 0;
> +    VLANState *vlan = n->nic->nc.vlan;
> +
> +    if (n->promisc || vlan_count_nics(vlan) > 1)
> +        /* can't have several nics within the same tap */
> +        flags |= IFF_PROMISC;
> +    if (n->allmulti)
> +        flags |= IFF_ALLMULTI;
> +
> +    buf = qemu_mallocz((n->mac_table.in_use + 2) * ETH_ALEN);
> +
> +    memcpy(&buf[ETH_ALEN*0], n->mac, ETH_ALEN);
> +    memcpy(&buf[ETH_ALEN*1], bcast, ETH_ALEN);
> +    memcpy(&buf[ETH_ALEN*2], n->mac_table.macs, n->mac_table.in_use *
> ETH_ALEN);
> +
> +    n->hw_mac_filter = vlan_set_hw_receive_filter(vlan, flags,
> +                                                  n->mac_table.in_use
> + 2, buf);
> +    qemu_free(buf);
> +}
> +
>  static void virtio_net_reset(VirtIODevice *vdev)
>  {
>      VirtIONet *n = to_virtio_net(vdev);
> @@ -194,6 +221,7 @@ static void virtio_net_reset(VirtIODevice *vdev)
>      n->mac_table.multi_overflow = 0;
>      n->mac_table.uni_overflow = 0;
>      memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
> +    virtio_net_set_hw_rx_filter(n);
>      memset(n->vlans, 0, MAX_VLAN >> 3);
>  }
> 
> @@ -423,11 +451,13 @@ static void virtio_net_handle_ctrl(VirtIODevice
> *vdev, VirtQueue *vq)
>          ctrl.class = ldub_p(elem.out_sg[0].iov_base);
>          ctrl.cmd = ldub_p(elem.out_sg[0].iov_base + sizeof(ctrl.class));
> 
> -        if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE)
> +        if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE) {
>              status = virtio_net_handle_rx_mode(n, ctrl.cmd, &elem);
> -        else if (ctrl.class == VIRTIO_NET_CTRL_MAC)
> +            virtio_net_set_hw_rx_filter(n);
> +        } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
>              status = virtio_net_handle_mac(n, ctrl.cmd, &elem);
> -        else if (ctrl.class == VIRTIO_NET_CTRL_VLAN)
> +            virtio_net_set_hw_rx_filter(n);
> +        } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN)
>              status = virtio_net_handle_vlan_table(n, ctrl.cmd, &elem);
> 
>          stb_p(elem.in_sg[elem.in_num - 1].iov_base, status);
> @@ -547,6 +577,9 @@ static int receive_filter(VirtIONet *n, const
> uint8_t *buf, int size)
>      if (n->promisc)
>          return 1;
> 
> +    if (n->hw_mac_filter)
> +        return 1;
> +
>      if (n->has_vnet_hdr) {
>          ptr += sizeof(struct virtio_net_hdr);
>      }
> @@ -969,6 +1002,9 @@ static int virtio_net_load(QEMUFile *f, void
> *opaque, int version_id)
>          }
>      }
>      n->mac_table.first_multi = i;
> +
> +    virtio_net_set_hw_rx_filter(n);
> +
>      return 0;
>  }
> 
> diff --git a/net.c b/net.c
> index 9ba5be2..8814e8b 100644
> --- a/net.c
> +++ b/net.c
> @@ -492,6 +492,37 @@ static ssize_t
> qemu_vlan_deliver_packet(VLANClientState *sender,
>      return ret;
>  }
> 
> +int vlan_count_nics(VLANState *vlan)
> +{
> +    VLANClientState *vc;
> +    int count = 0;
> +
> +    QTAILQ_FOREACH(vc, &vlan->clients, next) {
> +        if (vc->info->type == NET_CLIENT_TYPE_NIC)
> +            count++;
> +    }
> +
> +    return count;
> +}
> +
> +int vlan_set_hw_receive_filter(VLANState *vlan, int flags,
> +                               int count, uint8_t *buf)
> +{
> +    VLANClientState *vc;
> +
> +    QTAILQ_FOREACH(vc, &vlan->clients, next) {
> +        int ret;
> +
> +        if (!vc->info->set_receive_filter)
> +            continue;
> +
> +        ret = vc->info->set_receive_filter(vc, flags, count, buf);
> +        return (ret == count);
> +    }
> +
> +    return 0;
> +}
> +
>  void qemu_purge_queued_packets(VLANClientState *vc)
>  {
>      NetQueue *queue;
> diff --git a/net.h b/net.h
> index 6ceca50..fe12c27 100644
> --- a/net.h
> +++ b/net.h
> @@ -42,6 +42,7 @@ typedef void (NetPoll)(VLANClientState *, bool enable);
>  typedef int (NetCanReceive)(VLANClientState *);
>  typedef ssize_t (NetReceive)(VLANClientState *, const uint8_t *, size_t);
>  typedef ssize_t (NetReceiveIOV)(VLANClientState *, const struct iovec *, int);
> +typedef int (NetSetReceiveFilter)(VLANClientState *, unsigned int,
> int, uint8_t *);
>  typedef void (NetCleanup) (VLANClientState *);
>  typedef void (LinkStatusChanged)(VLANClientState *);
> 
> @@ -52,6 +53,7 @@ typedef struct NetClientInfo {
>      NetReceive *receive_raw;
>      NetReceiveIOV *receive_iov;
>      NetCanReceive *can_receive;
> +    NetSetReceiveFilter *set_receive_filter;
>      NetCleanup *cleanup;
>      LinkStatusChanged *link_status_changed;
>      NetPoll *poll;
> @@ -99,6 +101,11 @@ NICState *qemu_new_nic(NetClientInfo *info,
>  void qemu_del_vlan_client(VLANClientState *vc);
>  VLANClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
>                                                 const char *client_str);
> +
> +int vlan_count_nics(VLANState *vlan);
> +int vlan_set_hw_receive_filter(VLANState *vlan, int flags,
> +                               int count, uint8_t *buf);
> +
>  typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
>  void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
>  int qemu_can_send_packet(VLANClientState *vc);
> diff --git a/net/tap.c b/net/tap.c
> index eada34a..f3b9b1b 100644
> --- a/net/tap.c
> +++ b/net/tap.c
> @@ -44,6 +44,8 @@
> 
>  #include "hw/vhost_net.h"
> 
> +#include <linux/if_tun.h>
> +
>  /* Maximum GSO packet size (64k) plus plenty of room for
>   * the ethernet and virtio_net headers
>   */
> @@ -115,6 +117,31 @@ static ssize_t tap_write_packet(TAPState *s,
> const struct iovec *iov, int iovcnt
>      return len;
>  }
> 
> +static int tap_set_receive_filter(VLANClientState *nc, unsigned int flags,
> +                                  int count, uint8_t *list)
> +{
> +    TAPState *s = DO_UPCAST(TAPState, nc, nc);
> +    struct tun_filter *filter;
> +    int ret;
> +
> +    if (flags & IFF_PROMISC)
> +        count = 0;
> +
> +    filter = qemu_mallocz(sizeof(*filter) + (count * ETH_ALEN));
> +
> +    memcpy(filter->addr, list, count * ETH_ALEN);
> +    filter->count += count;
> +
> +    if (flags & IFF_ALLMULTI)
> +        filter->flags |= TUN_FLT_ALLMULTI;
> +
> +    ret = ioctl(s->fd, TUNSETTXFILTER, filter);
> +
> +    qemu_free(filter);
> +
> +    return ret;
> +}
> +
>  static ssize_t tap_receive_iov(VLANClientState *nc, const struct iovec *iov,
>                                 int iovcnt)
>  {
> @@ -333,6 +360,7 @@ static TAPState *net_tap_fd_init(VLANState *vlan,
> 
>      nc = qemu_new_net_client(&net_tap_info, vlan, NULL, model, name);
> 
> +    nc->info->set_receive_filter = tap_set_receive_filter;
>      s = DO_UPCAST(TAPState, nc, nc);
> 
>      s->fd = fd;

Patch

diff --git a/hw/virtio-net.c b/hw/virtio-net.c
index ec1bf8d..0276f3a 100644
--- a/hw/virtio-net.c
+++ b/hw/virtio-net.c
@@ -21,6 +21,8 @@ 
 #include "virtio-net.h"
 #include "vhost_net.h"

+#include <net/if.h>
+
 #define VIRTIO_NET_VM_VERSION    11

 #define MAC_TABLE_ENTRIES    64
@@ -49,6 +51,7 @@  typedef struct VirtIONet
     int mergeable_rx_bufs;
     uint8_t promisc;
     uint8_t allmulti;
+    uint8_t hw_mac_filter;
     uint8_t alluni;
     uint8_t nomulti;
     uint8_t nouni;
@@ -176,6 +179,30 @@  static void virtio_net_set_link_status(VLANClientState *nc)
     virtio_net_set_status(&n->vdev, n->vdev.status);
 }

+static void virtio_net_set_hw_rx_filter(VirtIONet *n)
+{
+    static const uint8_t bcast[] = {0xff, 0xff, 0xff, 0xff, 0xff, 0xff};
+    uint8_t *buf;
+    int flags = 0;
+    VLANState *vlan = n->nic->nc.vlan;
+
+    if (n->promisc || vlan_count_nics(vlan) > 1)
+        /* can't have several nics within the same tap */
+        flags |= IFF_PROMISC;
+    if (n->allmulti)
+        flags |= IFF_ALLMULTI;
+
+    buf = qemu_mallocz((n->mac_table.in_use + 2) * ETH_ALEN);
+
+    memcpy(&buf[ETH_ALEN*0], n->mac, ETH_ALEN);
+    memcpy(&buf[ETH_ALEN*1], bcast, ETH_ALEN);
+    memcpy(&buf[ETH_ALEN*2], n->mac_table.macs, n->mac_table.in_use *
ETH_ALEN);
+
+    n->hw_mac_filter = vlan_set_hw_receive_filter(vlan, flags,
+                                                  n->mac_table.in_use
+ 2, buf);
+    qemu_free(buf);
+}
+
 static void virtio_net_reset(VirtIODevice *vdev)
 {
     VirtIONet *n = to_virtio_net(vdev);
@@ -194,6 +221,7 @@  static void virtio_net_reset(VirtIODevice *vdev)
     n->mac_table.multi_overflow = 0;
     n->mac_table.uni_overflow = 0;
     memset(n->mac_table.macs, 0, MAC_TABLE_ENTRIES * ETH_ALEN);
+    virtio_net_set_hw_rx_filter(n);
     memset(n->vlans, 0, MAX_VLAN >> 3);
 }

@@ -423,11 +451,13 @@  static void virtio_net_handle_ctrl(VirtIODevice
*vdev, VirtQueue *vq)
         ctrl.class = ldub_p(elem.out_sg[0].iov_base);
         ctrl.cmd = ldub_p(elem.out_sg[0].iov_base + sizeof(ctrl.class));

-        if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE)
+        if (ctrl.class == VIRTIO_NET_CTRL_RX_MODE) {
             status = virtio_net_handle_rx_mode(n, ctrl.cmd, &elem);
-        else if (ctrl.class == VIRTIO_NET_CTRL_MAC)
+            virtio_net_set_hw_rx_filter(n);
+        } else if (ctrl.class == VIRTIO_NET_CTRL_MAC) {
             status = virtio_net_handle_mac(n, ctrl.cmd, &elem);
-        else if (ctrl.class == VIRTIO_NET_CTRL_VLAN)
+            virtio_net_set_hw_rx_filter(n);
+        } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN)
             status = virtio_net_handle_vlan_table(n, ctrl.cmd, &elem);

         stb_p(elem.in_sg[elem.in_num - 1].iov_base, status);
@@ -547,6 +577,9 @@  static int receive_filter(VirtIONet *n, const
uint8_t *buf, int size)
     if (n->promisc)
         return 1;

+    if (n->hw_mac_filter)
+        return 1;
+
     if (n->has_vnet_hdr) {
         ptr += sizeof(struct virtio_net_hdr);
     }
@@ -969,6 +1002,9 @@  static int virtio_net_load(QEMUFile *f, void
*opaque, int version_id)
         }
     }
     n->mac_table.first_multi = i;
+
+    virtio_net_set_hw_rx_filter(n);
+
     return 0;
 }

diff --git a/net.c b/net.c
index 9ba5be2..8814e8b 100644
--- a/net.c
+++ b/net.c
@@ -492,6 +492,37 @@  static ssize_t
qemu_vlan_deliver_packet(VLANClientState *sender,
     return ret;
 }

+int vlan_count_nics(VLANState *vlan)
+{
+    VLANClientState *vc;
+    int count = 0;
+
+    QTAILQ_FOREACH(vc, &vlan->clients, next) {
+        if (vc->info->type == NET_CLIENT_TYPE_NIC)
+            count++;
+    }
+
+    return count;
+}
+
+int vlan_set_hw_receive_filter(VLANState *vlan, int flags,
+                               int count, uint8_t *buf)
+{
+    VLANClientState *vc;
+
+    QTAILQ_FOREACH(vc, &vlan->clients, next) {
+        int ret;
+
+        if (!vc->info->set_receive_filter)
+            continue;
+
+        ret = vc->info->set_receive_filter(vc, flags, count, buf);
+        return (ret == count);
+    }
+
+    return 0;
+}
+
 void qemu_purge_queued_packets(VLANClientState *vc)
 {
     NetQueue *queue;
diff --git a/net.h b/net.h
index 6ceca50..fe12c27 100644
--- a/net.h
+++ b/net.h
@@ -42,6 +42,7 @@  typedef void (NetPoll)(VLANClientState *, bool enable);
 typedef int (NetCanReceive)(VLANClientState *);
 typedef ssize_t (NetReceive)(VLANClientState *, const uint8_t *, size_t);
 typedef ssize_t (NetReceiveIOV)(VLANClientState *, const struct iovec *, int);
+typedef int (NetSetReceiveFilter)(VLANClientState *, unsigned int,
int, uint8_t *);
 typedef void (NetCleanup) (VLANClientState *);
 typedef void (LinkStatusChanged)(VLANClientState *);

@@ -52,6 +53,7 @@  typedef struct NetClientInfo {
     NetReceive *receive_raw;
     NetReceiveIOV *receive_iov;
     NetCanReceive *can_receive;
+    NetSetReceiveFilter *set_receive_filter;
     NetCleanup *cleanup;
     LinkStatusChanged *link_status_changed;
     NetPoll *poll;
@@ -99,6 +101,11 @@  NICState *qemu_new_nic(NetClientInfo *info,
 void qemu_del_vlan_client(VLANClientState *vc);
 VLANClientState *qemu_find_vlan_client_by_name(Monitor *mon, int vlan_id,
                                                const char *client_str);
+
+int vlan_count_nics(VLANState *vlan);
+int vlan_set_hw_receive_filter(VLANState *vlan, int flags,
+                               int count, uint8_t *buf);
+
 typedef void (*qemu_nic_foreach)(NICState *nic, void *opaque);
 void qemu_foreach_nic(qemu_nic_foreach func, void *opaque);
 int qemu_can_send_packet(VLANClientState *vc);
diff --git a/net/tap.c b/net/tap.c
index eada34a..f3b9b1b 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -44,6 +44,8 @@ 

 #include "hw/vhost_net.h"

+#include <linux/if_tun.h>
+
 /* Maximum GSO packet size (64k) plus plenty of room for
  * the ethernet and virtio_net headers
  */
@@ -115,6 +117,31 @@  static ssize_t tap_write_packet(TAPState *s,
const struct iovec *iov, int iovcnt
     return len;
 }

+static int tap_set_receive_filter(VLANClientState *nc, unsigned int flags,
+                                  int count, uint8_t *list)
+{
+    TAPState *s = DO_UPCAST(TAPState, nc, nc);
+    struct tun_filter *filter;
+    int ret;
+
+    if (flags & IFF_PROMISC)
+        count = 0;
+
+    filter = qemu_mallocz(sizeof(*filter) + (count * ETH_ALEN));
+
+    memcpy(filter->addr, list, count * ETH_ALEN);
+    filter->count += count;
+
+    if (flags & IFF_ALLMULTI)
+        filter->flags |= TUN_FLT_ALLMULTI;
+
+    ret = ioctl(s->fd, TUNSETTXFILTER, filter);
+
+    qemu_free(filter);
+
+    return ret;
+}
+
 static ssize_t tap_receive_iov(VLANClientState *nc, const struct iovec *iov,
                                int iovcnt)