[V2,3/3] virtio-net: announce self by guest

Message ID 1400565704-11592-4-git-send-email-jasowang@redhat.com
State New

Commit Message

Jason Wang May 20, 2014, 6:01 a.m. UTC
It's hard to track all MAC addresses and their configurations (e.g.
VLAN or IPv6) in QEMU. Without this information, it's impossible to
build proper gratuitous ARP (GARP) packets after migration. The only
workable solution is to let the guest, which knows all of its
configuration, do this.

So, this patch introduces a new read-only config status bit for
virtio-net, VIRTIO_NET_S_ANNOUNCE, which is used to notify the guest,
through a config update interrupt, that it should announce the presence
of its link. When the guest has done the announcement, it should ack
the notification through the VIRTIO_NET_CTRL_ANNOUNCE_ACK command. This
feature is negotiated by a new feature bit, VIRTIO_NET_F_GUEST_ANNOUNCE
(which is already supported by Linux guests).

During load, a counter of announcement rounds is set so that, once the
VM is running, the device can trigger rounds of config interrupts to
notify the guest to build and send the correct GARPs.

Cc: Liuyongan <liuyongan@huawei.com>
Cc: Amos Kong <akong@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 hw/net/virtio-net.c            |   42 ++++++++++++++++++++++++++++++++++++++++
 include/hw/i386/pc.h           |    5 ++++
 include/hw/virtio/virtio-net.h |   17 ++++++++++++++++
 3 files changed, 64 insertions(+), 0 deletions(-)
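
For reviewers who want to trace the device-side flow without reading
the whole diff, here is a minimal standalone model of it: load arms the
timer, each timer tick sets the announce bit, and each guest ack clears
the bit and re-arms the timer while rounds remain. This is only a
sketch, not QEMU code. The names announce_model, MODEL_* and model_*()
are hypothetical, and the round count (5) and the 50/150/250 ms delay
schedule are assumptions, since neither SELF_ANNOUNCE_ROUNDS nor
self_announce_delay() is defined in this patch.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_S_LINK_UP   1   /* mirrors VIRTIO_NET_S_LINK_UP */
#define MODEL_S_ANNOUNCE  2   /* mirrors VIRTIO_NET_S_ANNOUNCE */
#define MODEL_ROUNDS      5   /* ASSUMPTION: stand-in for SELF_ANNOUNCE_ROUNDS */

/* Hypothetical model of the device state touched by this patch. */
struct announce_model {
    uint16_t status;        /* config-space status bits */
    int counter;            /* announce rounds still to be triggered */
    int timer_armed;        /* 1 while a timer tick is pending */
    int64_t next_delay_ms;  /* delay used when the timer was last armed */
};

/* ASSUMPTION: stand-in for self_announce_delay(); schedule is illustrative. */
static int64_t model_delay_ms(int round)
{
    assert(round > 0 && round < MODEL_ROUNDS);
    return 50 + (int64_t)(MODEL_ROUNDS - round - 1) * 100;
}

/* Load path: start a full set of rounds and arm the timer immediately. */
static void model_load(struct announce_model *m)
{
    m->counter = MODEL_ROUNDS;
    m->timer_armed = 1;
    m->next_delay_ms = 0;
}

/* Timer callback: consume one round and raise the announce bit.
 * The real device would also send a config-change interrupt here. */
static void model_timer_fire(struct announce_model *m)
{
    m->timer_armed = 0;
    m->counter--;
    m->status |= MODEL_S_ANNOUNCE;
}

/* Ack command: clear the bit and re-arm the timer if rounds remain.
 * Returns 0 on success, -1 if no announcement was pending. */
static int model_ack(struct announce_model *m)
{
    if (!(m->status & MODEL_S_ANNOUNCE)) {
        return -1;
    }
    m->status &= ~MODEL_S_ANNOUNCE;
    if (m->counter) {
        m->timer_armed = 1;
        m->next_delay_ms = model_delay_ms(m->counter);
    }
    return 0;
}

/* Drive the model with a guest that acks every round; returns the
 * number of completed rounds. */
static int model_run(struct announce_model *m)
{
    int rounds = 0;

    model_load(m);
    while (m->timer_armed) {
        model_timer_fire(m);
        if (model_ack(m) != 0) {
            break;
        }
        rounds++;
    }
    return rounds;
}
```

Calling model_run() on a model whose status is MODEL_S_LINK_UP walks
through every round and leaves the announce bit clear. A guest that
never acks would stall the sequence after the first tick, because in
this flow only the ack path re-arms the timer.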

Comments

Amit Shah June 10, 2014, 6:20 a.m. UTC | #1
On (Tue) 20 May 2014 [14:01:44], Jason Wang wrote:
> It's hard to track all mac addresses and their configurations (e.g
> vlan or ipv6) in qemu. Without this information, it's impossible to
> build proper garp packet after migration. The only possible solution
> to this is let guest (who knows all configurations) to do this.
> 
> So, this patch introduces a new readonly config status bit of virtio-net,
> VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
> presence of its link through config update interrupt.When guest has
> done the announcement, it should ack the notification through
> VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
> feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
> Linux guest).
> 
> During load, a counter of announcing rounds is set so that after the vm is
> running it can trigger rounds of config interrupts to notify the guest to build
> and send the correct garps.

Live migration is supposed to be transparent to guests.

Doing things this way makes the guest involved in live migration.
It's not desirable.  For networking, this may well not be possible.
Are there any ways of doing this without involving the guest that have
been considered?


		Amit
Michael S. Tsirkin June 10, 2014, 10:10 a.m. UTC | #2
On Tue, Jun 10, 2014 at 11:50:33AM +0530, Amit Shah wrote:
> On (Tue) 20 May 2014 [14:01:44], Jason Wang wrote:
> > It's hard to track all mac addresses and their configurations (e.g
> > vlan or ipv6) in qemu. Without this information, it's impossible to
> > build proper garp packet after migration. The only possible solution
> > to this is let guest (who knows all configurations) to do this.
> > 
> > So, this patch introduces a new readonly config status bit of virtio-net,
> > VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
> > presence of its link through config update interrupt.When guest has
> > done the announcement, it should ack the notification through
> > VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
> > feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
> > Linux guest).
> > 
> > During load, a counter of announcing rounds is set so that after the vm is
> > running it can trigger rounds of config interrupts to notify the guest to build
> > and send the correct garps.
> 
> Live migration is supposed to be transparent to guests.
> 
> Doing things this way makes the guest involved in live migration.
> It's not desirable.

I'm not sure there's a problem.
As long as guest doesn't use networking, it does not
need to be involved. If guest does want to use networking,
it needs to be involved, but then it's accessing the
device anyway.

> For networking, this may well be not possible.
> Are there any ways of doing this w/o involving the guest that have
> been considered?
> 
> 
> 		Amit

Since we don't know guest addresses, this looks like the only way to me.
Jason Wang June 11, 2014, 2:50 a.m. UTC | #3
On 06/10/2014 06:10 PM, Michael S. Tsirkin wrote:
> On Tue, Jun 10, 2014 at 11:50:33AM +0530, Amit Shah wrote:
>> On (Tue) 20 May 2014 [14:01:44], Jason Wang wrote:
>>> It's hard to track all mac addresses and their configurations (e.g
>>> vlan or ipv6) in qemu. Without this information, it's impossible to
>>> build proper garp packet after migration. The only possible solution
>>> to this is let guest (who knows all configurations) to do this.
>>>
>>> So, this patch introduces a new readonly config status bit of virtio-net,
>>> VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
>>> presence of its link through config update interrupt.When guest has
>>> done the announcement, it should ack the notification through
>>> VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
>>> feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
>>> Linux guest).
>>>
>>> During load, a counter of announcing rounds is set so that after the vm is
>>> running it can trigger rounds of config interrupts to notify the guest to build
>>> and send the correct garps.
>> Live migration is supposed to be transparent to guests.
>>
>> Doing things this way makes the guest involved in live migration.
>> It's not desirable.
> I'm not sure there's a problem.
> As long as guest doesn't use networking, it does not
> need to be involved. If guest does want to use networking,
> it needs to be involved, but then it's accessing the
> device anyway.
>
>> For networking, this may well be not possible.
>> Are there any ways of doing this w/o involving the guest that have
>> been considered?
>>
>>
>> 		Amit
> Since we don't know guest addresses, this looks like the only way to me.
>

Yes, and this method is also used by Xen and Hyper-V.
Amit Shah June 13, 2014, 12:35 p.m. UTC | #4
On (Wed) 11 Jun 2014 [10:50:04], Jason Wang wrote:
> On 06/10/2014 06:10 PM, Michael S. Tsirkin wrote:
> > On Tue, Jun 10, 2014 at 11:50:33AM +0530, Amit Shah wrote:
> >> On (Tue) 20 May 2014 [14:01:44], Jason Wang wrote:
> >>> It's hard to track all mac addresses and their configurations (e.g
> >>> vlan or ipv6) in qemu. Without this information, it's impossible to
> >>> build proper garp packet after migration. The only possible solution
> >>> to this is let guest (who knows all configurations) to do this.
> >>>
> >>> So, this patch introduces a new readonly config status bit of virtio-net,
> >>> VIRTIO_NET_S_ANNOUNCE which is used to notify guest to announce
> >>> presence of its link through config update interrupt.When guest has
> >>> done the announcement, it should ack the notification through
> >>> VIRTIO_NET_CTRL_ANNOUNCE_ACK cmd. This feature is negotiated by a new
> >>> feature bit VIRTIO_NET_F_ANNOUNCE (which has already been supported by
> >>> Linux guest).
> >>>
> >>> During load, a counter of announcing rounds is set so that after the vm is
> >>> running it can trigger rounds of config interrupts to notify the guest to build
> >>> and send the correct garps.
> >> Live migration is supposed to be transparent to guests.
> >>
> >> Doing things this way makes the guest involved in live migration.
> >> It's not desirable.
> > I'm not sure there's a problem.
> > As long as guest doesn't use networking, it does not
> > need to be involved. If guest does want to use networking,
> > it needs to be involved, but then it's accessing the
> > device anyway.
> >
> >> For networking, this may well be not possible.
> >> Are there any ways of doing this w/o involving the guest that have
> >> been considered?
> >>
> > Since we don't know guest addresses, this looks like the only way to me.

I was afraid of that.

> Yes and this method were also used by Xen and HyperV.

Well there's precedent, but that didn't stop us from trying for
something better in the past ;-)

		Amit

Patch

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 940a7cf..0ee0c5c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -99,6 +99,16 @@  static bool virtio_net_started(VirtIONet *n, uint8_t status)
         (n->status & VIRTIO_NET_S_LINK_UP) && vdev->vm_running;
 }
 
+static void virtio_net_announce_timer(void *opaque)
+{
+    VirtIONet *n = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(n);
+
+    n->announce_counter--;
+    n->status |= VIRTIO_NET_S_ANNOUNCE;
+    virtio_notify_config(vdev);
+}
+
 static void virtio_net_vhost_status(VirtIONet *n, uint8_t status)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(n);
@@ -322,6 +332,9 @@  static void virtio_net_reset(VirtIODevice *vdev)
     n->nobcast = 0;
     /* multiqueue is disabled by default */
     n->curr_queues = 1;
+    timer_del(n->announce_timer);
+    n->announce_counter = 0;
+    n->status &= ~VIRTIO_NET_S_ANNOUNCE;
 
     /* Flush any MAC and VLAN filter table state */
     n->mac_table.in_use = 0;
@@ -731,6 +744,23 @@  static int virtio_net_handle_vlan_table(VirtIONet *n, uint8_t cmd,
     return VIRTIO_NET_OK;
 }
 
+static int virtio_net_handle_announce(VirtIONet *n, uint8_t cmd,
+                                      struct iovec *iov, unsigned int iov_cnt)
+{
+    if (cmd == VIRTIO_NET_CTRL_ANNOUNCE_ACK &&
+        (n->status & VIRTIO_NET_S_ANNOUNCE)) {
+        n->status &= ~VIRTIO_NET_S_ANNOUNCE;
+        if (n->announce_counter) {
+            timer_mod(n->announce_timer,
+                      qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
+                      self_announce_delay(n->announce_counter));
+        }
+        return VIRTIO_NET_OK;
+    } else {
+        return VIRTIO_NET_ERR;
+    }
+}
+
 static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
                                 struct iovec *iov, unsigned int iov_cnt)
 {
@@ -794,6 +824,8 @@  static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
             status = virtio_net_handle_mac(n, ctrl.cmd, iov, iov_cnt);
         } else if (ctrl.class == VIRTIO_NET_CTRL_VLAN) {
             status = virtio_net_handle_vlan_table(n, ctrl.cmd, iov, iov_cnt);
+        } else if (ctrl.class == VIRTIO_NET_CTRL_ANNOUNCE) {
+            status = virtio_net_handle_announce(n, ctrl.cmd, iov, iov_cnt);
         } else if (ctrl.class == VIRTIO_NET_CTRL_MQ) {
             status = virtio_net_handle_mq(n, ctrl.cmd, iov, iov_cnt);
         } else if (ctrl.class == VIRTIO_NET_CTRL_GUEST_OFFLOADS) {
@@ -1451,6 +1483,12 @@  static int virtio_net_load(QEMUFile *f, void *opaque, int version_id)
         qemu_get_subqueue(n->nic, i)->link_down = link_down;
     }
 
+    if (vdev->guest_features & (0x1 << VIRTIO_NET_F_GUEST_ANNOUNCE) &&
+        vdev->guest_features & (0x1 << VIRTIO_NET_F_CTRL_VQ)) {
+        n->announce_counter = SELF_ANNOUNCE_ROUNDS;
+        timer_mod(n->announce_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL));
+    }
+
     return 0;
 }
 
@@ -1562,6 +1600,8 @@  static void virtio_net_device_realize(DeviceState *dev, Error **errp)
     qemu_macaddr_default_if_unset(&n->nic_conf.macaddr);
     memcpy(&n->mac[0], &n->nic_conf.macaddr, sizeof(n->mac));
     n->status = VIRTIO_NET_S_LINK_UP;
+    n->announce_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL,
+                                     virtio_net_announce_timer, n);
 
     if (n->netclient_type) {
         /*
@@ -1642,6 +1682,8 @@  static void virtio_net_device_unrealize(DeviceState *dev, Error **errp)
         }
     }
 
+    timer_del(n->announce_timer);
+    timer_free(n->announce_timer);
     g_free(n->vqs);
     qemu_del_nic(n->nic);
     virtio_cleanup(vdev);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 32a7687..f93b427 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -271,6 +271,11 @@  bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
             .driver   = "apic",\
             .property = "version",\
             .value    = stringify(0x11),\
+        },\
+        {\
+            .driver   = "virtio-net-pci",\
+            .property = "guest_announce",\
+            .value    = "off",\
         }
 
 #define PC_COMPAT_1_7 \
diff --git a/include/hw/virtio/virtio-net.h b/include/hw/virtio/virtio-net.h
index 4b32440..f7fccc0 100644
--- a/include/hw/virtio/virtio-net.h
+++ b/include/hw/virtio/virtio-net.h
@@ -49,12 +49,14 @@ 
 #define VIRTIO_NET_F_CTRL_RX    18      /* Control channel RX mode support */
 #define VIRTIO_NET_F_CTRL_VLAN  19      /* Control channel VLAN filtering */
 #define VIRTIO_NET_F_CTRL_RX_EXTRA 20   /* Extra RX mode control support */
+#define VIRTIO_NET_F_GUEST_ANNOUNCE 21  /* Guest can announce itself */
 #define VIRTIO_NET_F_MQ         22      /* Device supports Receive Flow
                                          * Steering */
 
 #define VIRTIO_NET_F_CTRL_MAC_ADDR   23 /* Set MAC address */
 
 #define VIRTIO_NET_S_LINK_UP    1       /* Link is up */
+#define VIRTIO_NET_S_ANNOUNCE   2       /* Announcement is needed */
 
 #define TX_TIMER_INTERVAL 150000 /* 150 us */
 
@@ -193,6 +195,8 @@  typedef struct VirtIONet {
     char *netclient_name;
     char *netclient_type;
     uint64_t curr_guest_offloads;
+    QEMUTimer *announce_timer;
+    int announce_counter;
 } VirtIONet;
 
 #define VIRTIO_NET_CTRL_MAC    1
@@ -213,6 +217,18 @@  typedef struct VirtIONet {
  #define VIRTIO_NET_CTRL_VLAN_DEL             1
 
 /*
+ * Control link announce acknowledgement
+ *
+ * VIRTIO_NET_S_ANNOUNCE bit in the status field requests link announcement from
+ * guest driver. The driver is notified by config space change interrupt.  The
+ * command VIRTIO_NET_CTRL_ANNOUNCE_ACK is used to indicate that the driver has
+ * received the notification. It makes the device clear the bit
+ * VIRTIO_NET_S_ANNOUNCE in the status field.
+ */
+#define VIRTIO_NET_CTRL_ANNOUNCE       3
+ #define VIRTIO_NET_CTRL_ANNOUNCE_ACK         0
+
+/*
  * Control Multiqueue
  *
  * The command VIRTIO_NET_CTRL_MQ_VQ_PAIRS_SET
@@ -251,6 +267,7 @@  struct virtio_net_ctrl_mq {
         DEFINE_PROP_BIT("guest_tso6", _state, _field, VIRTIO_NET_F_GUEST_TSO6, true), \
         DEFINE_PROP_BIT("guest_ecn", _state, _field, VIRTIO_NET_F_GUEST_ECN, true), \
         DEFINE_PROP_BIT("guest_ufo", _state, _field, VIRTIO_NET_F_GUEST_UFO, true), \
+        DEFINE_PROP_BIT("guest_announce", _state, _field, VIRTIO_NET_F_GUEST_ANNOUNCE, true), \
         DEFINE_PROP_BIT("host_tso4", _state, _field, VIRTIO_NET_F_HOST_TSO4, true), \
         DEFINE_PROP_BIT("host_tso6", _state, _field, VIRTIO_NET_F_HOST_TSO6, true), \
         DEFINE_PROP_BIT("host_ecn", _state, _field, VIRTIO_NET_F_HOST_ECN, true), \
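
The control-command comment added to virtio-net.h describes what the
driver is expected to do on a config-change interrupt. A rough sketch
of that guest-side sequence follows, assuming a driver that reads the
status field, sends its announcements, and then issues the ack on the
control queue. All names here (guest_ops, guest_config_changed, the
fake_* helpers and SKETCH_* constants) are hypothetical and exist only
for illustration; this is not the Linux driver's code.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_S_LINK_UP           1  /* mirrors VIRTIO_NET_S_LINK_UP */
#define SKETCH_S_ANNOUNCE          2  /* mirrors VIRTIO_NET_S_ANNOUNCE */
#define SKETCH_CTRL_ANNOUNCE       3  /* mirrors VIRTIO_NET_CTRL_ANNOUNCE */
#define SKETCH_CTRL_ANNOUNCE_ACK   0  /* mirrors VIRTIO_NET_CTRL_ANNOUNCE_ACK */

/* Hypothetical hooks a guest driver would supply. */
struct guest_ops {
    uint16_t (*read_status)(void *ctx);      /* read config status field */
    void (*send_announcements)(void *ctx);   /* e.g. GARPs for each address */
    int (*send_ctrl)(void *ctx, uint8_t cls, uint8_t cmd); /* 0 on success */
    void *ctx;
};

/* Config-change handler: returns 1 if an announcement was sent and
 * acked, 0 if none was requested, -1 if the ack command failed. */
static int guest_config_changed(const struct guest_ops *ops)
{
    uint16_t status;

    assert(ops && ops->read_status && ops->send_announcements &&
           ops->send_ctrl);

    status = ops->read_status(ops->ctx);
    if (!(status & SKETCH_S_ANNOUNCE)) {
        return 0;
    }
    ops->send_announcements(ops->ctx);
    if (ops->send_ctrl(ops->ctx, SKETCH_CTRL_ANNOUNCE,
                       SKETCH_CTRL_ANNOUNCE_ACK) != 0) {
        return -1;
    }
    return 1;
}

/* Fake device used only to exercise the handler above. */
struct fake_dev {
    uint16_t status;
    int announced;
    int acked;
};

static uint16_t fake_read_status(void *ctx)
{
    return ((struct fake_dev *)ctx)->status;
}

static void fake_announce(void *ctx)
{
    ((struct fake_dev *)ctx)->announced++;
}

/* Accepts only the announce ack, and only while the bit is set. */
static int fake_ctrl(void *ctx, uint8_t cls, uint8_t cmd)
{
    struct fake_dev *d = ctx;

    if (cls != SKETCH_CTRL_ANNOUNCE || cmd != SKETCH_CTRL_ANNOUNCE_ACK ||
        !(d->status & SKETCH_S_ANNOUNCE)) {
        return -1;
    }
    d->status &= ~SKETCH_S_ANNOUNCE;
    d->acked++;
    return 0;
}
```

The ordering in the sketch (announce first, then ack) follows the
commit message, which says the guest acks once it has done the
announcement.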