
[net-next,v2,9/9] xen-netback: Aggregate TX unmap operations

Message ID: 1386892097-15502-10-git-send-email-zoltan.kiss@citrix.com
State: Changes Requested, archived
Delegated to: David Miller

Commit Message

Zoltan Kiss Dec. 12, 2013, 11:48 p.m. UTC
Unmapping causes TLB flushing, therefore we should do it in the largest
possible batches. However we shouldn't starve the guest for too long. So if
the guest has space for at least two big packets and we don't have at least a
quarter ring to unmap, delay it for at most 1 millisecond.
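
In condensed form, the heuristic the patch adds to tx_dealloc_work_todo()
(the complete version, including the timer setup, is in the diff below) is:

/* Condensed sketch of the delay heuristic from tx_dealloc_work_todo(). */
if (nr_free_slots(&vif->tx) > 2 * XEN_NETBK_LEGACY_SLOTS_MAX &&
    vif->dealloc_prod - vif->dealloc_cons < MAX_PENDING_REQS / 4 &&
    !vif->dealloc_delay_timed_out) {
	/* Guest still has room and the batch is small: postpone. */
	if (!timer_pending(&vif->dealloc_delay))
		mod_timer(&vif->dealloc_delay, jiffies + msecs_to_jiffies(1));
	return 0;
}
/* Otherwise unmap now: the batch is big enough, the guest is getting
 * short on slots, or the 1 ms delay has expired. */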

Signed-off-by: Zoltan Kiss <zoltan.kiss@citrix.com>
---
 drivers/net/xen-netback/common.h  |    2 ++
 drivers/net/xen-netback/netback.c |   30 +++++++++++++++++++++++++++++-
 2 files changed, 31 insertions(+), 1 deletion(-)


Comments

Wei Liu Dec. 13, 2013, 3:44 p.m. UTC | #1
On Thu, Dec 12, 2013 at 11:48:17PM +0000, Zoltan Kiss wrote:
> Unmapping causes TLB flushing, therefore we should do it in the largest
> possible batches. However we shouldn't starve the guest for too long. So if
> the guest has space for at least two big packets and we don't have at least a
> quarter ring to unmap, delay it for at most 1 millisecond.
> 

Is this solution temporary or permanent? If it is permanent, would it
make sense to make these parameters tunable?

Wei.
Zoltan Kiss Dec. 16, 2013, 4:30 p.m. UTC | #2
On 13/12/13 15:44, Wei Liu wrote:
> On Thu, Dec 12, 2013 at 11:48:17PM +0000, Zoltan Kiss wrote:
>> Unmapping causes TLB flushing, therefore we should do it in the largest
>> possible batches. However we shouldn't starve the guest for too long. So if
>> the guest has space for at least two big packets and we don't have at least a
>> quarter ring to unmap, delay it for at most 1 millisecond.
>>
>
> Is this solution temporary or permanent? If it is permanent, would it
> make sense to make these parameters tunable?

Well, I'm not entirely sure yet that this is the best way to do it, so in 
that sense it's temporary. But generally we should do some sort of 
batching, as we cannot afford a TLB flush on every unmap. If we settle on 
something, we should indeed make these parameters tunable.
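
If it comes to that, the thresholds could be exposed as module parameters, 
e.g. along these lines (a sketch only; the parameter names are made up):

/* Hypothetical sketch: the two knobs as xen-netback module parameters.
 * Names, defaults and permissions are illustrative only. */
static unsigned int dealloc_delay_msecs = 1;
module_param(dealloc_delay_msecs, uint, 0644);
MODULE_PARM_DESC(dealloc_delay_msecs,
		 "Max delay (ms) before a TX unmap batch is flushed");

static unsigned int dealloc_batch_slots = MAX_PENDING_REQS / 4;
module_param(dealloc_batch_slots, uint, 0644);
MODULE_PARM_DESC(dealloc_batch_slots,
		 "Flush as soon as this many slots await dealloc");
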
The problem is that there is a thin red line we have to find here. My first 
approach was to leave tx_dealloc_work_todo as it was, and after the thread 
woke up, but before anything was done, make it sleep for 50 ns and measure 
how fast the guest was running out of free slots:

if (kthread_should_stop())
	break;

+/* Experiment: sleep in ~50 ns steps (at most 11 rounds) while the
+ * guest is consuming its free slots slowly enough. */
+i = 0;
+do {
+	++i;
+	prev_free_slots = nr_free_slots(&vif->tx);
+	__set_current_state(TASK_UNINTERRUPTIBLE);
+	rc = schedule_hrtimeout_range(&tx_dealloc_delay_ktime, 10,
+				      HRTIMER_MODE_REL);
+	if (rc)
+		trace_printk("%s sleep was interrupted! %d\n",
+			     vif->dev->name, rc);
+	curr_free_slots = nr_free_slots(&vif->tx);
+} while (curr_free_slots < 4 * (prev_free_slots - curr_free_slots) &&
+	 i < 11);
+
xenvif_tx_dealloc_action(vif);

And in the worst case, after roughly 500 ns, I let the thread do the unmap 
anyway. But I was a bit worried about this approach, so I chose a more 
conservative one for this patch.

There are also ideas to use some other mechanism for unmapping instead of 
the current separate-thread approach. Putting it into the NAPI instance was 
the original idea, but that caused problems. Placing it into the other 
thread, where the RX work happens, doesn't sound too good either, as these 
things can and should happen in parallel.
Other ideas were workqueues and tasklets; I'll spend some more time 
checking whether they are feasible (a rough sketch of the workqueue 
variant follows).
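
For example, a delayed work item could replace both the kthread and the 
timer. A rough, untested sketch (it assumes a new dealloc_work field in 
struct xenvif and reuses xenvif_tx_dealloc_action() from this series):

static void xenvif_dealloc_work_fn(struct work_struct *work)
{
	struct xenvif *vif = container_of(work, struct xenvif,
					  dealloc_work.work);

	xenvif_tx_dealloc_action(vif);
}

/* at init: INIT_DELAYED_WORK(&vif->dealloc_work, xenvif_dealloc_work_fn); */

/* where dealloc requests are produced: flush once a quarter of the ring
 * is pending, otherwise allow batching for up to 1 ms */
if (vif->dealloc_prod - vif->dealloc_cons >= MAX_PENDING_REQS / 4)
	mod_delayed_work(system_wq, &vif->dealloc_work, 0);
else
	schedule_delayed_work(&vif->dealloc_work, msecs_to_jiffies(1));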

Regards,

Zoli

Patch

diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 05fa6be..a834818 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -111,6 +111,8 @@ struct xenvif {
 	u16 dealloc_ring[MAX_PENDING_REQS];
 	struct task_struct *dealloc_task;
 	wait_queue_head_t dealloc_wq;
+	struct timer_list dealloc_delay;
+	bool dealloc_delay_timed_out;
 
 	/* Use kthread for guest RX */
 	struct task_struct *task;
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 5252416..f4a9876 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -136,6 +136,11 @@ static inline pending_ring_idx_t nr_pending_reqs(struct xenvif *vif)
 		vif->pending_prod + vif->pending_cons;
 }
 
+static inline pending_ring_idx_t nr_free_slots(struct xen_netif_tx_back_ring *ring)
+{
+	return ring->nr_ents - (ring->sring->req_prod - ring->rsp_prod_pvt);
+}
+
 bool xenvif_rx_ring_slots_available(struct xenvif *vif, int needed)
 {
 	RING_IDX prod, cons;
@@ -1898,10 +1903,33 @@ static inline int tx_work_todo(struct xenvif *vif)
 	return 0;
 }
 
+static void xenvif_dealloc_delay(unsigned long data)
+{
+	struct xenvif *vif = (struct xenvif *)data;
+
+	vif->dealloc_delay_timed_out = true;
+	wake_up(&vif->dealloc_wq);
+}
+
 static inline int tx_dealloc_work_todo(struct xenvif *vif)
 {
-	if (vif->dealloc_cons != vif->dealloc_prod)
+	if (vif->dealloc_cons != vif->dealloc_prod) {
+		if ((nr_free_slots(&vif->tx) > 2 * XEN_NETBK_LEGACY_SLOTS_MAX) &&
+			(vif->dealloc_prod - vif->dealloc_cons < MAX_PENDING_REQS / 4) &&
+			!vif->dealloc_delay_timed_out) {
+			if (!timer_pending(&vif->dealloc_delay)) {
+				vif->dealloc_delay.function = xenvif_dealloc_delay;
+				vif->dealloc_delay.data = (unsigned long)vif;
+				mod_timer(&vif->dealloc_delay,
+					jiffies + msecs_to_jiffies(1));
+
+			}
+			return 0;
+		}
+		del_timer_sync(&vif->dealloc_delay);
+		vif->dealloc_delay_timed_out = false;
 		return 1;
+	}
 
 	return 0;
 }