Patchwork net: use a deferred timer in rt_check_expire

Submitter Eric Dumazet
Date May 19, 2009, 6:56 p.m.
Permalink /patch/27409/
State RFC
Delegated to: David Miller


Eric Dumazet - May 19, 2009, 6:56 p.m. wrote:
>> -----Original Message-----
>> From: ext Eric Dumazet [] 
>> Sent: 19 May, 2009 12:04
>> To: Kristo Tero (Nokia-D/Tampere)
>> Cc:
>> Subject: Re: Network stack timer hacks for power saving
>> wrote:
>>> Hi,
>>> I have been looking at network stack timer optimization for power
>>> saving in embedded ARM environment, basically trying to avoid as many
>>> wakeups as possible. I have changed several timers in the network
>>> stack into deferred ones, i.e. they do not wake up the device from low
>>> power modes but instead they are deferred until next wakeup from some
>>> other source, like another (non-deferred) timer or some I/O. Attached
>>> a patch about the changes I've done, is something like this safe to
>>> do?
>>> -Tero

Here is the patch I cooked and tested on a machine where ip_rt_gc_interval
is set to its minimal value (1 second), and where cache equilibrium depends
on garbage collection being done in time.

I found that deferrable timers could be *really* delayed, so I think we must
take into account the elapsed time (in jiffies) between two rt_check_expire()
calls, to "guarantee" a full scan of the rt cache in an ip_rt_gc_timeout period.
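
To make the scaling concrete, here is a minimal user-space sketch of the
arithmetic (not the kernel code itself). The name scan_goal() and its
parameters are hypothetical stand-ins for the jiffies delta, rt_hash_log and
ip_rt_gc_timeout; the clamp to one full pass over the table is an assumption
of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: how many hash buckets to scan in one pass.
 *
 * elapsed    - ticks since the previous pass (stands in for the jiffies delta)
 * hash_log   - log2 of the number of buckets (stands in for rt_hash_log)
 * gc_timeout - ticks in which the whole table should be covered
 *              (stands in for ip_rt_gc_timeout)
 */
unsigned int scan_goal(unsigned long elapsed, unsigned int hash_log,
		       unsigned long gc_timeout)
{
	uint64_t buckets;
	uint64_t mult;

	assert(hash_log < 32);
	buckets = (uint64_t)1 << hash_log;

	/* Share of the table owed for the time that actually elapsed. */
	mult = (uint64_t)elapsed << hash_log;
	if (gc_timeout > 1)
		mult /= gc_timeout;

	/* Assumption of this sketch: never more than one full pass. */
	if (mult > buckets)
		mult = buckets;

	return (unsigned int)mult;
}
```

With 1024 buckets and a timeout of 300000 ticks, a gap of 60000 ticks gives a
goal of 204 buckets, while a gap of twice the timeout is clamped to 1024. A
timer that fires late therefore scans proportionally more instead of falling
behind.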

Not for inclusion, as work is ongoing in this function for a bug fix.
I'll redo the patch later once things have stabilized.

[PATCH] net: use a deferred timer in rt_check_expire

For the sake of power saver lovers, use a deferrable timer to fire
rt_check_expire().

As the cache equilibrium of some big routers depends on garbage collection
being done in time, we take into account the elapsed time between two
rt_check_expire() invocations to adjust the number of slots we have to check.

Based on an initial idea and patch from Tero Kristo

Signed-off-by: Eric Dumazet <>
Signed-off-by: Tero Kristo <>
 net/ipv4/route.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)



diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index c4c60e9..b2c6793 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -131,8 +131,8 @@ static int ip_rt_min_advmss __read_mostly	= 256;
 static int ip_rt_secret_interval __read_mostly	= 10 * 60 * HZ;
 static int rt_chain_length_max __read_mostly	= 20;
 
-static void rt_worker_func(struct work_struct *work);
-static DECLARE_DELAYED_WORK(expires_work, rt_worker_func);
+static struct delayed_work expires_work;
+static unsigned long expires_ljiffies;
 
 /*
  *	Interface to generic destination cache.
@@ -787,9 +787,12 @@ static void rt_check_expire(void)
 	struct rtable *rth, **rthp;
 	unsigned long length = 0, samples = 0;
 	unsigned long sum = 0, sum2 = 0;
+	unsigned long delta;
 	u64 mult;
 
-	mult = ((u64)ip_rt_gc_interval) << rt_hash_log;
+	delta = jiffies - expires_ljiffies;
+	expires_ljiffies = jiffies;
+	mult = ((u64)delta) << rt_hash_log;
 	if (ip_rt_gc_timeout > 1)
 		do_div(mult, ip_rt_gc_timeout);
 	goal = (unsigned int)mult;
@@ -3410,6 +3413,8 @@ int __init ip_rt_init(void)
 	/* All the timers, started at system startup tend
 	   to synchronize. Perturb it a bit.
 	 */
+	INIT_DELAYED_WORK_DEFERRABLE(&expires_work, rt_worker_func);
+	expires_ljiffies = jiffies;
 	schedule_delayed_work(&expires_work,
 		net_random() % ip_rt_gc_interval + ip_rt_gc_interval);
 
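
The delta computation in the rt_check_expire() hunk subtracts two unsigned
long tick counts, so it should stay correct when the counter wraps. Here is a
small user-space sketch of that pattern; struct tick_tracker and
tracker_elapsed() are hypothetical names used only for illustration, with
`last` playing the role of expires_ljiffies.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state: the tick count seen on the previous call. */
struct tick_tracker {
	unsigned long last;
};

/*
 * Return the ticks elapsed since the previous call and remember `now`.
 * Unsigned subtraction is performed modulo 2^N, so the result is still
 * the true distance when `now` has wrapped past zero, as long as fewer
 * than 2^N ticks separate the two readings.
 */
unsigned long tracker_elapsed(struct tick_tracker *t, unsigned long now)
{
	unsigned long delta;

	assert(t != NULL);
	delta = now - t->last;
	t->last = now;
	return delta;
}
```

For example, if the previous reading was 5 ticks before the wrap point and the
current reading is 5, the function returns 10.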