[0/3,v4] macvtap driver

Submitter Ed Swierk
Date 2010-02-08 23:30:15
Message ID <1265671815.6480.8.camel@localhost.localdomain>
Permalink /patch/44872/
State Superseded
Delegated to: David Miller

Comments

Ed Swierk - 2010-02-08 23:30:15
On Mon, 2010-02-08 at 10:55 -0800, Sridhar Samudrala wrote:
> I am also seeing this issue with net-next-2.6.
> Basically macvtap_put_user() and macvtap_get_user() call copy_to/from_user
> from within a RCU read-side critical section.
> 
> The following patch fixes this issue by releasing the RCU read lock before
> calling these routines, but instead hold a reference to q->sk.

Thanks, I tried your patch and it fixes the problem.
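
A minimal user-space sketch of the idea behind that fix, as described in the quoted text: pin the socket while inside the read-side section, leave the section, and only then do the copy that may sleep. Every name here (model_sock, model_rcu_read_lock, model_copy_to_user, and so on) is a hypothetical stand-in invented for illustration; this is not the kernel API and not the actual patch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; none of these are real kernel types. */
struct model_sock  { int refcnt; };
struct model_queue { struct model_sock sk; };

static int rcu_depth;	/* > 0 means "inside a read-side critical section" */

static void model_rcu_read_lock(void)   { rcu_depth++; }
static void model_rcu_read_unlock(void) { assert(rcu_depth > 0); rcu_depth--; }

/* Models copy_to_user(): it may sleep, so it must run outside the section. */
static size_t model_copy_to_user(char *dst, const char *src, size_t n)
{
	assert(rcu_depth == 0);
	memcpy(dst, src, n);
	return n;
}

static size_t model_put_user(struct model_queue *q, char *ubuf,
			     const char *pkt, size_t len)
{
	size_t copied;

	model_rcu_read_lock();
	q->sk.refcnt++;			/* pin the socket while still protected */
	model_rcu_read_unlock();

	copied = model_copy_to_user(ubuf, pkt, len);	/* may sleep */

	q->sk.refcnt--;			/* drop the reference taken above */
	return copied;
}
```

The reference only keeps the socket alive; the sketch says nothing about the lifetime of any other object reached through the queue.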

However, it seems to cause another minor problem.  macvlan_count_rx() is
now getting called from macvtap_put_user() with preemption enabled,
which causes smp_processor_id() to BUG:

Feb  8 20:31:38 ti102 kernel: BUG: using smp_processor_id() in preemptible [00000000] code: qemu-kvm/4546
Feb  8 20:31:38 ti102 kernel: caller is macvtap_aio_read+0x18c/0x221 [macvtap]
Feb  8 20:31:38 ti102 kernel: Pid: 4546, comm: qemu-kvm Not tainted 2.6.29.6.Ar-224686.2009eswierk8.2 #1
Feb  8 20:31:38 ti102 kernel: Call Trace:
Feb  8 20:31:38 ti102 kernel: [<c0349546>] ? printk+0xf/0x11
Feb  8 20:31:38 ti102 kernel: [<c02142c0>] debug_smp_processor_id+0xa4/0xb8
Feb  8 20:31:38 ti102 kernel: [<f8af581f>] macvtap_aio_read+0x18c/0x221 [macvtap]
Feb  8 20:31:38 ti102 kernel: [<c011eaf7>] ? default_wake_function+0x0/0xd
Feb  8 20:31:38 ti102 kernel: [<c016c75f>] do_sync_read+0xab/0xe9
Feb  8 20:31:38 ti102 kernel: [<c011933d>] ? update_curr+0x6c/0x147
Feb  8 20:31:38 ti102 kernel: [<c0133933>] ? autoremove_wake_function+0x0/0x33
Feb  8 20:31:38 ti102 kernel: [<c0349fd0>] ? schedule+0x7af/0x7e3
Feb  8 20:31:38 ti102 kernel: [<c016d101>] vfs_read+0xb5/0x129
Feb  8 20:31:38 ti102 kernel: [<c016d20e>] sys_read+0x3b/0x60
Feb  8 20:31:38 ti102 kernel: [<c0102e71>] sysenter_do_call+0x12/0x25

I fixed this problem with the change below.  I'm not sure if replacing
smp_processor_id() with get_cpu() is the right thing to do but it works
for macvtap at least.
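
For illustration, here is a small user-space model of why the get_cpu()/put_cpu() pairing helps: the CPU number is only stable while preemption is disabled, so the per-CPU counters are updated between the two calls. The names model_get_cpu, model_put_cpu, preempt_depth and current_cpu are hypothetical stand-ins and not kernel API, and the multicast counter from the real function is left out.

```c
#include <assert.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for kernel state; not the real kernel API. */
static int preempt_depth;	/* > 0 means "preemption disabled" */
static int current_cpu;		/* CPU the task is "running" on */

struct model_rx_stats {
	unsigned long rx_packets;
	unsigned long rx_bytes;
	unsigned long rx_errors;
};

static struct model_rx_stats per_cpu_stats[NR_CPUS];

/* Models get_cpu(): disable preemption, then read the CPU number. */
static int model_get_cpu(void)
{
	preempt_depth++;
	return current_cpu;
}

/* Models put_cpu(): re-enable preemption. */
static void model_put_cpu(void)
{
	assert(preempt_depth > 0);
	preempt_depth--;
}

static void model_count_rx(unsigned int len, int success)
{
	int cpu = model_get_cpu();
	struct model_rx_stats *s = &per_cpu_stats[cpu];

	/* The CPU number stays valid only while preemption is off. */
	assert(preempt_depth > 0);
	if (success) {
		s->rx_packets++;
		s->rx_bytes += len;
	} else {
		s->rx_errors++;
	}
	model_put_cpu();
}
```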

Signed-off-by: Ed Swierk <eswierk@aristanetworks.com>

---


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Arnd Bergmann - 2010-02-10 14:50:14
On Tuesday 09 February 2010, Ed Swierk wrote:
> I fixed this problem with the change below.  I'm not sure if replacing
> smp_processor_id() with get_cpu() is the right thing to do but it works
> for macvtap at least.

I think we also need to ensure the device doesn't go away, which
was one of the reasons for the rcu_read_lock_bh() earlier.

	Arnd
Ed Swierk - 2010-02-11 00:42:04
On Wed, Feb 10, 2010 at 6:50 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> I think we also need to ensure the device doesn't go away, which
> was one of the reasons for the rcu_read_lock_bh() earlier.

This may be veering far off into the weeds, but I'm wondering if you
considered making macvtap devices behave more like tap devices.
Specifically, the application would open /dev/net/macvtap and send it
an ioctl with the name of the macvtap interface, the name of the lower
interface to attach to, the MAC address, etc; this would cause the
macvtap interface to spring into existence. The macvtap interface
would go away when the application exits or closes the file.
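
For comparison, the existing tap flow that this proposal mirrors looks roughly like the sketch below: open /dev/net/tun, then name the interface with a TUNSETIFF ioctl. The helper names tap_fill_ifreq and tap_open are made up for illustration, and no /dev/net/macvtap device or macvtap ioctl is assumed to exist.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/if.h>
#include <linux/if_tun.h>

/* Hypothetical helper: prepare the request for a tap device called name. */
static void tap_fill_ifreq(struct ifreq *ifr, const char *name)
{
	assert(name != NULL);
	memset(ifr, 0, sizeof(*ifr));
	ifr->ifr_flags = IFF_TAP | IFF_NO_PI;	/* Ethernet frames, no extra header */
	strncpy(ifr->ifr_name, name, IFNAMSIZ - 1);
}

/*
 * Hypothetical helper: create (or attach to) the tap interface.
 * Returns an open file descriptor, or -1 on failure. Needs privileges.
 */
static int tap_open(const char *name)
{
	struct ifreq ifr;
	int fd = open("/dev/net/tun", O_RDWR);

	if (fd < 0)
		return -1;

	tap_fill_ifreq(&ifr, name);
	if (ioctl(fd, TUNSETIFF, &ifr) < 0) {
		close(fd);
		return -1;
	}
	return fd;	/* a non-persistent interface goes away when fd is closed */
}
```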

The tricky part here would be noticing when the lower interface goes
away, and (ideally) reattaching when an interface with the same name
reappears.

I think the advantage of this approach is that it better fits the way
applications like qemu and libvirt use tap interfaces. Unlike the
current approach, however, this wouldn't allow creating a macvtap
interface and keeping it around independently of the application using
it. Is it desirable to support this use case?

--Ed
Arnd Bergmann - 2010-02-11 07:12:54
On Thursday 11 February 2010 01:42:04 Ed Swierk wrote:
> On Wed, Feb 10, 2010 at 6:50 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > I think we also need to ensure the device doesn't go away, which
> > was one of the reasons for the rcu_read_lock_bh() earlier.
> 
> This may be veering far off into the weeds, but I'm wondering if you
> considered making macvtap devices behave more like tap devices.
> Specifically, the application would open /dev/net/macvtap and send it
> an ioctl with the name of the macvtap interface, the name of the lower
> interface to attach to, the MAC address, etc; this would cause the
> macvtap interface to spring into existence. The macvtap interface
> would go away when the application exits or closes the file.

No, I never considered this. In fact, this behavior of tun/tap
is what makes that driver have really complex lifetime rules (more
so than macvtap) and causes all sorts of problems if you want to
manage unprivileged users accessing different outgoing interfaces.

> The tricky part here would be noticing when the lower interface goes
> away, and (ideally) reattaching when an interface with the same name
> reappears.

The first part is not so hard, the second part I'd rather not do.
 
> I think the advantage of this approach is that it better fits the way
> applications like qemu and libvirt use tap interfaces. Unlike the
> current approach, however, this wouldn't allow creating a macvtap
> interface and keeping it around independently of the application using
> it. Is it desirable to support this use case?

I think it's very useful that you can set up static interfaces and give
them to a user (or group) who can then use these interfaces
without getting any network privileges beyond that.

Another reason for having one chardev per interface is to support
multiple open files for the same interface. I want to use that as
an easy way to support multi-queue NICs.

	Arnd

Patch

Index: linux-2.6.29.6/include/linux/if_macvlan.h
===================================================================
--- linux-2.6.29.6.orig/include/linux/if_macvlan.h
+++ linux-2.6.29.6/include/linux/if_macvlan.h
@@ -42,8 +42,9 @@  static inline void macvlan_count_rx(cons
 				    bool multicast)
 {
 	struct macvlan_rx_stats *rx_stats;
+	int cpu = get_cpu();
 
-	rx_stats = per_cpu_ptr(vlan->rx_stats, smp_processor_id());
+	rx_stats = per_cpu_ptr(vlan->rx_stats, cpu);
 	if (likely(success)) {
 		rx_stats->rx_packets++;;
 		rx_stats->rx_bytes += len;
@@ -52,6 +53,7 @@  static inline void macvlan_count_rx(cons
 	} else {
 		rx_stats->rx_errors++;
 	}
+	put_cpu();
 }
 
 extern int macvlan_common_newlink(struct net_device *dev,