Message ID | 1357316231.1678.1528.camel@edumazet-glaptop |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, 2013-01-04 at 08:17 -0800, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > On Fri, 2013-01-04 at 07:45 -0800, Eric Dumazet wrote: > > On Fri, 2013-01-04 at 10:59 +0000, Tom Parkin wrote: > > > Hi list, > > > > > > I recently tripped over a NULL pointer dereference in the veth driver. > > > I'm running a 3.8.0_rc1 (updated from net-next git tree this morning) > > > on an Athlon 64 X2 machine running a 32 bit kernel. To trigger the > > > oops I simply created a veth interface as follows: > > > > > > ip link add name ve0 type veth peer name ve1 > > > > > > I did a little digging in the git history and I note that veth > > > statistics changed a little with commit 2681128f0ced8aa4. I tried > > > reverting that commit in my tree, which made the oops go away again. > > > > > > Thanks, > > > Tom > > > > Thanks Tom, I'll fix this. > > > > Oh well, a last minute change again... > > I was fooled by veth_get_ethtool_stats() doing the priv->peer->ifindex > deref without checking. [...] > --- a/drivers/net/veth.c > +++ b/drivers/net/veth.c > @@ -162,15 +162,18 @@ static struct rtnl_link_stats64 *veth_get_stats64(struct net_device *dev, > struct rtnl_link_stats64 *tot) > { > struct veth_priv *priv = netdev_priv(dev); > + struct net_device *peer = priv->peer; > struct pcpu_vstats one; > > tot->tx_dropped = veth_stats_one(&one, dev); > tot->tx_bytes = one.bytes; > tot->tx_packets = one.packets; > > - tot->rx_dropped = veth_stats_one(&one, priv->peer); > - tot->rx_bytes = one.bytes; > - tot->rx_packets = one.packets; > + if (peer) { This possibly needs some memory barriers to properly synchronise with veth_newlink(). But can you not move initialisation of the peer pointers before registration of the devices in veth_newlink(), so that veth_get_stats64() cannot be called before they are initialised? Ben. > + tot->rx_dropped = veth_stats_one(&one, peer); > + tot->rx_bytes = one.bytes; > + tot->rx_packets = one.packets; > + } > > return tot; > }
On Fri, 2013-01-04 at 18:17 +0000, Ben Hutchings wrote: > This possibly needs some memory barriers to properly synchronise with > veth_newlink(). But can you not move initialisation of the peer > pointers before registration of the devices in veth_newlink(), so that > veth_get_stats64() cannot be called before they are initialised? The ->peer pointer cannot change once set. ( its never cleared ) So the problem would not be in veth_newlink(), but might be in veth_dellink() It seems we would have a problem in veth_get_ethtool_stats() already... More generally, what prevents a get_stats() being called while a dellink() (-> veth_dev_free() -> free_percpu()) is done ? (Same thing is done for tunnel/dummy stats percpu data) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2013-01-04 at 11:23 -0800, Eric Dumazet wrote: > On Fri, 2013-01-04 at 18:17 +0000, Ben Hutchings wrote: > > > This possibly needs some memory barriers to properly synchronise with > > veth_newlink(). But can you not move initialisation of the peer > > pointers before registration of the devices in veth_newlink(), so that > > veth_get_stats64() cannot be called before they are initialised? > > The ->peer pointer cannot change once set. ( its never cleared ) We may still need an explicit barrier for data-dependency. > So the problem would not be in veth_newlink(), but might be in > veth_dellink() A lot of things are done in between the unregister_netdevice_queue() and the actual deletion which are probably sufficient to flush out any calls to dev_get_stats(). But to make sure, I think we would need some small amount of shared state that isn't freed until both devices are. > It seems we would have a problem in veth_get_ethtool_stats() already... That should be OK because both ethtool operations and the whole process of interface deletion are serialised by the RTNL lock. > More generally, what prevents a get_stats() being called while a > dellink() (-> veth_dev_free() -> free_percpu()) is done ? Anything calling dev_get_stats() must have a counted or RCU reference to the device, and netdev_run_todo() waits for those to go away. For mutually referencing devices we want a kind of weak reference and we have no good way to implement those. Ben. > (Same thing is done for tunnel/dummy stats percpu data)
diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 8b2e112..bd57213 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -162,15 +162,18 @@ static struct rtnl_link_stats64 *veth_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *tot) { struct veth_priv *priv = netdev_priv(dev); + struct net_device *peer = priv->peer; struct pcpu_vstats one; tot->tx_dropped = veth_stats_one(&one, dev); tot->tx_bytes = one.bytes; tot->tx_packets = one.packets; - tot->rx_dropped = veth_stats_one(&one, priv->peer); - tot->rx_bytes = one.bytes; - tot->rx_packets = one.packets; + if (peer) { + tot->rx_dropped = veth_stats_one(&one, peer); + tot->rx_bytes = one.bytes; + tot->rx_packets = one.packets; + } return tot; }