[net-2.6] net: zero kobject in rx_queue_release

Message ID 20101111201341.4418.16400.stgit@jf-dev1-dcblab
State Changes Requested, archived
Delegated to: David Miller

Commit Message

John Fastabend Nov. 11, 2010, 8:13 p.m. UTC
netif_set_real_num_rx_queues() can decrement and increment
the number of rx queues. For example ixgbe does this as
features and offloads are toggled. Presumably this could
also happen across down/up on most devices if the available
resources changed (cpu offlined).

The kobject needs to be zero'd in this case so that the
state is not preserved across kobject_put()/kobject_init_and_add().

This resolves the following error report.

ixgbe 0000:03:00.0: eth2: NIC Link is Up 10 Gbps, Flow Control: RX/TX
kobject (ffff880324b83210): tried to init an initialized object, something is seriously wrong.
Pid: 1972, comm: lldpad Not tainted 2.6.37-rc18021qaz+ #169
Call Trace:
 [<ffffffff8121c940>] kobject_init+0x3a/0x83
 [<ffffffff8121cf77>] kobject_init_and_add+0x23/0x57
 [<ffffffff8107b800>] ? mark_lock+0x21/0x267
 [<ffffffff813c6d11>] net_rx_queue_update_kobjects+0x63/0xc6
 [<ffffffff813b5e0e>] netif_set_real_num_rx_queues+0x5f/0x78
 [<ffffffffa0261d49>] ixgbe_set_num_queues+0x1c6/0x1ca [ixgbe]
 [<ffffffffa0262509>] ixgbe_init_interrupt_scheme+0x1e/0x79c [ixgbe]
 [<ffffffffa0274596>] ixgbe_dcbnl_set_state+0x167/0x189 [ixgbe]

Signed-off-by: John Fastabend <john.r.fastabend@intel.com>
---

 net/core/net-sysfs.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
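
The failure described in the commit message can be modeled outside the kernel. What follows is a minimal user-space sketch, not kernel code: struct fake_kobject, fake_kobject_init(), fake_kobject_put() and reinit_after_put() are hypothetical stand-ins, and returning false stands in for the point where kobject_init() prints "tried to init an initialized object" and dumps the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for struct kobject: only the state that matters here. */
struct fake_kobject {
	int refcount;
	bool state_initialized;
};

/* Models the sanity check in kobject_init(). Returns false where the
 * kernel would report an already initialized object.
 */
bool fake_kobject_init(struct fake_kobject *kobj)
{
	if (kobj->state_initialized)
		return false;
	kobj->refcount = 1;
	kobj->state_initialized = true;
	return true;
}

/* Drops one reference. On the final put, the "release" step wipes the
 * object only if zero_on_release is set, which is what the patch proposes.
 */
void fake_kobject_put(struct fake_kobject *kobj, bool zero_on_release)
{
	assert(kobj->refcount > 0);
	if (--kobj->refcount == 0 && zero_on_release)
		memset(kobj, 0, sizeof(*kobj));
}

/* Shrink, then grow the queue count again: init, put, init.
 * Returns whether the second init was accepted.
 */
bool reinit_after_put(bool zero_on_release)
{
	struct fake_kobject kobj = {0};

	if (!fake_kobject_init(&kobj))
		return false;
	fake_kobject_put(&kobj, zero_on_release);
	return fake_kobject_init(&kobj);
}
```

In this model reinit_after_put(false) is rejected because the initialized flag survives the final put, while reinit_after_put(true) succeeds.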

Comments

David Miller Nov. 12, 2010, 9:08 p.m. UTC | #1
From: John Fastabend <john.r.fastabend@intel.com>
Date: Thu, 11 Nov 2010 12:13:41 -0800

> netif_set_real_num_rx_queues() can decrement and increment
> the number of rx queues. For example ixgbe does this as
> features and offloads are toggled. Presumably this could
> also happen across down/up on most devices if the available
> resources changed (cpu offlined).
> 
> The kobject needs to be zero'd in this case so that the
> state is not preserved across kobject_put()/kobject_init_and_add().
> 
> This resolves the following error report.
 ...
> Signed-off-by: John Fastabend <john.r.fastabend@intel.com>

I think it's probably better to clear the entire netdev_rx_queue
object rather than just the embedded kobject.

Otherwise we leave dangling rps_map, rps_flow_table, etc. pointers.

In fact, it's more tricky than this, because notice that your
patch will memset() free'd memory in the case where the
first->count drops to zero and we execute the kfree().

So we'll need something like:

	if (atomic_dec_and_test(&first->count))
		kfree(first);
	else
		/* clear everything except queue->first */

or, alternatively:

--------------------
	map = rcu_dereference_raw(queue->rps_map);
	if (map) {
		call_rcu(&map->rcu, rps_map_release);
		rcu_assign_pointer(queue->rps_map, NULL);
	}

	flow_table = rcu_dereference_raw(queue->rps_flow_table);
	if (flow_table) {
		call_rcu(&flow_table->rcu, rps_dev_flow_table_release);
		rcu_assign_pointer(queue->rps_flow_table, NULL);
	}
	if (atomic_dec_and_test(&first->count))
		kfree(first);
	else
		memset(kobj, 0, sizeof(*kobj));
--------------------

Something like that.
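
The ordering concern above (do not write to the queue array after the last reference has freed it) can be sketched in user space. This is an illustration under stated assumptions, not the kernel implementation: struct shared_rx_queue, struct fake_kobj, shared_alloc_queues() and shared_rx_queue_release() are hypothetical names, C11 atomics stand in for atomic_dec_and_test(), and free() stands in for kfree().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the embedded kobject. */
struct fake_kobj {
	int state_initialized;
};

/* Hypothetical model of a queue array that shares one refcount,
 * stored in element 0 and reached through ->first.
 */
struct shared_rx_queue {
	struct fake_kobj kobj;
	struct shared_rx_queue *first;
	atomic_int count;	/* only meaningful in the first element */
};

struct shared_rx_queue *shared_alloc_queues(int n)
{
	struct shared_rx_queue *q;

	assert(n > 0);
	q = calloc((size_t)n, sizeof(*q));
	if (!q)
		return NULL;
	atomic_init(&q[0].count, n);
	for (int i = 0; i < n; i++) {
		q[i].first = &q[0];
		q[i].kobj.state_initialized = 1;
	}
	return q;
}

/* Returns 1 if this call freed the whole array, 0 if it only cleared
 * the slot. The memset is confined to the else path so it can never
 * touch freed memory.
 */
int shared_rx_queue_release(struct shared_rx_queue *queue)
{
	struct shared_rx_queue *first = queue->first;

	/* atomic_fetch_sub() returns the old value; 1 means last reference. */
	if (atomic_fetch_sub(&first->count, 1) == 1) {
		free(first);
		return 1;
	}
	memset(&queue->kobj, 0, sizeof(queue->kobj));
	return 0;
}
```

With three queues, releasing the last two slots clears their kobjects one at a time, and releasing the final slot frees the array without any further write.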
Tom Herbert Nov. 14, 2010, 10:40 p.m. UTC | #2
> So we'll need something like:
>
>        if (atomic_dec_and_test(&first->count))
>                kfree(first);
>        else
>                /* clear everything except queue->first */
>

The patches to get rid of the separate refcnt should obviate this
complexity.  Could just clear the queue in kobject release.
David Miller Nov. 14, 2010, 11:15 p.m. UTC | #3
From: Tom Herbert <therbert@google.com>
Date: Sun, 14 Nov 2010 14:40:00 -0800

>> So we'll need something like:
>>
>>        if (atomic_dec_and_test(&first->count))
>>                kfree(first);
>>        else
>>                /* clear everything except queue->first */
>>
> 
> The patches to get rid of the separate refcnt should obviate this
> complexity.  Could just clear the queue in kobject release.

True but we'll still need a patch for older kernels.
John Fastabend Nov. 16, 2010, 2:06 a.m. UTC | #4
On 11/14/2010 3:15 PM, David Miller wrote:
> From: Tom Herbert <therbert@google.com>
> Date: Sun, 14 Nov 2010 14:40:00 -0800
> 
>>> So we'll need something like:
>>>
>>>        if (atomic_dec_and_test(&first->count))
>>>                kfree(first);
>>>        else
>>>                /* clear everything except queue->first */
>>>
>>
>> The patches to get rid of the separate refcnt should obviate this
>> complexity.  Could just clear the queue in kobject release.
> 
> True but we'll still need a patch for older kernels.

OK Thanks. I'll have a stable patch and a net-2.6 patch soon.

-- John
John Fastabend Nov. 16, 2010, 7:13 a.m. UTC | #5
On 11/15/2010 6:06 PM, John Fastabend wrote:
> On 11/14/2010 3:15 PM, David Miller wrote:
>> From: Tom Herbert <therbert@google.com>
>> Date: Sun, 14 Nov 2010 14:40:00 -0800
>>
>>>> So we'll need something like:
>>>>
>>>>        if (atomic_dec_and_test(&first->count))
>>>>                kfree(first);
>>>>        else
>>>>                /* clear everything except queue->first */
>>>>
>>>
>>> The patches to get rid of the separate refcnt should obviate this
>>> complexity.  Could just clear the queue in kobject release.
>>
>> True but we'll still need a patch for older kernels.
> 
> OK Thanks. I'll have a stable patch and a net-2.6 patch soon.
> 
> -- John

To address Tom's comment, queue->dev would need to be reset if the queue was cleared. In the latest patch I didn't bother and just clear the kobject.
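
For comparison, clearing the whole queue while keeping the back-pointer could look roughly like the user-space sketch below. It is an assumption-laden illustration only: struct dev_rx_queue, struct fake_net_device and dev_rx_queue_clear() are hypothetical, the field layout is simplified, and the RPS pointers are assumed to have been released already before the clear.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct net_device. */
struct fake_net_device {
	int ifindex;
};

/* Hypothetical, simplified model of a queue that points back at its device. */
struct dev_rx_queue {
	void *rps_map;
	void *rps_flow_table;
	int kobj_state_initialized;
	struct fake_net_device *dev;
};

/* Wipes every field, then restores ->dev so the slot can be reused
 * when the number of rx queues grows again.
 */
void dev_rx_queue_clear(struct dev_rx_queue *queue)
{
	struct fake_net_device *dev;

	assert(queue != NULL);
	dev = queue->dev;
	memset(queue, 0, sizeof(*queue));
	queue->dev = dev;
}
```

The save and restore around memset() is the extra step that clearing only the kobject avoids.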
Patch

diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index a5ff5a8..3315033 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -721,6 +721,11 @@  static void rx_queue_release(struct kobject *kobj)
 
 	if (atomic_dec_and_test(&first->count))
 		kfree(first);
+
+	/* cleanup kobject because we may need to reuse it if the
+	 * number of rx queues is increased again in the future
+	 */
+	memset(kobj, 0, sizeof(*kobj));
 }
 
 static struct kobj_type rx_queue_ktype = {