diff mbox

A possible bug in reqsk_queue_hash_req()

Message ID i2x9b948ee41004200335vc229a59cvc0a08c35c949d7dd@mail.gmail.com
State Not Applicable, archived
Delegated to: David Miller
Headers show

Commit Message

Li Yu April 20, 2010, 10:35 a.m. UTC
Hi,

     I found out a possible bug in reqsk_queue_hash_req(), it seem
that we should move "req->dl_next = lopt->syn_table[hash];" statement
into follow write lock protected scope.

     As I browsed source code, this function only can be call at rx
code path which is protected a spin lock over struct sock , but its
caller (  inet_csk_reqsk_queue_hash_add() ) is a GPL exported symbol,
so I think that we'd best move this statement into below write lock
protected scope.

     Below is the patch to play this change, please do not apply it on
source code, it's just for show.

    Thanks.

Yu

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Eric Dumazet April 20, 2010, 11:06 a.m. UTC | #1
Le mardi 20 avril 2010 à 18:35 +0800, Li Yu a écrit :
> Hi,
> 
>      I found out a possible bug in reqsk_queue_hash_req(), it seem
> that we should move "req->dl_next = lopt->syn_table[hash];" statement
> into follow write lock protected scope.
> 
>      As I browsed source code, this function only can be call at rx
> code path which is protected a spin lock over struct sock , but its
> caller (  inet_csk_reqsk_queue_hash_add() ) is a GPL exported symbol,
> so I think that we'd best move this statement into below write lock
> protected scope.
> 
>      Below is the patch to play this change, please do not apply it on
> source code, it's just for show.
> 
>     Thanks.
> 
> Yu
> 
> --- include/net/request_sock.h  2010-04-09 15:27:14.000000000 +0800
> +++ include/net/request_sock.h        2010-04-20 18:11:32.000000000 +0800
> @@ -247,9 +247,9 @@ static inline void reqsk_queue_hash_req(
>         req->expires = jiffies + timeout;
>         req->retrans = 0;
>         req->sk = NULL;
> -       req->dl_next = lopt->syn_table[hash];
> 
>         write_lock(&queue->syn_wait_lock);
> +       req->dl_next = lopt->syn_table[hash];
>         lopt->syn_table[hash] = req;
>         write_unlock(&queue->syn_wait_lock);
>  }

I believe its not really necessary, because we are the only possible
writer at this stage.

The write_lock() ... write_unlock() is there only to enforce a
synchronisation with readers.

All callers of this reqsk_queue_hash_req() must have the socket locked



--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller April 20, 2010, 9:24 p.m. UTC | #2
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Tue, 20 Apr 2010 13:06:51 +0200

> I believe its not really necessary, because we are the only possible
> writer at this stage.
> 
> The write_lock() ... write_unlock() is there only to enforce a
> synchronisation with readers.
> 
> All callers of this reqsk_queue_hash_req() must have the socket locked

Right.

In fact there are quite a few snippets around the networking
where we use this trick of only locking around the single
pointer assignment that puts the object into the list.

And they were all written by Alexey Kuznetsov, so they must
be correct :-)
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Li Yu April 21, 2010, 2:02 a.m. UTC | #3
Eric Dumazet 写道:
> Le mardi 20 avril 2010 à 21:21 +0800, Li Yu a écrit :
> 
>> In my word, write lock also means mutual exclusion among all writers,
>> is it right?
>>
> 
> Yes, generally speaking.
> 
> But not on this use case.
> 
> This is documented in an include file, if you search for syn_wait_lock
> 
>>> All callers of this reqsk_queue_hash_req() must have the socket locked
>> See. If we always assumed the caller should hold the locked socket
>> first, this is not a bug, but I think we'd better add a comment at
>> header file.
> 
> It is documented, as a matter of fact :)
> 
> 

Great, this isn't a bug, you are right here :)

I just found out these comments about syn_wait_lock, it seem that we need to crossed reference documents for kernel API, a newbie like me, may confused at such similar problems.


> 
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

--- include/net/request_sock.h  2010-04-09 15:27:14.000000000 +0800
+++ include/net/request_sock.h        2010-04-20 18:11:32.000000000 +0800
@@ -247,9 +247,9 @@  static inline void reqsk_queue_hash_req(
        req->expires = jiffies + timeout;
        req->retrans = 0;
        req->sk = NULL;
-       req->dl_next = lopt->syn_table[hash];

        write_lock(&queue->syn_wait_lock);
+       req->dl_next = lopt->syn_table[hash];
        lopt->syn_table[hash] = req;
        write_unlock(&queue->syn_wait_lock);
 }