socket.7: Document SO_INCOMING_CPU

Message ID CABNn7+rHfUTjMtm3Biqfx83G6Rr7pY-LuHXY=JqA6N3H_SstZg@mail.gmail.com
State Not Applicable, archived
Delegated to: David Miller

Commit Message

Francois Saint-Jacques Feb. 18, 2017, 4:06 a.m. UTC
This socket option is currently undocumented. The patch applies to the
latest version (man-pages-4.09-511).

Comments

Michael Kerrisk (man-pages) Feb. 19, 2017, 8:55 p.m. UTC | #1
[CC += Eric, so that he might review]

Hello Francois,

On 02/18/2017 05:06 AM, Francois Saint-Jacques wrote:
> This socket option is undocumented. Applies on the latest version
> (man-pages-4.09-511).
> 
> diff --git a/man7/socket.7 b/man7/socket.7
> index 3efd7a5d8..1a3ffa253 100644
> --- a/man7/socket.7
> +++ b/man7/socket.7
> @@ -490,6 +490,26 @@ flag on a socket
>  operation.
>  Expects an integer boolean flag.
>  .TP
> +.BR SO_INCOMING_CPU " (getsockopt since Linux 3.19, setsockopt since
> Linux 4.4)"
> +.\" getsocktop 2c8c56e15df3d4c2af3d656e44feb18789f75837
> +.\" setsocktop 70da268b569d32a9fddeea85dc18043de9d89f89
> +Sets or gets the cpu affinity of a socket. Expects an integer flag.
> +.sp
> +.in +4n
> +.nf
> +int cpu = 1;
> +socklen_t len = sizeof(cpu);
> +setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
> +.fi
> +.in
> +.sp
> +The typical use case is one listener per RX queue, as the associated listener
> +should only accept flows handled in softirq by the same cpu.  This provides
> +optimal NUMA behavior and keep cpu caches hot.
> +.TP
>  .B SO_KEEPALIVE
>  Enable sending of keep-alive messages on connection-oriented sockets.
>  Expects an integer boolean flag.

Thank you! Patch applied.

I have tried to enhance the description somewhat. I'm not sure whether
what I've written is quite correct (or whether it should be further
extended). Eric, could you please take a look at the following, and let 
me know if anything needs fixing:

       SO_INCOMING_CPU  (gettable  since Linux 3.19, settable since Linux
       4.4)
              Sets or gets the CPU affinity  of  a  socket.   Expects  an
              integer flag.

                  int cpu = 1;
                  socklen_t len = sizeof(cpu);
                  setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, len);

              Because  all  of the packets for a single stream (i.e., all
              packets for the same 4-tuple) arrive on the single RX queue
              that  is  associated with a particular CPU, the typical use
              case is to employ one listening process per RX queue,  with
              the  incoming  flow being handled by a listener on the same
              CPU that is handling the RX queue.  This  provides  optimal
              NUMA behavior and keeps CPU caches hot.

Cheers,

Michael
Francois Saint-Jacques Feb. 20, 2017, 3:26 p.m. UTC | #2
>               Because  all  of the packets for a single stream (i.e., all
>               packets for the same 4-tuple) arrive on the single RX queue

This behaviour is configuration-dependent and is not always the
default. Maybe put a reference to `ethtool(8)` for more information
instead of possibly misleading the user?

François
Michael Kerrisk (man-pages) April 19, 2017, 1:20 p.m. UTC | #3
Ping Eric!

Would you have a chance to review the proposed text below, please?

Thanks,

Michael

On 02/19/2017 09:55 PM, Michael Kerrisk (man-pages) wrote:
>        SO_INCOMING_CPU  (gettable  since Linux 3.19, settable since Linux
>        4.4)
>               Sets or gets the CPU affinity  of  a  socket.   Expects  an
>               integer flag.
> 
>                   int cpu = 1;
>                   socklen_t len = sizeof(cpu);
>                   setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
> 
>               Because  all  of the packets for a single stream (i.e., all
>               packets for the same 4-tuple) arrive on the single RX queue
>               that  is  associated with a particular CPU, the typical use
>               case is to employ one listening process per RX queue,  with
>               the  incoming  flow being handled by a listener on the same
>               CPU that is handling the RX queue.  This  provides  optimal
>               NUMA behavior and keeps CPU caches hot.
> 
> Cheers,
> 
> Michael
>
Eric Dumazet April 19, 2017, 5:05 p.m. UTC | #4
On Wed, 2017-04-19 at 15:20 +0200, Michael Kerrisk (man-pages) wrote:
> Ping Eric!
> 
> Would you have a chance to review the proposed text below, please.
> 
> Thanks,
> 

Hi Michael

Sorry for the delay.

Note that setting the option is not supported if SO_REUSEPORT is used.

Socket will be selected from an array, either by a hash or BPF program
that has no access to this information.

Thanks !


Michael Kerrisk (man-pages) April 19, 2017, 6:48 p.m. UTC | #5
Hi Eric,

[reordering for clarity]

>> On 02/19/2017 09:55 PM, Michael Kerrisk (man-pages) wrote:
>>>        SO_INCOMING_CPU  (gettable  since Linux 3.19, settable since Linux
>>>        4.4)
>>>               Sets or gets the CPU affinity  of  a  socket.   Expects  an
>>>               integer flag.
>>>
>>>                   int cpu = 1;
>>>                   socklen_t len = sizeof(cpu);
>>>                   setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, &len);
>>>
>>>               Because  all  of the packets for a single stream (i.e., all
>>>               packets for the same 4-tuple) arrive on the single RX queue
>>>               that  is  associated with a particular CPU, the typical use
>>>               case is to employ one listening process per RX queue,  with
>>>               the  incoming  flow being handled by a listener on the same
>>>               CPU that is handling the RX queue.  This  provides  optimal
>>>               NUMA behavior and keeps CPU caches hot.

> Hi Michael
> 
> Sorry for the delay.

Thanks for the reply, but I think you are assuming I know more than 
I do. I'd like you to elaborate a little please. See below.

> Note that setting the option is not supported if SO_REUSEPORT is used.

Please define "not supported". Does this yield an API diagnostic?
If so, what is it?

> Socket will be selected from an array, either by a hash or BPF program
> that has no access to this information.

Sorry -- I'm lost here. How does this comment relate to the proposed
man page text above?

Thanks,

Michael
Eric Dumazet April 19, 2017, 8:13 p.m. UTC | #6
On Wed, 2017-04-19 at 20:48 +0200, Michael Kerrisk (man-pages) wrote:
> 
> > Hi Michael
> > 
> > Sorry for the delay.
> 
> Thanks for the reply, but I think you are assuming I know more than 
> I do. I'd like you to elaborate a little please. See below.
> 
> > Note that setting the option is not supported if SO_REUSEPORT is used.
> 
> Please define "not supported". Does this yield an API diagnostic?
> If so, what is it?
> 
> > Socket will be selected from an array, either by a hash or BPF program
> > that has no access to this information.
> 
> Sorry -- I'm lost here. How does this comment relate to the proposed
> man page text above?

Simply that:

If an application uses both SO_INCOMING_CPU and SO_REUSEPORT, then the
SO_REUSEPORT logic that selects the socket to receive the packet ignores
the SO_INCOMING_CPU setting.

This does not need to be documented, because it is an implementation
detail/bug that could be changed, if someone cares enough.
Michael Kerrisk (man-pages) April 20, 2017, 2:43 p.m. UTC | #7
On 04/19/2017 10:13 PM, Eric Dumazet wrote:
> Simply that :
> 
> If an application uses both SO_INCOMING_CPU and SO_REUSEPORT, then
> SO_REUSEPORT logic, selecting the socket to receive the packet, ignores
> SO_INCOMING_CPU setting.
> 
> This does not need to be documented, because it is an implementation
> detail/bug that could be changed, if someone cares enough.

Okay, thanks, Eric. I'll just merge the page text as it currently 
is then.

Cheers,

Michael
Patch

diff --git a/man7/socket.7 b/man7/socket.7
index 3efd7a5d8..1a3ffa253 100644
--- a/man7/socket.7
+++ b/man7/socket.7
@@ -490,6 +490,26 @@  flag on a socket
 operation.
 Expects an integer boolean flag.
 .TP
+.BR SO_INCOMING_CPU " (getsockopt since Linux 3.19, setsockopt since Linux 4.4)"
+.\" getsockopt 2c8c56e15df3d4c2af3d656e44feb18789f75837
+.\" setsockopt 70da268b569d32a9fddeea85dc18043de9d89f89
+Sets or gets the CPU affinity of a socket. Expects an integer flag.
+.sp
+.in +4n
+.nf
+int cpu = 1;
+socklen_t len = sizeof(cpu);
+setsockopt(fd, SOL_SOCKET, SO_INCOMING_CPU, &cpu, len);
+.fi
+.in
+.sp
+The typical use case is one listener per RX queue, as the associated listener
+should only accept flows handled in softirq by the same CPU.  This provides
+optimal NUMA behavior and keeps CPU caches hot.
+.TP
 .B SO_KEEPALIVE
 Enable sending of keep-alive messages on connection-oriented sockets.
 Expects an integer boolean flag.