diff mbox series

[PATCHv2,net-next,04/12] sctp: implement make_datafrag for sctp_stream_interleave

Message ID ed729ca6023dea0812e4caa4d5da3259df375127.1512738021.git.lucien.xin@gmail.com
State Accepted, archived
Delegated to: David Miller
Headers show
Series sctp: Implement Stream Interleave: The I-DATA Chunk Supporting User Message Interleaving | expand

Commit Message

Xin Long Dec. 8, 2017, 1:04 p.m. UTC
To avoid hundreds of checks for the different process on I-DATA chunk,
struct sctp_stream_interleave is defined as a group of functions used
to replace the codes in some place where it needs to do different job
according to if the asoc intl_enabled is set.

With these ops, it only needs to initialize asoc->stream.si with
sctp_stream_interleave_0 for normal data if asoc intl_enable is 0,
or sctp_stream_interleave_1 for idata if asoc intl_enable is set in
sctp_stream_init.

After that, the members in asoc->stream.si can be used directly in
some special places without checking asoc intl_enable.

make_datafrag is the first member for sctp_stream_interleave, it's
used to make data or idata frags, called in sctp_datamsg_from_user.
The old function sctp_make_datafrag_empty needs to be adjust some
to fit in this ops.

Note that as idata and data chunks have different length, it also
defines data_chunk_len for sctp_stream_interleave to describe the
chunk size.

Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
---
 include/net/sctp/sm.h                |  5 +--
 include/net/sctp/stream_interleave.h | 44 ++++++++++++++++++++
 include/net/sctp/structs.h           | 12 ++++++
 net/sctp/Makefile                    |  2 +-
 net/sctp/chunk.c                     |  6 +--
 net/sctp/sm_make_chunk.c             | 21 ++++------
 net/sctp/stream.c                    |  1 +
 net/sctp/stream_interleave.c         | 79 ++++++++++++++++++++++++++++++++++++
 8 files changed, 149 insertions(+), 21 deletions(-)
 create mode 100644 include/net/sctp/stream_interleave.h
 create mode 100644 net/sctp/stream_interleave.c

Comments

David Laight Dec. 8, 2017, 2:06 p.m. UTC | #1
From: Xin Long
> Sent: 08 December 2017 13:04
...
> @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>  				frag |= SCTP_DATA_SACK_IMM;
>  		}
> 
> -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> -						 0, GFP_KERNEL);
> +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> +						       GFP_KERNEL);

I know that none of the sctp code is very optimised, but that indirect
call is going to be horrid.

	David
Marcelo Ricardo Leitner Dec. 8, 2017, 2:56 p.m. UTC | #2
On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> From: Xin Long
> > Sent: 08 December 2017 13:04
> ...
> > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> >  				frag |= SCTP_DATA_SACK_IMM;
> >  		}
> > 
> > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > -						 0, GFP_KERNEL);
> > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > +						       GFP_KERNEL);
> 
> I know that none of the sctp code is very optimised, but that indirect
> call is going to be horrid.

Yeah.. but there is no way to avoid the double derreference
considering we only have the asoc pointer in there and we have to
reach the contents of the data chunk operations struct, and the .si
part is the same as 'stream' part as it's a constant offset.

Due to the for() in there, we could add a variable to store
asoc->stream.si outside the for and then we can do only a single deref
inside it. Xin, can you please try and see if the generated code is
different?

Other suggestions?

  Marcelo
David Laight Dec. 8, 2017, 3:01 p.m. UTC | #3
From: Marcelo Ricardo Leitner
> Sent: 08 December 2017 14:57
> 
> On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > From: Xin Long
> > > Sent: 08 December 2017 13:04
> > ...
> > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > >  				frag |= SCTP_DATA_SACK_IMM;
> > >  		}
> > >
> > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > -						 0, GFP_KERNEL);
> > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > +						       GFP_KERNEL);
> >
> > I know that none of the sctp code is very optimised, but that indirect
> > call is going to be horrid.
> 
> Yeah.. but there is no way to avoid the double derreference
> considering we only have the asoc pointer in there and we have to
> reach the contents of the data chunk operations struct, and the .si
> part is the same as 'stream' part as it's a constant offset.
...

It isn't only the double indirect, the indirect call itself isn't 'fun'.

I think there are other hot paths where you've replaced a sizeof()
with a ?: clause.
Caching the result might be much better.

	David
Marcelo Ricardo Leitner Dec. 8, 2017, 3:15 p.m. UTC | #4
On Fri, Dec 08, 2017 at 03:01:31PM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 08 December 2017 14:57
> > 
> > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > From: Xin Long
> > > > Sent: 08 December 2017 13:04
> > > ...
> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > >  		}
> > > >
> > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > -						 0, GFP_KERNEL);
> > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > +						       GFP_KERNEL);
> > >
> > > I know that none of the sctp code is very optimised, but that indirect
> > > call is going to be horrid.
> > 
> > Yeah.. but there is no way to avoid the double derreference
> > considering we only have the asoc pointer in there and we have to
> > reach the contents of the data chunk operations struct, and the .si
> > part is the same as 'stream' part as it's a constant offset.
> ...
> 
> It isn't only the double indirect, the indirect call itself isn't 'fun'.

I meant in this context.

The indirect call is so we don't have to flood the stack with
if (old data chunk fmt) {
	...
} else {
	...
}

So instead of this, we now have some key operations identified and
wrapped up behind this struct, allowing us to abstract whatever data
chunk format it is.

> 
> I think there are other hot paths where you've replaced a sizeof()
> with a ?: clause.
> Caching the result might be much better.

The only new ?: clause I could find this patchset is on patch 12 and
has nothing to do with sizeof().

The sizeof() results are indeed cached, as you can see in patch 4:
+static struct sctp_stream_interleave sctp_stream_interleave_0 = {
+       .data_chunk_len         = sizeof(struct sctp_data_chunk),
and the two helpers on it at the begining of the patch.

  Marcelo
David Laight Dec. 8, 2017, 3:32 p.m. UTC | #5
From: 'Marcelo Ricardo Leitner'
> Sent: 08 December 2017 15:16
> On Fri, Dec 08, 2017 at 03:01:31PM +0000, David Laight wrote:
> > From: Marcelo Ricardo Leitner
> > > Sent: 08 December 2017 14:57
> > >
> > > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > > From: Xin Long
> > > > > Sent: 08 December 2017 13:04
> > > > ...
> > > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > > >  		}
> > > > >
> > > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > > -						 0, GFP_KERNEL);
> > > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > > +						       GFP_KERNEL);
> > > >
> > > > I know that none of the sctp code is very optimised, but that indirect
> > > > call is going to be horrid.
> > >
> > > Yeah.. but there is no way to avoid the double derreference
> > > considering we only have the asoc pointer in there and we have to
> > > reach the contents of the data chunk operations struct, and the .si
> > > part is the same as 'stream' part as it's a constant offset.
> > ...
> >
> > It isn't only the double indirect, the indirect call itself isn't 'fun'.
> 
> I meant in this context.
> 
> The indirect call is so we don't have to flood the stack with
> if (old data chunk fmt) {
> 	...
> } else {
> 	...
> }
> 
> So instead of this, we now have some key operations identified and
> wrapped up behind this struct, allowing us to abstract whatever data
> chunk format it is.

Nothing wrong with:
#define foo(asoc, ...) \
	if (asoc->new_fmt) \
		foo_new(asoc, __VA_LIST__); \
	else \
		foo_old(asoc, __VA_LIST__);

> > I think there are other hot paths where you've replaced a sizeof()
> > with a ?: clause.
> > Caching the result might be much better.
> 
> The only new ?: clause I could find this patchset is on patch 12 and
> has nothing to do with sizeof().
> 
> The sizeof() results are indeed cached, as you can see in patch 4:
> +static struct sctp_stream_interleave sctp_stream_interleave_0 = {
> +       .data_chunk_len         = sizeof(struct sctp_data_chunk),
> and the two helpers on it at the begining of the patch.

I was getting two bits mixed up.
But the code that reads data_chunk_len is following an awful lot of pointers.

	David
Neil Horman Dec. 8, 2017, 3:37 p.m. UTC | #6
On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
> On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > From: Xin Long
> > > Sent: 08 December 2017 13:04
> > ...
> > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > >  				frag |= SCTP_DATA_SACK_IMM;
> > >  		}
> > > 
> > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > -						 0, GFP_KERNEL);
> > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > +						       GFP_KERNEL);
> > 
> > I know that none of the sctp code is very optimised, but that indirect
> > call is going to be horrid.
> 
> Yeah.. but there is no way to avoid the double derreference
> considering we only have the asoc pointer in there and we have to
> reach the contents of the data chunk operations struct, and the .si
> part is the same as 'stream' part as it's a constant offset.
> 
> Due to the for() in there, we could add a variable to store
> asoc->stream.si outside the for and then we can do only a single deref
> inside it. Xin, can you please try and see if the generated code is
> different?
> 
> Other suggestions?
> 
Is it worth replacing the si struct with an index/enum value, and indexing an
array of method pointer structs?  That would save you at least one dereference.

Alternatively you could preform the dereference in two steps (i.e. declare an si
pointer on the stack and set it equal to asoc->stream.si, then deref
si->make_datafrag at call time.  That will at least give the compiler an
opportunity to preload the first pointer.

Neil

>   Marcelo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Marcelo Ricardo Leitner Dec. 8, 2017, 4 p.m. UTC | #7
On Fri, Dec 08, 2017 at 10:37:34AM -0500, Neil Horman wrote:
> On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
> > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > From: Xin Long
> > > > Sent: 08 December 2017 13:04
> > > ...
> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > >  		}
> > > > 
> > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > -						 0, GFP_KERNEL);
> > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > +						       GFP_KERNEL);
> > > 
> > > I know that none of the sctp code is very optimised, but that indirect
> > > call is going to be horrid.
> > 
> > Yeah.. but there is no way to avoid the double derreference
> > considering we only have the asoc pointer in there and we have to
> > reach the contents of the data chunk operations struct, and the .si
> > part is the same as 'stream' part as it's a constant offset.
> > 
> > Due to the for() in there, we could add a variable to store
> > asoc->stream.si outside the for and then we can do only a single deref
> > inside it. Xin, can you please try and see if the generated code is
> > different?
> > 
> > Other suggestions?
> > 
> Is it worth replacing the si struct with an index/enum value, and indexing an
> array of method pointer structs?  That would save you at least one dereference.

Hmmm, maybe, yes. It would be like
sctp_stream_interleave[asoc->stream.si].make_datafrag(...)

Then same goes for pf->af, probably.

> 
> Alternatively you could preform the dereference in two steps (i.e. declare an si
> pointer on the stack and set it equal to asoc->stream.si, then deref
> si->make_datafrag at call time.  That will at least give the compiler an
> opportunity to preload the first pointer.

Yep, that was my 2nd paragraph above :-) but it only works for cases
such as this one.

  Marcelo

> 
> Neil
> 
> >   Marcelo
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Marcelo Ricardo Leitner Dec. 8, 2017, 4:02 p.m. UTC | #8
On Fri, Dec 08, 2017 at 03:32:54PM +0000, David Laight wrote:
> From: 'Marcelo Ricardo Leitner'
> > Sent: 08 December 2017 15:16
> > On Fri, Dec 08, 2017 at 03:01:31PM +0000, David Laight wrote:
> > > From: Marcelo Ricardo Leitner
> > > > Sent: 08 December 2017 14:57
> > > >
> > > > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
> > > > > From: Xin Long
> > > > > > Sent: 08 December 2017 13:04
> > > > > ...
> > > > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
> > > > > >  				frag |= SCTP_DATA_SACK_IMM;
> > > > > >  		}
> > > > > >
> > > > > > -		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
> > > > > > -						 0, GFP_KERNEL);
> > > > > > +		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
> > > > > > +						       GFP_KERNEL);
> > > > >
> > > > > I know that none of the sctp code is very optimised, but that indirect
> > > > > call is going to be horrid.
> > > >
> > > > Yeah.. but there is no way to avoid the double derreference
> > > > considering we only have the asoc pointer in there and we have to
> > > > reach the contents of the data chunk operations struct, and the .si
> > > > part is the same as 'stream' part as it's a constant offset.
> > > ...
> > >
> > > It isn't only the double indirect, the indirect call itself isn't 'fun'.
> > 
> > I meant in this context.
> > 
> > The indirect call is so we don't have to flood the stack with
> > if (old data chunk fmt) {
> > 	...
> > } else {
> > 	...
> > }
> > 
> > So instead of this, we now have some key operations identified and
> > wrapped up behind this struct, allowing us to abstract whatever data
> > chunk format it is.
> 
> Nothing wrong with:
> #define foo(asoc, ...) \
> 	if (asoc->new_fmt) \

Not all function pointers in sctp_stream_interleave have asoc as
a parameter, so we would have to have something like:

#define foo_asoc(asoc, ...) \
	if (asoc->new_fmt) \
		...

#define foo_chunk(chunk, ...) \
	if (chunk->asoc->new_fmt) \
		...

#define foo_ulpq(ulpq, ...) \
	if (ulpq->asoc->new_fmt) \
		...

and we're pretty much back to double deref.
Maybe some reworking on the parameters could alleviate some of these,
but not all.

> 		foo_new(asoc, __VA_LIST__); \
> 	else \
> 		foo_old(asoc, __VA_LIST__);
> 
> > > I think there are other hot paths where you've replaced a sizeof()
> > > with a ?: clause.
> > > Caching the result might be much better.
> > 
> > The only new ?: clause I could find this patchset is on patch 12 and
> > has nothing to do with sizeof().
> > 
> > The sizeof() results are indeed cached, as you can see in patch 4:
> > +static struct sctp_stream_interleave sctp_stream_interleave_0 = {
> > +       .data_chunk_len         = sizeof(struct sctp_data_chunk),
> > and the two helpers on it at the begining of the patch.
> 
> I was getting two bits mixed up.
> But the code that reads data_chunk_len is following an awful lot of pointers.

From path 4:
        max_data = asoc->pathmtu -
                   sctp_sk(asoc->base.sk)->pf->af->net_header_len -
-                  sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
+                  sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream);

You're worried with the double deref in sctp_datachk_len() but on the
line right above it we have 4 derefs. There are several other cases
of double deref in sctp stack even on hot path. Not sure why you're
picking on this one, but ok.

  Marcelo
David Laight Dec. 8, 2017, 4:04 p.m. UTC | #9
From: Marcelo Ricardo Leitner
> Sent: 08 December 2017 16:00
...
> > Is it worth replacing the si struct with an index/enum value, and indexing an
> > array of method pointer structs?  That would save you at least one dereference.
> 
> Hmmm, maybe, yes. It would be like
> sctp_stream_interleave[asoc->stream.si].make_datafrag(...)

If you only expect 2 choices then an if () is likely
to produce better code that the above.

The actual implementation can be hidden inside a #define
or static inline function.

	David
Neil Horman Dec. 8, 2017, 4:08 p.m. UTC | #10
On Fri, Dec 08, 2017 at 04:04:58PM +0000, David Laight wrote:
> From: Marcelo Ricardo Leitner
> > Sent: 08 December 2017 16:00
> ...
> > > Is it worth replacing the si struct with an index/enum value, and indexing an
> > > array of method pointer structs?  That would save you at least one dereference.
> > 
> > Hmmm, maybe, yes. It would be like
> > sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
> 
> If you only expect 2 choices then an if () is likely
> to produce better code that the above.
> 
> The actual implementation can be hidden inside a #define
> or static inline function.
> 
Thats the real question though, will we expect more than two interleaving
strategies?  Currently its a boolean operation so the answer seems like yes, but
is there a possiblity of a biased interleaving, or other worthwhile algorithm?

Neil

> 	David
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-sctp" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Xin Long Dec. 8, 2017, 4:17 p.m. UTC | #11
On Sat, Dec 9, 2017 at 12:00 AM, Marcelo Ricardo Leitner
<marcelo.leitner@gmail.com> wrote:
> On Fri, Dec 08, 2017 at 10:37:34AM -0500, Neil Horman wrote:
>> On Fri, Dec 08, 2017 at 12:56:30PM -0200, Marcelo Ricardo Leitner wrote:
>> > On Fri, Dec 08, 2017 at 02:06:04PM +0000, David Laight wrote:
>> > > From: Xin Long
>> > > > Sent: 08 December 2017 13:04
>> > > ...
>> > > > @@ -264,8 +264,8 @@ struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
>> > > >                                 frag |= SCTP_DATA_SACK_IMM;
>> > > >                 }
>> > > >
>> > > > -               chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
>> > > > -                                                0, GFP_KERNEL);
>> > > > +               chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
>> > > > +                                                      GFP_KERNEL);
>> > >
>> > > I know that none of the sctp code is very optimised, but that indirect
>> > > call is going to be horrid.
>> >
>> > Yeah.. but there is no way to avoid the double derreference
>> > considering we only have the asoc pointer in there and we have to
>> > reach the contents of the data chunk operations struct, and the .si
>> > part is the same as 'stream' part as it's a constant offset.
>> >
>> > Due to the for() in there, we could add a variable to store
>> > asoc->stream.si outside the for and then we can do only a single deref
>> > inside it. Xin, can you please try and see if the generated code is
>> > different?
>> >
>> > Other suggestions?
>> >
>> Is it worth replacing the si struct with an index/enum value, and indexing an
>> array of method pointer structs?  That would save you at least one dereference.
>
> Hmmm, maybe, yes. It would be like
> sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
>
> Then same goes for pf->af, probably.
>
>>
>> Alternatively you could preform the dereference in two steps (i.e. declare an si
>> pointer on the stack and set it equal to asoc->stream.si, then deref
>> si->make_datafrag at call time.  That will at least give the compiler an
>> opportunity to preload the first pointer.
>
> Yep, that was my 2nd paragraph above :-) but it only works for cases
> such as this one.

Now:
  for(N) {
    ...
    chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
     0x000000000000fb58 <+360>: mov    0x848(%r13),%rax  <---- [a]
     0x000000000000fb5f <+367>: movzbl %cl,%ecx
     0x000000000000fb62 <+370>: mov    $0x14000c0,%r8d
     0x000000000000fb68 <+376>: mov    %r12d,%edx
     0x000000000000fb6b <+379>: mov    (%rsp),%rsi
     0x000000000000fb6f <+383>: mov    %r13,%rdi  <=(X)
     0x000000000000fb72 <+386>: callq  *0x8(%rax)  <---- [b]
     0x000000000000fb78 <+392>: mov    %rax,%r15
   }

   ret = N * ([a] + [b])


After using a variable:
  struct sctp_stream_interleave *si;
  ...
  si = asoc->stream.si;
     0x000000000000fb44 <+340>: mov    0x848(%r14),%rax
     0x000000000000fb4e <+350>: mov    %rax,0x20(%rsp) <----- [1]

  for(N) {
    ...
    chunk = si->make_datafrag(asoc, sinfo, len, frag, GFP_KERNEL);
     0x000000000000fb69 <+377>: mov    0x20(%rsp),%rax <----- [2]
     0x000000000000fb6e <+382>: movzbl %cl,%ecx
     0x000000000000fb71 <+385>: mov    $0x14000c0,%r8d
     0x000000000000fb77 <+391>: mov    %r12d,%edx
     0x000000000000fb7a <+394>: mov    (%rsp),%rsi
     0x000000000000fb7e <+398>: mov    0x28(%rsp),%rdi <=(Y)
     0x000000000000fb83 <+403>: callq  *0x8(%rax) <----- [3]
     0x000000000000fb89 <+409>: mov    %rax,%r14
   }

   ret = [1] + N * ([2] + [3])


Another small difference:
  as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
  instead of %r13.

So that's what I can see from the related generated code.
If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
asoc->stream.si->make_datafrag() is even better. No ?
David Laight Dec. 8, 2017, 4:22 p.m. UTC | #12
From: Xin Long

> Sent: 08 December 2017 16:18

> 

...
> >> Alternatively you could preform the dereference in two steps (i.e. declare an si

> >> pointer on the stack and set it equal to asoc->stream.si, then deref

> >> si->make_datafrag at call time.  That will at least give the compiler an

> >> opportunity to preload the first pointer.


You want to save the function pointer itself.

...
> Another small difference:

>   as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,

>   instead of %r13.

> 

> So that's what I can see from the related generated code.

> If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think

> asoc->stream.si->make_datafrag() is even better. No ?


That code must have far too many life local variables.
Otherwise there's be a caller saved register available.

	David
Xin Long Dec. 8, 2017, 5:23 p.m. UTC | #13
On Sat, Dec 9, 2017 at 12:22 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Xin Long
>> Sent: 08 December 2017 16:18
>>
> ...
>> >> Alternatively you could preform the dereference in two steps (i.e. declare an si
>> >> pointer on the stack and set it equal to asoc->stream.si, then deref
>> >> si->make_datafrag at call time.  That will at least give the compiler an
>> >> opportunity to preload the first pointer.
>
> You want to save the function pointer itself.
>
> ...
>> Another small difference:
>>   as you can see, comparing to (X), (Y) is using 0x28(%rsp) in the loop,
>>   instead of %r13.
>>
>> So that's what I can see from the related generated code.
>> If 0x848(%r13) is not worse than 0x28(%rsp) for cpu, I think
>> asoc->stream.si->make_datafrag() is even better. No ?
>
> That code must have far too many life local variables.
> Otherwise there's be a caller saved register available.
>
Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
codes style now ?

For cpu cost,  I think 0x848(%r13) operation must be better than the
generated code of if-else.

For the codes style, comparing to the if-else, I think this one is
more readable. (ignore extendible stuff first, as probably no more
new type of data chunk).
David Laight Dec. 8, 2017, 5:29 p.m. UTC | #14
From: Xin Long ...

> Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or

> codes style now ?

> 

> For cpu cost,  I think 0x848(%r13) operation must be better than the

> generated code of if-else.


Nope - the call xxx(%ryyy) is likely to be a data cache miss and a complete
cpu pipeline stall.

The conditional will be a data cache hit and the code (for one branch)
will be prefetched and speculatively executed.

Some very modern cpu might manage to predict indirect jumps, but for
most it is a full pipeline stall.

	David
Xin Long Dec. 8, 2017, 5:37 p.m. UTC | #15
On Sat, Dec 9, 2017 at 1:29 AM, David Laight <David.Laight@aculab.com> wrote:
> From: Xin Long ...
>> Hi, David, Sorry,  I'm not sure we're worrying about the cpu cost or
>> codes style now ?
>>
>> For cpu cost,  I think 0x848(%r13) operation must be better than the
>> generated code of if-else.
>
> Nope - the call xxx(%ryyy) is likely to be a data cache miss and a complete
> cpu pipeline stall.
>
> The conditional will be a data cache hit and the code (for one branch)
> will be prefetched and speculatively executed.
>
> Some very modern cpu might manage to predict indirect jumps, but for
> most it is a full pipeline stall.
Thanks for the CPU information.

The thing is with if-else can't avoid xxx(%ryyy) in this case, as Marcelo
said above. It seems if-else will just be a extra cost compare to this one.
Marcelo Ricardo Leitner Dec. 8, 2017, 8:37 p.m. UTC | #16
On Fri, Dec 08, 2017 at 11:08:56AM -0500, Neil Horman wrote:
> On Fri, Dec 08, 2017 at 04:04:58PM +0000, David Laight wrote:
> > From: Marcelo Ricardo Leitner
> > > Sent: 08 December 2017 16:00
> > ...
> > > > Is it worth replacing the si struct with an index/enum value, and indexing an
> > > > array of method pointer structs?  That would save you at least one dereference.
> > > 
> > > Hmmm, maybe, yes. It would be like
> > > sctp_stream_interleave[asoc->stream.si].make_datafrag(...)
> > 
> > If you only expect 2 choices then an if () is likely
> > to produce better code that the above.
> > 
> > The actual implementation can be hidden inside a #define
> > or static inline function.
> > 
> Thats the real question though, will we expect more than two interleaving
> strategies?  Currently its a boolean operation so the answer seems like yes, but
> is there a possiblity of a biased interleaving, or other worthwhile algorithm?

For the chunk format, I don't think so. It would require another RFC
update.
For other possibilities on having a 3rd choice in there, I also don't
think so. Stream scheduling is handled apart from it and rx buffer
stuff is being covered by it now, don't see how we could have a 3rd
option without another chunk format.

But that said, I don't think the macro or even inline wrappers are as
clear as the struct with function pointers here and in some cases it
even won't avoid the second deref. I wouldn't like to compromise code
readability and OO because of 1 fetch, specially considering that this
will be barely noticeable.

Neil's idea on using the array of structs and indexing on it is nice.
Saves a deref and keeps the abstractions without inserting too much
noise on it. Plus, it also allows replacing the struct pointer in
sctp_stream with a bit. If then placed before the union in there, we
can save up to the whole pointer. Looks like a good compromise to me.

  Marcelo
diff mbox series

Patch

diff --git a/include/net/sctp/sm.h b/include/net/sctp/sm.h
index 5389ae0..f950186 100644
--- a/include/net/sctp/sm.h
+++ b/include/net/sctp/sm.h
@@ -199,10 +199,9 @@  struct sctp_chunk *sctp_make_cwr(const struct sctp_association *asoc,
 				 const struct sctp_chunk *chunk);
 struct sctp_chunk *sctp_make_idata(const struct sctp_association *asoc,
 				   __u8 flags, int paylen, gfp_t gfp);
-struct sctp_chunk *sctp_make_datafrag_empty(struct sctp_association *asoc,
+struct sctp_chunk *sctp_make_datafrag_empty(const struct sctp_association *asoc,
 					    const struct sctp_sndrcvinfo *sinfo,
-					    int len, const __u8 flags,
-					    __u16 ssn, gfp_t gfp);
+					    int len, __u8 flags, gfp_t gfp);
 struct sctp_chunk *sctp_make_ecne(const struct sctp_association *asoc,
 				  const __u32 lowest_tsn);
 struct sctp_chunk *sctp_make_sack(const struct sctp_association *asoc);
diff --git a/include/net/sctp/stream_interleave.h b/include/net/sctp/stream_interleave.h
new file mode 100644
index 0000000..7b9fa8d
--- /dev/null
+++ b/include/net/sctp/stream_interleave.h
@@ -0,0 +1,44 @@ 
+/* SCTP kernel implementation
+ * (C) Copyright Red Hat Inc. 2017
+ *
+ * These are definitions used by the stream schedulers, defined in RFC
+ * draft ndata (https://tools.ietf.org/html/draft-ietf-tsvwg-sctp-ndata-11)
+ *
+ * This SCTP implementation is free software;
+ * you can redistribute it and/or modify it under the terms of
+ * the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This SCTP implementation  is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; without even the implied
+ *                 ************************
+ * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with GNU CC; see the file COPYING.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Please send any bug reports or fixes you make to the
+ * email addresses:
+ *    lksctp developers <linux-sctp@vger.kernel.org>
+ *
+ * Written or modified by:
+ *   Xin Long <lucien.xin@gmail.com>
+ */
+
+#ifndef __sctp_stream_interleave_h__
+#define __sctp_stream_interleave_h__
+
+struct sctp_stream_interleave {
+	__u16	data_chunk_len;
+	/* (I-)DATA process */
+	struct sctp_chunk *(*make_datafrag)(const struct sctp_association *asoc,
+					    const struct sctp_sndrcvinfo *sinfo,
+					    int len, __u8 flags, gfp_t gfp);
+};
+
+void sctp_stream_interleave_init(struct sctp_stream *stream);
+
+#endif /* __sctp_stream_interleave_h__ */
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 7026a80..96cc898 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -89,6 +89,7 @@  struct sctp_stream;
 #include <net/sctp/tsnmap.h>
 #include <net/sctp/ulpevent.h>
 #include <net/sctp/ulpqueue.h>
+#include <net/sctp/stream_interleave.h>
 
 /* Structures useful for managing bind/connect. */
 
@@ -1389,11 +1390,22 @@  struct sctp_stream {
 			struct sctp_stream_out_ext *rr_next;
 		};
 	};
+	struct sctp_stream_interleave *si;
 };
 
 #define SCTP_STREAM_CLOSED		0x00
 #define SCTP_STREAM_OPEN		0x01
 
+static inline __u16 sctp_datachk_len(const struct sctp_stream *stream)
+{
+	return stream->si->data_chunk_len;
+}
+
+static inline __u16 sctp_datahdr_len(const struct sctp_stream *stream)
+{
+	return stream->si->data_chunk_len - sizeof(struct sctp_chunkhdr);
+}
+
 /* SCTP_GET_ASSOC_STATS counters */
 struct sctp_priv_assoc_stats {
 	/* Maximum observed rto in the association during subsequent
diff --git a/net/sctp/Makefile b/net/sctp/Makefile
index 1ca84a2..54bd9c1 100644
--- a/net/sctp/Makefile
+++ b/net/sctp/Makefile
@@ -14,7 +14,7 @@  sctp-y := sm_statetable.o sm_statefuns.o sm_sideeffect.o \
 	  tsnmap.o bind_addr.o socket.o primitive.o \
 	  output.o input.o debug.o stream.o auth.o \
 	  offload.o stream_sched.o stream_sched_prio.o \
-	  stream_sched_rr.o
+	  stream_sched_rr.o stream_interleave.o
 
 sctp_probe-y := probe.o
 
diff --git a/net/sctp/chunk.c b/net/sctp/chunk.c
index 7f8baa4..62adaaa 100644
--- a/net/sctp/chunk.c
+++ b/net/sctp/chunk.c
@@ -191,7 +191,7 @@  struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 	 */
 	max_data = asoc->pathmtu -
 		   sctp_sk(asoc->base.sk)->pf->af->net_header_len -
-		   sizeof(struct sctphdr) - sizeof(struct sctp_data_chunk);
+		   sizeof(struct sctphdr) - sctp_datachk_len(&asoc->stream);
 	max_data = SCTP_TRUNC4(max_data);
 
 	/* If the the peer requested that we authenticate DATA chunks
@@ -264,8 +264,8 @@  struct sctp_datamsg *sctp_datamsg_from_user(struct sctp_association *asoc,
 				frag |= SCTP_DATA_SACK_IMM;
 		}
 
-		chunk = sctp_make_datafrag_empty(asoc, sinfo, len, frag,
-						 0, GFP_KERNEL);
+		chunk = asoc->stream.si->make_datafrag(asoc, sinfo, len, frag,
+						       GFP_KERNEL);
 		if (!chunk) {
 			err = -ENOMEM;
 			goto errout;
diff --git a/net/sctp/sm_make_chunk.c b/net/sctp/sm_make_chunk.c
index b969397..23a7313 100644
--- a/net/sctp/sm_make_chunk.c
+++ b/net/sctp/sm_make_chunk.c
@@ -721,38 +721,31 @@  struct sctp_chunk *sctp_make_ecne(const struct sctp_association *asoc,
 /* Make a DATA chunk for the given association from the provided
  * parameters.  However, do not populate the data payload.
  */
-struct sctp_chunk *sctp_make_datafrag_empty(struct sctp_association *asoc,
+struct sctp_chunk *sctp_make_datafrag_empty(const struct sctp_association *asoc,
 					    const struct sctp_sndrcvinfo *sinfo,
-					    int data_len, __u8 flags, __u16 ssn,
-					    gfp_t gfp)
+					    int len, __u8 flags, gfp_t gfp)
 {
 	struct sctp_chunk *retval;
 	struct sctp_datahdr dp;
-	int chunk_len;
 
 	/* We assign the TSN as LATE as possible, not here when
 	 * creating the chunk.
 	 */
-	dp.tsn = 0;
+	memset(&dp, 0, sizeof(dp));
+	dp.ppid = sinfo->sinfo_ppid;
 	dp.stream = htons(sinfo->sinfo_stream);
-	dp.ppid   = sinfo->sinfo_ppid;
 
 	/* Set the flags for an unordered send.  */
-	if (sinfo->sinfo_flags & SCTP_UNORDERED) {
+	if (sinfo->sinfo_flags & SCTP_UNORDERED)
 		flags |= SCTP_DATA_UNORDERED;
-		dp.ssn = 0;
-	} else
-		dp.ssn = htons(ssn);
 
-	chunk_len = sizeof(dp) + data_len;
-	retval = sctp_make_data(asoc, flags, chunk_len, gfp);
+	retval = sctp_make_data(asoc, flags, sizeof(dp) + len, gfp);
 	if (!retval)
-		goto nodata;
+		return NULL;
 
 	retval->subh.data_hdr = sctp_addto_chunk(retval, sizeof(dp), &dp);
 	memcpy(&retval->sinfo, sinfo, sizeof(struct sctp_sndrcvinfo));
 
-nodata:
 	return retval;
 }
 
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 76ea66b..8370e6c 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -167,6 +167,7 @@  int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	sched->init(stream);
 
 in:
+	sctp_stream_interleave_init(stream);
 	if (!incnt)
 		goto out;
 
diff --git a/net/sctp/stream_interleave.c b/net/sctp/stream_interleave.c
new file mode 100644
index 0000000..397c3c1
--- /dev/null
+++ b/net/sctp/stream_interleave.c
@@ -0,0 +1,79 @@ 
+/* SCTP kernel implementation
+ * (C) Copyright Red Hat Inc. 2017
+ *
+ * This file is part of the SCTP kernel implementation
+ *
+ * These functions manipulate sctp stream queue/scheduling.
+ *
+ * This SCTP implementation is free software;
+ * you can redistribute it and/or modify it under the terms of
+ * the GNU General Public License as published by
+ * the Free Software Foundation; either version 2, or (at your option)
+ * any later version.
+ *
+ * This SCTP implementation is distributed in the hope that it
+ * will be useful, but WITHOUT ANY WARRANTY; without even the implied
+ *                 ************************
+ * warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+ * See the GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with GNU CC; see the file COPYING.  If not, see
+ * <http://www.gnu.org/licenses/>.
+ *
+ * Please send any bug reports or fixes you make to the
+ * email addresched(es):
+ *    lksctp developers <linux-sctp@vger.kernel.org>
+ *
+ * Written or modified by:
+ *    Xin Long <lucien.xin@gmail.com>
+ */
+
+#include <net/sctp/sctp.h>
+#include <net/sctp/sm.h>
+#include <linux/sctp.h>
+
+static struct sctp_chunk *sctp_make_idatafrag_empty(
+					const struct sctp_association *asoc,
+					const struct sctp_sndrcvinfo *sinfo,
+					int len, __u8 flags, gfp_t gfp)
+{
+	struct sctp_chunk *retval;
+	struct sctp_idatahdr dp;
+
+	memset(&dp, 0, sizeof(dp));
+	dp.stream = htons(sinfo->sinfo_stream);
+
+	if (sinfo->sinfo_flags & SCTP_UNORDERED)
+		flags |= SCTP_DATA_UNORDERED;
+
+	retval = sctp_make_idata(asoc, flags, sizeof(dp) + len, gfp);
+	if (!retval)
+		return NULL;
+
+	retval->subh.idata_hdr = sctp_addto_chunk(retval, sizeof(dp), &dp);
+	memcpy(&retval->sinfo, sinfo, sizeof(struct sctp_sndrcvinfo));
+
+	return retval;
+}
+
+static struct sctp_stream_interleave sctp_stream_interleave_0 = {
+	.data_chunk_len		= sizeof(struct sctp_data_chunk),
+	/* DATA process functions */
+	.make_datafrag		= sctp_make_datafrag_empty,
+};
+
+static struct sctp_stream_interleave sctp_stream_interleave_1 = {
+	.data_chunk_len		= sizeof(struct sctp_idata_chunk),
+	/* I-DATA process functions */
+	.make_datafrag		= sctp_make_idatafrag_empty,
+};
+
+void sctp_stream_interleave_init(struct sctp_stream *stream)
+{
+	struct sctp_association *asoc;
+
+	asoc = container_of(stream, struct sctp_association, stream);
+	stream->si = asoc->intl_enable ? &sctp_stream_interleave_1
+				       : &sctp_stream_interleave_0;
+}