
[iproute2,net-next,v2,3/4] ss: Buffer raw fields first, then render them as a table

Message ID fef5eed38c3f47f334136fa2d857b6478d497d03.1513039237.git.sbrivio@redhat.com
State Accepted, archived
Delegated to: stephen hemminger
Series: Abstract columns, properly space and wrap fields

Commit Message

Stefano Brivio Dec. 12, 2017, 12:46 a.m. UTC
This allows us to measure the maximum field length for each
column before printing fields and will permit us to apply
optimal field spacing and distribution. Structure of the output
buffer with chunked allocation is described in comments.

Output is still unchanged; the original spacing is used.
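
For illustration only -- this is not the ss code itself (ss stores
variable-length, length-prefixed tokens in chained chunks, as described
in the patch below), but a minimal standalone sketch of the
measure-then-render idea; the table contents and sizes are made up:

#include <stdio.h>
#include <string.h>

#define ROWS 3
#define COLS 3

int main(void)
{
	const char *rows[ROWS][COLS] = {
		{ "tcp", "ESTAB",  "192.0.2.1:22" },
		{ "udp", "UNCONN", "[::1]:323" },
		{ "tcp", "LISTEN", "0.0.0.0:80" },
	};
	int width[COLS] = { 0 };
	int r, c, len;

	/* First pass: find the longest entry in each column */
	for (r = 0; r < ROWS; r++) {
		for (c = 0; c < COLS; c++) {
			len = (int)strlen(rows[r][c]);
			if (len > width[c])
				width[c] = len;
		}
	}

	/* Second pass: render every row with per-column widths */
	for (r = 0; r < ROWS; r++) {
		for (c = 0; c < COLS; c++)
			printf("%-*s ", width[c], rows[r][c]);
		printf("\n");
	}

	return 0;
}

On top of this, the actual patch has to cope with hidden columns,
delimiters and line wrapping, which is what the chunked buffer and
render() machinery below implement.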

Simulating one million sockets with -tul options, by simply
modifying main() to loop 50,000 times over the *_show() functions
with 10 UDP and 10 TCP sockets open, buffering the whole output,
rendering it at the end and throwing it away, doesn't show
significant changes in execution time on my laptop with an Intel
i7-6600U CPU:

- before this patch:
$ time ./ss -tul > /dev/null
real	0m29.899s
user	0m2.017s
sys	0m27.801s

- after this patch:
$ time ./ss -tul > /dev/null
real	0m29.827s
user	0m1.942s
sys	0m27.812s

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
---
v2: rebase after conflict with 00ac78d39c29 ("ss: print tcpi_rcv_ssthresh")

 misc/ss.c | 271 +++++++++++++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 225 insertions(+), 46 deletions(-)

Comments

Eric Dumazet Feb. 13, 2019, 12:42 a.m. UTC | #1
On 12/11/2017 04:46 PM, Stefano Brivio wrote:
> This allows us to measure the maximum field length for each
> column before printing fields and will permit us to apply
> optimal field spacing and distribution. Structure of the output
> buffer with chunked allocation is described in comments.
> 
> Output is still unchanged, original spacing is used.
> 
> Running over one million sockets with -tul options by simply
> modifying main() to loop 50,000 times over the *_show()
> functions, buffering the whole output and rendering it at the
> end, with 10 UDP sockets, 10 TCP sockets, while throwing
> output away, doesn't show significant changes in execution time
> on my laptop with an Intel i7-6600U CPU:
> 
> - before this patch:
> $ time ./ss -tul > /dev/null
> real	0m29.899s
> user	0m2.017s
> sys	0m27.801s
> 
> - after this patch:
> $ time ./ss -tul > /dev/null
> real	0m29.827s
> user	0m1.942s
> sys	0m27.812s
> 

I do not get it.

"ss -emoi " uses almost 1KB per socket.

10,000,000 sockets -> we need about 10GB of memory  ???

This is a serious regression.
Stefano Brivio Feb. 13, 2019, 8:37 a.m. UTC | #2
On Tue, 12 Feb 2019 16:42:04 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> I do not get it.
> 
> "ss -emoi " uses almost 1KB per socket.
> 
> 10,000,000 sockets -> we need about 10GB of memory  ???
> 
> This is a serious regression.

I guess this is rather subjective: the worst case I considered back then
was the output of 'ss -tei0' (less than 500 bytes) for one million
sockets, which gives 500M of memory, which should in turn be fine on a
machine handling one million sockets.

Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
curiosity: how are you going to process that output? Would JSON help?),
I see two easy options to solve this:

1. flush the output every time we reach a given buffer size (1M
   perhaps). This might make the resulting blocks slightly unaligned,
   with occasional loss of readability on lines occurring every 1k to
   10k sockets approximately, even though after 1k sockets column sizes
   won't change much (it looks anyway better than the original), and I
   don't expect anybody to actually scroll that output

2. add a switch for unbuffered output, but then you need to remember to
   pass it manually, and the whole output would be as bad as the
   original in case you need the switch.

I'd rather go with 1., it's easy to implement (we already have partial
flushing with '--events') and it looks like a good compromise on
usability. Thoughts?
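
To sketch what 1. could look like (purely illustrative: it assumes a
byte counter bumped whenever buf_chunk_new() allocates a chunk, reuses
render() from this patch, and maybe_flush() and the 1M threshold are
made-up names):

#define RENDER_THRESHOLD (1024 * 1024)

static size_t buffer_allocated;	/* assumed: bumped by BUF_CHUNK in buf_chunk_new() */

/* Called after each complete line: render and free what we have so far
 * once the buffer grows too large, accepting that column widths may
 * differ slightly from one flushed block to the next.
 */
static void maybe_flush(void)
{
	if (buffer_allocated < RENDER_THRESHOLD)
		return;

	render();		/* prints buffered tokens and frees all chunks */
	buffer_allocated = 0;
}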
Stephen Hemminger Feb. 13, 2019, 4:51 p.m. UTC | #3
On Wed, 13 Feb 2019 09:37:11 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> On Tue, 12 Feb 2019 16:42:04 -0800
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > I do not get it.
> > 
> > "ss -emoi " uses almost 1KB per socket.
> > 
> > 10,000,000 sockets -> we need about 10GB of memory  ???
> > 
> > This is a serious regression.  
> 
> I guess this is rather subjective: the worst case I considered back then
> was the output of 'ss -tei0' (less than 500 bytes) for one million
> sockets, which gives 500M of memory, which should in turn be fine on a
> machine handling one million sockets.
> 
> Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> curiosity: how are you going to process that output? Would JSON help?),
> I see two easy options to solve this:
> 
> 1. flush the output every time we reach a given buffer size (1M
>    perhaps). This might make the resulting blocks slightly unaligned,
>    with occasional loss of readability on lines occurring every 1k to
>    10k sockets approximately, even though after 1k sockets column sizes
>    won't change much (it looks anyway better than the original), and I
>    don't expect anybody to actually scroll that output
> 
> 2. add a switch for unbuffered output, but then you need to remember to
>    pass it manually, and the whole output would be as bad as the
>    original in case you need the switch.
> 
> I'd rather go with 1., it's easy to implement (we already have partial
> flushing with '--events') and it looks like a good compromise on
> usability. Thoughts?
> 

I agree with Eric. The benefits of buffering are not worth it.
Let's just choose a reasonable field width; if something is too big, columns won't line up,
which is not a big deal.

Unless you come up with a better solution, I am going to revert this.
Stefano Brivio Feb. 13, 2019, 5:22 p.m. UTC | #4
On Wed, 13 Feb 2019 08:51:01 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> On Wed, 13 Feb 2019 09:37:11 +0100
> Stefano Brivio <sbrivio@redhat.com> wrote:
> 
> > On Tue, 12 Feb 2019 16:42:04 -0800
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >   
> > > I do not get it.
> > > 
> > > "ss -emoi " uses almost 1KB per socket.
> > > 
> > > 10,000,000 sockets -> we need about 10GB of memory  ???
> > > 
> > > This is a serious regression.    
> > 
> > I guess this is rather subjective: the worst case I considered back then
> > was the output of 'ss -tei0' (less than 500 bytes) for one million
> > sockets, which gives 500M of memory, which should in turn be fine on a
> > machine handling one million sockets.
> > 
> > Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> > curiosity: how are you going to process that output? Would JSON help?),
> > I see two easy options to solve this:
> > 
> > 1. flush the output every time we reach a given buffer size (1M
> >    perhaps). This might make the resulting blocks slightly unaligned,
> >    with occasional loss of readability on lines occurring every 1k to
> >    10k sockets approximately, even though after 1k sockets column sizes
> >    won't change much (it looks anyway better than the original), and I
> >    don't expect anybody to actually scroll that output
> > 
> > 2. add a switch for unbuffered output, but then you need to remember to
> >    pass it manually, and the whole output would be as bad as the
> >    original in case you need the switch.
> > 
> > I'd rather go with 1., it's easy to implement (we already have partial
> > flushing with '--events') and it looks like a good compromise on
> > usability. Thoughts?
> >   
> I agree with Eric. The benefits of buffering are not worth it.
> Let's just choose a reasonable field width; if something is too big, columns won't line up,
> which is not a big deal.

That's how it was before, and we couldn't even get fields aligned with
TCP and UDP sockets in an 80-column terminal. See examples at:
https://patchwork.ozlabs.org/cover/847301/.

I tried, but I think it's impossible to find a "reasonable" field
width, especially when you mix a number of socket types.

> Unless you come up with a better solution, I am going to revert this.

That's why I asked for feedback about my proposals 1. and 2. above.
I'll go for 1. then.
Eric Dumazet Feb. 13, 2019, 5:31 p.m. UTC | #5
On 02/13/2019 12:37 AM, Stefano Brivio wrote:
> On Tue, 12 Feb 2019 16:42:04 -0800
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
>> I do not get it.
>>
>> "ss -emoi " uses almost 1KB per socket.
>>
>> 10,000,000 sockets -> we need about 10GB of memory  ???
>>
>> This is a serious regression.
> 
> I guess this is rather subjective: the worst case I considered back then
> was the output of 'ss -tei0' (less than 500 bytes) for one million
> sockets, which gives 500M of memory, which should in turn be fine on a
> machine handling one million sockets.
> 
> Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> curiosity: how are you going to process that output? Would JSON help?),
> I see two easy options to solve this:


ss -temoi | parser (written in shell or awk or whatever...)

This is a use case; I just got bitten because using the ss command
actually OOMed my container while trying to debug a busy GFE.

The host itself can have 10,000,000 TCP sockets, but usually sysadmin shells
run in a container with no more than 500 MB available. 

Otherwise, it would be too easy for a buggy program to OOM the whole machine
and have angry customers.

> 
> 1. flush the output every time we reach a given buffer size (1M
>    perhaps). This might make the resulting blocks slightly unaligned,
>    with occasional loss of readability on lines occurring every 1k to
>    10k sockets approximately, even though after 1k sockets column sizes
>    won't change much (it looks anyway better than the original), and I
>    don't expect anybody to actually scroll that output
> 
> 2. add a switch for unbuffered output, but then you need to remember to
>    pass it manually, and the whole output would be as bad as the
>    original in case you need the switch.
> 
> I'd rather go with 1., it's easy to implement (we already have partial
> flushing with '--events') and it looks like a good compromise on
> usability. Thoughts?
> 

1 seems fine, but a switch for 'please do not try to format' would be fine.

I wonder why we try to 'format' when stdout is a pipe or a regular file.
Eric Dumazet Feb. 13, 2019, 5:32 p.m. UTC | #6
On 02/13/2019 09:22 AM, Stefano Brivio wrote:

> That's why I asked for feedback about my proposals 1. and 2. above.
> I'll go for 1. then.
> 

If stdout is a regular file or a pipe, just do not try to align things?
Stefano Brivio Feb. 13, 2019, 5:38 p.m. UTC | #7
On Wed, 13 Feb 2019 09:31:03 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On 02/13/2019 12:37 AM, Stefano Brivio wrote:
> > On Tue, 12 Feb 2019 16:42:04 -0800
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >   
> >> I do not get it.
> >>
> >> "ss -emoi " uses almost 1KB per socket.
> >>
> >> 10,000,000 sockets -> we need about 10GB of memory  ???
> >>
> >> This is a serious regression.  
> > 
> > I guess this is rather subjective: the worst case I considered back then
> > was the output of 'ss -tei0' (less than 500 bytes) for one million
> > sockets, which gives 500M of memory, which should in turn be fine on a
> > machine handling one million sockets.
> > 
> > Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> > curiosity: how are you going to process that output? Would JSON help?),
> > I see two easy options to solve this:  
> 
> 
> ss -temoi | parser (written in shell or awk or whatever...)
> 
> This is a use case, I just got bitten because using ss command
> actually OOM my container, while trying to debug a busy GFE.
> 
> The host itself can have 10,000,000 TCP sockets, but usually sysadmin shells
> run in a container with no more than 500 MB available. 
> 
> Otherwise, it would be too easy for a buggy program to OOM the whole machine
> and have angry customers.

Ouch, I see.

> > 
> > 1. flush the output every time we reach a given buffer size (1M
> >    perhaps). This might make the resulting blocks slightly unaligned,
> >    with occasional loss of readability on lines occurring every 1k to
> >    10k sockets approximately, even though after 1k sockets column sizes
> >    won't change much (it looks anyway better than the original), and I
> >    don't expect anybody to actually scroll that output
> > 
> > 2. add a switch for unbuffered output, but then you need to remember to
> >    pass it manually, and the whole output would be as bad as the
> >    original in case you need the switch.
> > 
> > I'd rather go with 1., it's easy to implement (we already have partial
> > flushing with '--events') and it looks like a good compromise on
> > usability. Thoughts?
> >   
> 
> 1 seems fine, but a switch for 'please do not try to format' would be fine.

Let me try with 1. first then -- we already have a huge number of
switches.

> I wonder why we try to 'format' when stdout is a pipe or a regular file .

As stupid as it might sound, I just didn't think of fixing that :)

What would you suggest in that case, single whitespaces? Tabs?
Eric Dumazet Feb. 13, 2019, 6:01 p.m. UTC | #8
On 02/13/2019 09:38 AM, Stefano Brivio wrote:
>
> What would you suggest in that case, single whitespaces? Tabs?
> 

We have spaces at the moment, I would leave them.
Stefano Brivio Feb. 13, 2019, 9:17 p.m. UTC | #9
On Wed, 13 Feb 2019 09:31:03 -0800
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> On 02/13/2019 12:37 AM, Stefano Brivio wrote:
> > On Tue, 12 Feb 2019 16:42:04 -0800
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> >   
> >> I do not get it.
> >>
> >> "ss -emoi " uses almost 1KB per socket.
> >>
> >> 10,000,000 sockets -> we need about 10GB of memory  ???
> >>
> >> This is a serious regression.  
> > 
> > I guess this is rather subjective: the worst case I considered back then
> > was the output of 'ss -tei0' (less than 500 bytes) for one million
> > sockets, which gives 500M of memory, which should in turn be fine on a
> > machine handling one million sockets.
> > 
> > Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> > curiosity: how are you going to process that output? Would JSON help?),
> > I see two easy options to solve this:  
> 
> 
> ss -temoi | parser (written in shell or awk or whatever...)
> 
> This is a use case, I just got bitten because using ss command
> actually OOM my container, while trying to debug a busy GFE.
> 
> The host itself can have 10,000,000 TCP sockets, but usually sysadmin shells
> run in a container with no more than 500 MB available. 
> 
> Otherwise, it would be too easy for a buggy program to OOM the whole machine
> and have angry customers.
> 
> > 
> > 1. flush the output every time we reach a given buffer size (1M
> >    perhaps). This might make the resulting blocks slightly unaligned,
> >    with occasional loss of readability on lines occurring every 1k to
> >    10k sockets approximately, even though after 1k sockets column sizes
> >    won't change much (it looks anyway better than the original), and I
> >    don't expect anybody to actually scroll that output
> > 
> > 2. add a switch for unbuffered output, but then you need to remember to
> >    pass it manually, and the whole output would be as bad as the
> >    original in case you need the switch.
> > 
> > I'd rather go with 1., it's easy to implement (we already have partial
> > flushing with '--events') and it looks like a good compromise on
> > usability. Thoughts?
> >   
> 
> 1 seems fine, but a switch for 'please do not try to format' would be fine.
> 
> I wonder why we try to 'format' when stdout is a pipe or a regular file .

On second thought: what about | less, or | grep [ports],
or > readable.log? I guess those might also be rather common use cases --
what do you think?

I'm tempted to skip this for the moment and just go with option 1.
Stephen Hemminger Feb. 13, 2019, 9:55 p.m. UTC | #10
On Wed, 13 Feb 2019 22:17:16 +0100
Stefano Brivio <sbrivio@redhat.com> wrote:

> On Wed, 13 Feb 2019 09:31:03 -0800
> Eric Dumazet <eric.dumazet@gmail.com> wrote:
> 
> > On 02/13/2019 12:37 AM, Stefano Brivio wrote:  
> > > On Tue, 12 Feb 2019 16:42:04 -0800
> > > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > >     
> > >> I do not get it.
> > >>
> > >> "ss -emoi " uses almost 1KB per socket.
> > >>
> > >> 10,000,000 sockets -> we need about 10GB of memory  ???
> > >>
> > >> This is a serious regression.    
> > > 
> > > I guess this is rather subjective: the worst case I considered back then
> > > was the output of 'ss -tei0' (less than 500 bytes) for one million
> > > sockets, which gives 500M of memory, which should in turn be fine on a
> > > machine handling one million sockets.
> > > 
> > > Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> > > curiosity: how are you going to process that output? Would JSON help?),
> > > I see two easy options to solve this:    
> > 
> > 
> > ss -temoi | parser (written in shell or awk or whatever...)
> > 
> > This is a use case, I just got bitten because using ss command
> > actually OOM my container, while trying to debug a busy GFE.
> > 
> > The host itself can have 10,000,000 TCP sockets, but usually sysadmin shells
> > run in a container with no more than 500 MB available. 
> > 
> > Otherwise, it would be too easy for a buggy program to OOM the whole machine
> > and have angry customers.
> >   
> > > 
> > > 1. flush the output every time we reach a given buffer size (1M
> > >    perhaps). This might make the resulting blocks slightly unaligned,
> > >    with occasional loss of readability on lines occurring every 1k to
> > >    10k sockets approximately, even though after 1k sockets column sizes
> > >    won't change much (it looks anyway better than the original), and I
> > >    don't expect anybody to actually scroll that output
> > > 
> > > 2. add a switch for unbuffered output, but then you need to remember to
> > >    pass it manually, and the whole output would be as bad as the
> > >    original in case you need the switch.
> > > 
> > > I'd rather go with 1., it's easy to implement (we already have partial
> > > flushing with '--events') and it looks like a good compromise on
> > > usability. Thoughts?
> > >     
> > 
> > 1 seems fine, but a switch for 'please do not try to format' would be fine.
> > 
> > I wonder why we try to 'format' when stdout is a pipe or a regular file .  
> 
> On a second thought: what about | less, or | grep [ports],
> or > readable.log? I guess those might also be rather common use cases,
> what do you think?
> 
> I'm tempted to skip this for the moment and just go with option 1.
> 

What I would favor:
	* use big enough columns that for the common case everything lines up fine
	* if a column is too wide just print that element wider (which is what print %Ns does)
and
	* add json output for programs that want to parse
	* use print_uint etc for that

The buffering patch (in iproute2-next) can/will be reverted.
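
As a purely hypothetical sketch of the print_uint/JSON route (assuming
ss were converted to the shared json_print helpers used elsewhere in
iproute2; show_one_socket(), the key names and the formats are only
illustrative, not existing ss code):

#include "json_print.h"

/* Hypothetical: emit one field set either as text or as JSON, depending
 * on whether the dump was wrapped in new_json_obj(json)/delete_json_obj().
 */
static void show_one_socket(const char *state, unsigned int recvq)
{
	open_json_object(NULL);
	print_string(PRINT_ANY, "state", "%-10s ", state);
	print_uint(PRINT_ANY, "recv_q", "%-7u ", recvq);
	close_json_object();
}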
Stefano Brivio Feb. 13, 2019, 10:20 p.m. UTC | #11
On Wed, 13 Feb 2019 13:55:34 -0800
Stephen Hemminger <stephen@networkplumber.org> wrote:

> What I would favor:
> 	* use big enough columns that for the common case everything lines up fine
> 	* if a column is too wide just print that element wider (which is what print %Ns does)

This is very close to what was done before, but as soon as you mix,
say, UNIX sockets with TCP sockets, "big enough" columns typically make
output for TCP sockets unreadable.

With buffering, instead, I can decide that a line split is needed, and
keep fields aligned no matter what.

> and
> 	* add json output for programs that want to parse
> 	* use print_uint etc for that

Sure, I think we all agree with this, but it's not going to be quick to
implement (even though it's perhaps a bit easier with abstracted columns
and buffering). Eric reported a problem and I'm trying to fix it
quickly.

> The buffering patch (in iproute2-next) can/will be reverted.

I think it received generally good feedback (also from users, later on)
and this is the first report of a serious issue -- it's also an issue
which looks easy to fix (I'm half way through that by now).

By the way, this patch was merged in iproute2 more than one year ago
(December 2017, by you).
Phil Sutter Feb. 13, 2019, 11:39 p.m. UTC | #12
Hi Stephen,

On Wed, Feb 13, 2019 at 01:55:34PM -0800, Stephen Hemminger wrote:
> On Wed, 13 Feb 2019 22:17:16 +0100
> Stefano Brivio <sbrivio@redhat.com> wrote:
> 
> > On Wed, 13 Feb 2019 09:31:03 -0800
> > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > 
> > > On 02/13/2019 12:37 AM, Stefano Brivio wrote:  
> > > > On Tue, 12 Feb 2019 16:42:04 -0800
> > > > Eric Dumazet <eric.dumazet@gmail.com> wrote:
> > > >     
> > > >> I do not get it.
> > > >>
> > > >> "ss -emoi " uses almost 1KB per socket.
> > > >>
> > > >> 10,000,000 sockets -> we need about 10GB of memory  ???
> > > >>
> > > >> This is a serious regression.    
> > > > 
> > > > I guess this is rather subjective: the worst case I considered back then
> > > > was the output of 'ss -tei0' (less than 500 bytes) for one million
> > > > sockets, which gives 500M of memory, which should in turn be fine on a
> > > > machine handling one million sockets.
> > > > 
> > > > Now, if 'ss -emoi' on 10 million sockets is an actual use case (out of
> > > > curiosity: how are you going to process that output? Would JSON help?),
> > > > I see two easy options to solve this:    
> > > 
> > > 
> > > ss -temoi | parser (written in shell or awk or whatever...)
> > > 
> > > This is a use case, I just got bitten because using ss command
> > > actually OOM my container, while trying to debug a busy GFE.
> > > 
> > > The host itself can have 10,000,000 TCP sockets, but usually sysadmin shells
> > > run in a container with no more than 500 MB available. 
> > > 
> > > Otherwise, it would be too easy for a buggy program to OOM the whole machine
> > > and have angry customers.
> > >   
> > > > 
> > > > 1. flush the output every time we reach a given buffer size (1M
> > > >    perhaps). This might make the resulting blocks slightly unaligned,
> > > >    with occasional loss of readability on lines occurring every 1k to
> > > >    10k sockets approximately, even though after 1k sockets column sizes
> > > >    won't change much (it looks anyway better than the original), and I
> > > >    don't expect anybody to actually scroll that output
> > > > 
> > > > 2. add a switch for unbuffered output, but then you need to remember to
> > > >    pass it manually, and the whole output would be as bad as the
> > > >    original in case you need the switch.
> > > > 
> > > > I'd rather go with 1., it's easy to implement (we already have partial
> > > > flushing with '--events') and it looks like a good compromise on
> > > > usability. Thoughts?
> > > >     
> > > 
> > > 1 seems fine, but a switch for 'please do not try to format' would be fine.
> > > 
> > > I wonder why we try to 'format' when stdout is a pipe or a regular file .  
> > 
> > On a second thought: what about | less, or | grep [ports],
> > or > readable.log? I guess those might also be rather common use cases,
> > what do you think?
> > 
> > I'm tempted to skip this for the moment and just go with option 1.
> > 
> 
> What I would favor:
> 	* use big enough columns that for the common case everything lines up fine
> 	* if a column is too wide just print that element wider (which is what print %Ns does)

This is pretty much the situation Stefano attempted to improve, minus
scaling the columns to the maximum terminal width. ss output formatting
being quirky and unreadable with either small or large terminals was the
number one reason I have heard so far for why people prefer netstat.

> and
> 	* add json output for programs that want to parse
> 	* use print_uint etc for that

For Eric's use-case, skipping any buffering and tabular output if stdout
is not a TTY suffices. In fact, iproute2 does this already for colored
output (see check_enable_color() for reference).
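
A check along those lines would be tiny. Illustrative sketch only:
output_is_tty() is a made-up name, and the callers in ss would still
need to be wired up; isatty() is plain POSIX:

#include <unistd.h>

/* Sketch: buffer and align columns only when stdout is a terminal;
 * for pipes and regular files, print fields as they are produced.
 */
static int output_is_tty(void)
{
	return isatty(STDOUT_FILENO);
}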

Adding JSON output support everywhere is a nice feature when it comes to
scripting, but it won't help console users. Unless you expect CLI
frontends to come turning that JSON into human-readable output.

IMHO, JSON output wouldn't even help in this case - unless Eric indeed
prefers to write/use a JSON parser for his analysis instead of something
along the lines of 'ss | grep'.

> The buffering patch (in iproute2-next) can/will be reverted.

It's not fair to claim that despite Stefano's commitment to fix the
reported issues. His ss output rewrite has been there since v4.15.0 and,
according to git history, it needed only two fixes so far. I've had
one-liners which required more follow-ups than that! Also, we're still
discovering issues introduced by all the jsonify patches. Allowing
people to get things right not on the first try but after a few attempts
is important. If you want to revert something, start with features which
have a fundamental design issue in the exact situation they tried to
improve, like the MSG_PEEK | MSG_TRUNC thing Hangbin and I wrote.

Thanks, Phil
David Ahern Feb. 13, 2019, 11:47 p.m. UTC | #13
On 2/13/19 4:39 PM, Phil Sutter wrote:
>> What I would favor:
>> 	* use big enough columns that for the common case everything lines up fine
>> 	* if a column is too wide just print that element wider (which is what print %Ns does)
> 
> This is pretty much the situation Stefano attempted to improve, minus
> scaling the columns to max terminal width. ss output formatting being
> quirky and unreadable with either small or large terminals was the
> number one reason I heard so far why people prefer netstat.

+1.

Prior to Stefano's change, ss was a PITA to read in an xterm. I for one
would run the command and then have to adjust the terminal to get it to
display in an actually readable format.

> 
>> and
>> 	* add json output for programs that want to parse
>> 	* use print_uint etc for that
> 
> For Eric's use-case, skipping any buffering and tabular output if stdout
> is not a TTY suffices. In fact, iproute2 does this already for colored
> output (see check_enable_color() for reference).
> 
> Adding JSON output support everywhere is a nice feature when it comes to
> scripting, but it won't help console users. Unless you expect CLI
> frontends to come turning that JSON into human-readable output.
> 
> IMHO, JSON output wouldn't even help in this case - unless Eric indeed
> prefers to write/use a JSON parser for his analysis instead of something
> along 'ss | grep'.

I agree. JSON has its uses; console/xterm output for humans is not one
of them, and piping into something like jq to selectively pick columns
is not a user-friendly solution.

> 
>> The buffering patch (in iproute2-next) can/will be reverted.
> 
> It's not fair to claim that despite Stefano's commitment to fix the
> reported issues. His ss output rewrite is there since v4.15.0 and
> according to git history it needed only two fixes so far. I've had
> one-liners which required more follow-ups than that! Also, we're still
> discovering issues introduced by all the jsonify patches. Allowing for
> people to get things right not the first time but after a few tries is
> important. If you want to revert something, start with features which
> have a fundamental design issue in the exact situation they tried to
> improve, like the MSG_PEEK | MSG_TRUNC thing Hangbin and me wrote.

I was just looking at the overhead of that. While it is deceiving to
see twice as many recvmsg calls as you expect, the overhead of the peek
in reading 700k+ routes is on the order of 3% with the 32k minimum
buffer size. The true overhead of the dump functions for ip is the
device index to name mapping (just like the overhead of a batch is the
name to index mapping). I will send a v2 of my patches soon.
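
For context, a simplified sketch of the peek-then-read idiom being
measured here (not the actual libnetlink code; recv_sized() is a made-up
name and error handling and buffer reuse are simplified):

#include <stdlib.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Peek with MSG_TRUNC to learn the size of the pending message, grow the
 * buffer if needed, then receive the message for real: two recvmsg()
 * calls per message, which is the overhead discussed above.
 */
static ssize_t recv_sized(int fd, char **buf, size_t *len)
{
	struct iovec iov = { .iov_base = *buf, .iov_len = *len };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	ssize_t want = recvmsg(fd, &msg, MSG_PEEK | MSG_TRUNC);

	if (want < 0)
		return want;

	if ((size_t)want > *len) {
		char *tmp = realloc(*buf, want);

		if (!tmp)
			return -1;
		*buf = tmp;
		*len = want;
		iov.iov_base = *buf;
		iov.iov_len = *len;
	}

	return recvmsg(fd, &msg, 0);
}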

Patch

diff --git a/misc/ss.c b/misc/ss.c
index 42310ba4120d..166267974c36 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -47,6 +47,8 @@ 
 #include <linux/vm_sockets_diag.h>
 
 #define MAGIC_SEQ 123456
+#define BUF_CHUNK (1024 * 1024)
+#define LEN_ALIGN(x) (((x) + 1) & ~1)
 
 #define DIAG_REQUEST(_req, _r)						    \
 	struct {							    \
@@ -127,24 +129,45 @@  struct column {
 	const char *header;
 	const char *ldelim;
 	int width;	/* Including delimiter. -1: fit to content, 0: hide */
-	int stored;	/* Characters buffered */
-	int printed;	/* Characters printed so far */
 };
 
 static struct column columns[] = {
-	{ ALIGN_LEFT,	"Netid",		"",	0,	0,	0 },
-	{ ALIGN_LEFT,	"State",		" ",	0,	0,	0 },
-	{ ALIGN_LEFT,	"Recv-Q",		" ",	7,	0,	0 },
-	{ ALIGN_LEFT,	"Send-Q",		" ",	7,	0,	0 },
-	{ ALIGN_RIGHT,	"Local Address:",	" ",	0,	0,	0 },
-	{ ALIGN_LEFT,	"Port",			"",	0,	0,	0 },
-	{ ALIGN_RIGHT,	"Peer Address:",	" ",	0,	0,	0 },
-	{ ALIGN_LEFT,	"Port",			"",	0,	0,	0 },
-	{ ALIGN_LEFT,	"",			"",	-1,	0,	0 },
+	{ ALIGN_LEFT,	"Netid",		"",	0 },
+	{ ALIGN_LEFT,	"State",		" ",	0 },
+	{ ALIGN_LEFT,	"Recv-Q",		" ",	7 },
+	{ ALIGN_LEFT,	"Send-Q",		" ",	7 },
+	{ ALIGN_RIGHT,	"Local Address:",	" ",	0 },
+	{ ALIGN_LEFT,	"Port",			"",	0 },
+	{ ALIGN_RIGHT,	"Peer Address:",	" ",	0 },
+	{ ALIGN_LEFT,	"Port",			"",	0 },
+	{ ALIGN_LEFT,	"",			"",	-1 },
 };
 
 static struct column *current_field = columns;
-static char field_buf[BUFSIZ];
+
+/* Output buffer: chained chunks of BUF_CHUNK bytes. Each field is written to
+ * the buffer as a variable size token. A token consists of a 16 bits length
+ * field, followed by a string which is not NULL-terminated.
+ *
+ * A new chunk is allocated and linked when the current chunk doesn't have
+ * enough room to store the current token as a whole.
+ */
+struct buf_chunk {
+	struct buf_chunk *next;	/* Next chained chunk */
+	char *end;		/* Current end of content */
+	char data[0];
+};
+
+struct buf_token {
+	uint16_t len;		/* Data length, excluding length descriptor */
+	char data[0];
+};
+
+static struct {
+	struct buf_token *cur;	/* Position of current token in chunk */
+	struct buf_chunk *head;	/* First chunk */
+	struct buf_chunk *tail;	/* Current chunk */
+} buffer;
 
 static const char *TCP_PROTO = "tcp";
 static const char *SCTP_PROTO = "sctp";
@@ -861,25 +884,109 @@  static const char *vsock_netid_name(int type)
 	}
 }
 
+/* Allocate and initialize a new buffer chunk */
+static struct buf_chunk *buf_chunk_new(void)
+{
+	struct buf_chunk *new = malloc(BUF_CHUNK);
+
+	if (!new)
+		abort();
+
+	new->next = NULL;
+
+	/* This is also the last block */
+	buffer.tail = new;
+
+	/* Next token will be stored at the beginning of chunk data area, and
+	 * its initial length is zero.
+	 */
+	buffer.cur = (struct buf_token *)new->data;
+	buffer.cur->len = 0;
+
+	new->end = buffer.cur->data;
+
+	return new;
+}
+
+/* Return available tail room in given chunk */
+static int buf_chunk_avail(struct buf_chunk *chunk)
+{
+	return BUF_CHUNK - offsetof(struct buf_chunk, data) -
+	       (chunk->end - chunk->data);
+}
+
+/* Update end pointer and token length, link new chunk if we hit the end of the
+ * current one. Return -EAGAIN if we got a new chunk, caller has to print again.
+ */
+static int buf_update(int len)
+{
+	struct buf_chunk *chunk = buffer.tail;
+	struct buf_token *t = buffer.cur;
+
+	/* Claim success if new content fits in the current chunk, and anyway
+	 * if this is the first token in the chunk: in the latter case,
+	 * allocating a new chunk won't help, so we'll just cut the output.
+	 */
+	if ((len < buf_chunk_avail(chunk) && len != -1 /* glibc < 2.0.6 */) ||
+	    t == (struct buf_token *)chunk->data) {
+		len = min(len, buf_chunk_avail(chunk));
+
+		/* Total field length can't exceed 2^16 bytes, cut as needed */
+		len = min(len, USHRT_MAX - t->len);
+
+		chunk->end += len;
+		t->len += len;
+		return 0;
+	}
+
+	/* Content truncated, time to allocate more */
+	chunk->next = buf_chunk_new();
+
+	/* Copy current token over to new chunk, including length descriptor */
+	memcpy(chunk->next->data, t, sizeof(t->len) + t->len);
+	chunk->next->end += t->len;
+
+	/* Discard partially written field in old chunk */
+	chunk->end -= t->len + sizeof(t->len);
+
+	return -EAGAIN;
+}
+
+/* Append content to buffer as part of the current field */
 static void out(const char *fmt, ...)
 {
 	struct column *f = current_field;
 	va_list args;
+	char *pos;
+	int len;
+
+	if (!f->width)
+		return;
+
+	if (!buffer.head)
+		buffer.head = buf_chunk_new();
+
+again:	/* Append to buffer: if we have a new chunk, print again */
 
+	pos = buffer.cur->data + buffer.cur->len;
 	va_start(args, fmt);
-	f->stored += vsnprintf(field_buf + f->stored, BUFSIZ - f->stored,
-			       fmt, args);
+
+	/* Limit to tail room. If we hit the limit, buf_update() will tell us */
+	len = vsnprintf(pos, buf_chunk_avail(buffer.tail), fmt, args);
 	va_end(args);
+
+	if (buf_update(len))
+		goto again;
 }
 
-static int print_left_spacing(struct column *f)
+static int print_left_spacing(struct column *f, int stored, int printed)
 {
 	int s;
 
 	if (f->width < 0 || f->align == ALIGN_LEFT)
 		return 0;
 
-	s = f->width - f->stored - f->printed;
+	s = f->width - stored - printed;
 	if (f->align == ALIGN_CENTER)
 		/* If count of total spacing is odd, shift right by one */
 		s = (s + 1) / 2;
@@ -890,14 +997,14 @@  static int print_left_spacing(struct column *f)
 	return 0;
 }
 
-static void print_right_spacing(struct column *f)
+static void print_right_spacing(struct column *f, int printed)
 {
 	int s;
 
 	if (f->width < 0 || f->align == ALIGN_RIGHT)
 		return;
 
-	s = f->width - f->printed;
+	s = f->width - printed;
 	if (f->align == ALIGN_CENTER)
 		s /= 2;
 
@@ -905,35 +1012,29 @@  static void print_right_spacing(struct column *f)
 		printf("%*c", s, ' ');
 }
 
-static int field_needs_delimiter(struct column *f)
-{
-	if (!f->stored)
-		return 0;
-
-	/* Was another field already printed on this line? */
-	for (f--; f >= columns; f--)
-		if (f->width)
-			return 1;
-
-	return 0;
-}
-
-/* Flush given field to screen together with delimiter and spacing */
+/* Done with field: update buffer pointer, start new token after current one */
 static void field_flush(struct column *f)
 {
+	struct buf_chunk *chunk = buffer.tail;
+	unsigned int pad = buffer.cur->len % 2;
+
 	if (!f->width)
 		return;
 
-	if (field_needs_delimiter(f))
-		f->printed = printf("%s", f->ldelim);
-
-	f->printed += print_left_spacing(f);
-	f->printed += printf("%s", field_buf);
-	print_right_spacing(f);
+	/* We need a new chunk if we can't store the next length descriptor.
+	 * Mind the gap between end of previous token and next aligned position
+	 * for length descriptor.
+	 */
+	if (buf_chunk_avail(chunk) - pad < sizeof(buffer.cur->len)) {
+		chunk->end += pad;
+		chunk->next = buf_chunk_new();
+		return;
+	}
 
-	*field_buf = 0;
-	f->printed = 0;
-	f->stored = 0;
+	buffer.cur = (struct buf_token *)(buffer.cur->data +
+					  LEN_ALIGN(buffer.cur->len));
+	buffer.cur->len = 0;
+	buffer.tail->end = buffer.cur->data;
 }
 
 static int field_is_last(struct column *f)
@@ -945,12 +1046,10 @@  static void field_next(void)
 {
 	field_flush(current_field);
 
-	if (field_is_last(current_field)) {
-		printf("\n");
+	if (field_is_last(current_field))
 		current_field = columns;
-	} else {
+	else
 		current_field++;
-	}
 }
 
 /* Walk through fields and flush them until we reach the desired one */
@@ -970,6 +1069,86 @@  static void print_header(void)
 	}
 }
 
+/* Get the next available token in the buffer starting from the current token */
+static struct buf_token *buf_token_next(struct buf_token *cur)
+{
+	struct buf_chunk *chunk = buffer.tail;
+
+	/* If we reached the end of chunk contents, get token from next chunk */
+	if (cur->data + LEN_ALIGN(cur->len) == chunk->end) {
+		buffer.tail = chunk = chunk->next;
+		return chunk ? (struct buf_token *)chunk->data : NULL;
+	}
+
+	return (struct buf_token *)(cur->data + LEN_ALIGN(cur->len));
+}
+
+/* Free up all allocated buffer chunks */
+static void buf_free_all(void)
+{
+	struct buf_chunk *tmp;
+
+	for (buffer.tail = buffer.head; buffer.tail; ) {
+		tmp = buffer.tail;
+		buffer.tail = buffer.tail->next;
+		free(tmp);
+	}
+	buffer.head = NULL;
+}
+
+/* Render buffered output with spacing and delimiters, then free up buffers */
+static void render(void)
+{
+	struct buf_token *token = (struct buf_token *)buffer.head->data;
+	int printed, line_started = 0, need_newline = 0;
+	struct column *f;
+
+	/* Ensure end alignment of last token, it wasn't necessarily flushed */
+	buffer.tail->end += buffer.cur->len % 2;
+
+	/* Rewind and replay */
+	buffer.tail = buffer.head;
+
+	f = columns;
+	while (!f->width)
+		f++;
+
+	while (token) {
+		/* Print left delimiter only if we already started a line */
+		if (line_started++)
+			printed = printf("%s", current_field->ldelim);
+		else
+			printed = 0;
+
+		/* Print field content from token data with spacing */
+		printed += print_left_spacing(f, token->len, printed);
+		printed += fwrite(token->data, 1, token->len, stdout);
+		print_right_spacing(f, printed);
+
+		/* Variable field size or overflow, won't align to screen */
+		if (printed > f->width)
+			need_newline = 1;
+
+		/* Go to next non-empty field, deal with end-of-line */
+		do {
+			if (field_is_last(f)) {
+				if (need_newline) {
+					printf("\n");
+					need_newline = 0;
+				}
+				f = columns;
+				line_started = 0;
+			} else {
+				f++;
+			}
+		} while (!f->width);
+
+		token = buf_token_next(token);
+	}
+
+	buf_free_all();
+}
+
 static void sock_state_print(struct sockstat *s)
 {
 	const char *sock_name;
@@ -4730,7 +4909,7 @@  int main(int argc, char *argv[])
 	if (show_users || show_proc_ctx || show_sock_ctx)
 		user_ent_destroy();
 
-	field_next();
+	render();
 
 	return 0;
 }