diff mbox series

read_lines_notify: fix segfault if a program prints a line starting with \0

Message ID 20210621060748.328497-1-dominique.martinet@atmark-techno.com
State Accepted
Headers show
Series read_lines_notify: fix segfault if a program prints a line starting with \0 | expand

Commit Message

Dominique MARTINET June 21, 2021, 6:07 a.m. UTC
if a program prints something starting with \0, string_split will
successfully allocate a string array of one element which is null
(so nlines == 0 on the next line)

There might be other problem with string length (we know how many
bytes we read), so instead of trying to work around it in all cases
just replace all nul bytes with @ like we often see in e.g. less.

This can be reproduced by executing a shell script that does
e.g. dd if=/dev/zero bs=1 count=1

Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---

I wasn't sure which way to go between trying to check for all corner
cases or this, but read_lines_notify isn't meant to handle a lot of data
anyway and we're not either retransmitting things verbatim (due to
info/error header) so I ended up preferring this quite a bit.

Please say if you want me to just skip printing nul bytes or something
else.

 core/util.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Stefano Babic June 21, 2021, 10:17 a.m. UTC | #1
Hi Dominique,

On 21.06.21 08:07, Dominique Martinet wrote:
> if a program prints something starting with \0, string_split will
> successfully allocate a string array of one element which is null
> (so nlines == 0 on the next line)
> 
> There might be other problem with string length (we know how many
> bytes we read), so instead of trying to work around it in all cases
> just replace all nul bytes with @ like we often see in e.g. less.
> 
> This can be reproduced by executing a shell script that does
> e.g. dd if=/dev/zero bs=1 count=1
> 
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> ---
> 
> I wasn't sure which way to go between trying to check for all corner
> cases or this, but read_lines_notify isn't meant to handle a lot of data
> anyway and we're not either retransmitting things verbatim (due to
> info/error header) so I ended up preferring this quite a bit.
> 
> Please say if you want me to just skip printing nul bytes or something
> else.

IMHO we can agree that SWUpdate is expecting a sort of bunched strings 
surrounded by linefeed, and a zero char should be dropped and filtered. 
I would prefer to skip printing nul bytes.

Best regards,
Stefano


> 
>   core/util.c | 6 ++++++
>   1 file changed, 6 insertions(+)
> 
> diff --git a/core/util.c b/core/util.c
> index 3b81c091b0e0..ebeb8587644e 100644
> --- a/core/util.c
> +++ b/core/util.c
> @@ -1001,6 +1001,12 @@ int read_lines_notify(int fd, char *buf, int buf_size, int *buf_offset,
>   	if (n <= 0)
>   		return -errno;
>   
> +	/* replace zeroes with @ signs */
> +	for (unsigned int index = 0; index < n; index++) {
> +		if (!buf[offset+index])
> +			buf[offset+index] = '@';
> +	}
> +
>   	n += offset;
>   	buf[n] = '\0';
>   
>
Dominique MARTINET June 21, 2021, 11:21 p.m. UTC | #2
Stefano Babic wrote on Mon, Jun 21, 2021 at 12:17:39PM +0200:
> > I wasn't sure which way to go between trying to check for all corner
> > cases or this, but read_lines_notify isn't meant to handle a lot of data
> > anyway and we're not either retransmitting things verbatim (due to
> > info/error header) so I ended up preferring this quite a bit.
> > 
> > Please say if you want me to just skip printing nul bytes or something
> > else.
> 
> IMHO we can agree that SWUpdate is expecting a sort of bunched strings
> surrounded by linefeed, and a zero char should be dropped and filtered. I
> would prefer to skip printing nul bytes.

I can definitely agree I wasn't expecting nul bytes, and we can do
whatever we want with these - it was a bug in one of my scripts that is
now fixed.
As long as swupdate doesn't crash I'm happy.


The problem I have with "just skipping" is that basically we know how
much we read, but string_split expects a nul-delimited string and will
stop at the first nul byte yet not give the length of the string it
actually processed (we can't just sum the length of each chunk we
printed because strtok will merge consecutive delimiters)

It means that whatever solution we chose, we'll need to scan for early
nul bytes (either through strlen() or a loop like I did), unless we just
drop valid text after one (which I think we can agree we don't want?)


So I see two ways of handling it as "just skip":
 - after read, iterate over the string in strlen() chunks with the whole
rest of the function (moving it to a different e.g. split_lines_notify
helper, keeping only the read+strlen loop in read_lines_notify)
This adds complexity with the leftover data everytime but should just
work - I'm noticing just now it can be optimized a bit if there was only
a single line but it still sounds messy to me.

 - after read, iterate over the string and call memmove() as many times
as required to eliminate chunks of nul bytes, then adjust the size.
Note we need to be careful of keeping the nul bytes accounted in the
return value (at least add 1) in that case to not drop valid text after
a buffer with only nul bytes if the program had exited as that would end
the loop early in run_system_cmd....

- implement a second string_split helper or add an optional argument to
it to make it work over fixed-lenght buffers, skipping nul bytes in the
process -- that's definitely simpler as this function knows where it
stands within the buffer, and just require replacing strtok with a loop
akin to what we already have at the start of the buffer.


So complexity-wise I think it's much simpler to just replace nuls with
an arbitrary character, but if you're fine with either approach or have
a better idea I'll send a v2.
(rewording my preference order: current patch > 3rd new string_split
suggestion > 1st > 2nd? I guess)


Thanks,
Stefano Babic June 22, 2021, 7:05 a.m. UTC | #3
Hi Dominique,

On 22.06.21 01:21, Dominique Martinet wrote:
> Stefano Babic wrote on Mon, Jun 21, 2021 at 12:17:39PM +0200:
>>> I wasn't sure which way to go between trying to check for all corner
>>> cases or this, but read_lines_notify isn't meant to handle a lot of data
>>> anyway and we're not either retransmitting things verbatim (due to
>>> info/error header) so I ended up preferring this quite a bit.
>>>
>>> Please say if you want me to just skip printing nul bytes or something
>>> else.
>>
>> IMHO we can agree that SWUpdate is expecting a sort of bunched strings
>> surrounded by linefeed, and a zero char should be dropped and filtered. I
>> would prefer to skip printing nul bytes.
> 
> I can definitely agree I wasn't expecting nul bytes, and we can do
> whatever we want with these - it was a bug in one of my scripts that is
> now fixed.
> As long as swupdate doesn't crash I'm happy.

Fully agree.

> 
> 
> The problem I have with "just skipping" is that basically we know how
> much we read, but string_split expects a nul-delimited string and will
> stop at the first nul byte yet not give the length of the string it
> actually processed (we can't just sum the length of each chunk we
> printed because strtok will merge consecutive delimiters)
> 
> It means that whatever solution we chose, we'll need to scan for early
> nul bytes (either through strlen() or a loop like I did), unless we just
> drop valid text after one (which I think we can agree we don't want?)
> 
> 
> So I see two ways of handling it as "just skip":
>   - after read, iterate over the string in strlen() chunks with the whole
> rest of the function (moving it to a different e.g. split_lines_notify
> helper, keeping only the read+strlen loop in read_lines_notify)
> This adds complexity with the leftover data everytime but should just
> work - I'm noticing just now it can be optimized a bit if there was only
> a single line but it still sounds messy to me.
> 
>   - after read, iterate over the string and call memmove() as many times
> as required to eliminate chunks of nul bytes, then adjust the size.
> Note we need to be careful of keeping the nul bytes accounted in the
> return value (at least add 1) in that case to not drop valid text after
> a buffer with only nul bytes if the program had exited as that would end
> the loop early in run_system_cmd....
> 
> - implement a second string_split helper or add an optional argument to
> it to make it work over fixed-lenght buffers, skipping nul bytes in the
> process -- that's definitely simpler as this function knows where it
> stands within the buffer, and just require replacing strtok with a loop
> akin to what we already have at the start of the buffer.
> 
> 
> So complexity-wise I think it's much simpler to just replace nuls with
> an arbitrary character, but if you're fine with either approach or have
> a better idea I'll send a v2.

No, you convinced me - it is not worth to add this complexity. As you, I 
want just to fix the crash, Nul bytes from script as output is not a use 
case. It is fine then just replacing the Nul with a char as you proposed.

> (rewording my preference order: current patch > 3rd new string_split
> suggestion > 1st > 2nd? I guess)
> 

Best regards,
Stefano

> 
> Thanks,
>
diff mbox series

Patch

diff --git a/core/util.c b/core/util.c
index 3b81c091b0e0..ebeb8587644e 100644
--- a/core/util.c
+++ b/core/util.c
@@ -1001,6 +1001,12 @@  int read_lines_notify(int fd, char *buf, int buf_size, int *buf_offset,
 	if (n <= 0)
 		return -errno;
 
+	/* replace zeroes with @ signs */
+	for (unsigned int index = 0; index < n; index++) {
+		if (!buf[offset+index])
+			buf[offset+index] = '@';
+	}
+
 	n += offset;
 	buf[n] = '\0';