diff mbox series

[2/2] suricatta process notification: improve ipc_get_status scheduling

Message ID 20240321054146.2303245-2-dominique.martinet@atmark-techno.com
State Accepted
Headers show
Series [1/2] network_thread: block SIGPIPE in network thread | expand

Commit Message

Dominique MARTINET March 21, 2024, 5:41 a.m. UTC
The process notification thread gets messages from swupdate to submit
them to the hawkbit server.

Before this patch, ipc_get_status_timeout() was called in a tight loop,
probably with the assumption that it would wait until the timeout
happens if no message was available.
Unfortunately, the server just replies immediately that there is no
message, so this would just hammer the server in a tight loop.

Furthermore, on slow devices if the server takes more than 100ms to
reply we close our socket so the server would crash to sigpipe without
the previous fix.

Adjust this loop to:
- since ipc_get_status() always returns immediately the timeout is
useless, use the version without timeout.
- wait 100ms before calling ipc_get_status() again if no message was
available.

Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
---
 suricatta/server_hawkbit.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Stefano Babic March 21, 2024, 3:23 p.m. UTC | #1
Hi Dominique,

On 21.03.24 06:41, Dominique Martinet wrote:
> The process notification thread gets messages from swupdate to submit
> them to the hawkbit server.
>
> Before this patch, ipc_get_status_timeout() was called in a tight loop,
> probably with the assumption that it would wait until the timeout
> happens if no message was available.

Correct.

> Unfortunately, the server just replies immediately that there is no
> message, so this would just hammer the server in a tight loop.
>
> Furthermore, on slow devices if the server takes more than 100ms to
> reply we close our socket so the server would crash to sigpipe without
> the previous fix.

Thanks for the previous fix, I agree there is a race and the patch fixes it.

>
> Adjust this loop to:
> - since ipc_get_status() always returns immediately the timeout is
> useless, use the version without timeout.
> - wait 100ms before calling ipc_get_status() again if no message was
> available.

It looks to me safe.

Acked-by: Stefano Babic <stefano.babic@swupdate.org>

Best regards,
Stefano

>
> Signed-off-by: Dominique Martinet <dominique.martinet@atmark-techno.com>
> ---
>   suricatta/server_hawkbit.c | 6 +++++-
>   1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/suricatta/server_hawkbit.c b/suricatta/server_hawkbit.c
> index 1633c5cca65b..7ceb1dc038fd 100644
> --- a/suricatta/server_hawkbit.c
> +++ b/suricatta/server_hawkbit.c
> @@ -997,7 +997,7 @@ static void *process_notification_thread(void *data)
>   	for (;;) {
>   		ipc_message msg;
>   		bool data_avail = false;
> -		int ret = ipc_get_status_timeout(&msg, 100);
> +		int ret = ipc_get_status(&msg);
>
>   		data_avail = ret > 0 && (strlen(msg.data.status.desc) != 0);
>
> @@ -1060,6 +1060,10 @@ static void *process_notification_thread(void *data)
>
>   		if (stop && !data_avail)
>   			break;
> +
> +		// wait a bit for next message...
> +		if (!data_avail)
> +			usleep(100000);
>   	}
>
>   	pthread_mutex_unlock(&notifylock);
Dominique MARTINET March 25, 2024, 1:32 a.m. UTC | #2
Stefano Babic wrote on Thu, Mar 21, 2024 at 04:23:33PM +0100:
> Hi Dominique,

Thanks for the prompt review!

> > Adjust this loop to:
> > - since ipc_get_status() always returns immediately the timeout is
> > useless, use the version without timeout.

Just an extra comment here: I got pointed out during internal review
that this was the last user of ipc_get_status_timeout() in tree.

Given that function is user-facing (network_ipc) I don't think we should
remove it in this patch, but we might want to consider if it's useful to
keep and mark it deprecated (__attribute__((deprecated)) or equivalent?)
in a follow-up commit as we don't have any waiting-style ipc as far as I
can see, so leaving it might cause more misunderstandings like we had
here.
Stefano Babic March 27, 2024, 9:02 a.m. UTC | #3
Hi Dominique,

On 25.03.24 02:32, Dominique Martinet wrote:
> Stefano Babic wrote on Thu, Mar 21, 2024 at 04:23:33PM +0100:
>> Hi Dominique,
>
> Thanks for the prompt review!
>
>>> Adjust this loop to:
>>> - since ipc_get_status() always returns immediately the timeout is
>>> useless, use the version without timeout.
>
> Just an extra comment here: I got pointed out during internal review
> that this was the last user of ipc_get_status_timeout() in tree.
>
> Given that function is user-facing (network_ipc) I don't think we should
> remove it in this patch, but we might want to consider if it's useful to
> keep and mark it deprecated (__attribute__((deprecated)) or equivalent?)

Well, the whole interface is mostly deprecated (for external usage) and
replaced by the progress interface. But last one requires more for the
integrator, as it should runs it as background task, it is event driven,
etc. So older application should be ported to use this interface, and it
is not what many want. I won't remove function from the API to avoid
breakages and to remain backward compatible, everyone should decide the
most suitable way for the own project.

Best regards,
Stefano

> in a follow-up commit as we don't have any waiting-style ipc as far as I
> can see, so leaving it might cause more misunderstandings like we had
> here.
>
diff mbox series

Patch

diff --git a/suricatta/server_hawkbit.c b/suricatta/server_hawkbit.c
index 1633c5cca65b..7ceb1dc038fd 100644
--- a/suricatta/server_hawkbit.c
+++ b/suricatta/server_hawkbit.c
@@ -997,7 +997,7 @@  static void *process_notification_thread(void *data)
 	for (;;) {
 		ipc_message msg;
 		bool data_avail = false;
-		int ret = ipc_get_status_timeout(&msg, 100);
+		int ret = ipc_get_status(&msg);
 
 		data_avail = ret > 0 && (strlen(msg.data.status.desc) != 0);
 
@@ -1060,6 +1060,10 @@  static void *process_notification_thread(void *data)
 
 		if (stop && !data_avail)
 			break;
+
+		// wait a bit for next message...
+		if (!data_avail)
+			usleep(100000);
 	}
 
 	pthread_mutex_unlock(&notifylock);