diff mbox series

[ovs-dev] raft: Report jsonrpc backlog in kilobytes.

Message ID 20201020111644.3207915-1-i.maximets@ovn.org
State Accepted
Headers show
Series [ovs-dev] raft: Report jsonrpc backlog in kilobytes. | expand

Commit Message

Ilya Maximets Oct. 20, 2020, 11:16 a.m. UTC
While sending snapshots backlog on raft connections could quickly
grow over 4GB and this will overflow raft-backlog counter.

Let's report it in kB instead. (Using kB and not KB to match with
ru_maxrss counter reported by kernel)

Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
---
 ovsdb/raft.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Dumitru Ceara Oct. 20, 2020, 11:51 a.m. UTC | #1
On 10/20/20 1:16 PM, Ilya Maximets wrote:
> While sending snapshots backlog on raft connections could quickly
> grow over 4GB and this will overflow raft-backlog counter.
> 
> Let's report it in kB instead. (Using kB and not KB to match with
> ru_maxrss counter reported by kernel)
> 
> Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
> ---
>  ovsdb/raft.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
> index 708b0624c..3411323aa 100644
> --- a/ovsdb/raft.c
> +++ b/ovsdb/raft.c
> @@ -1020,13 +1020,14 @@ void
>  raft_get_memory_usage(const struct raft *raft, struct simap *usage)
>  {
>      struct raft_conn *conn;
> +    uint64_t backlog = 0;
>      int cnt = 0;
>  
>      LIST_FOR_EACH (conn, list_node, &raft->conns) {
> -        simap_increase(usage, "raft-backlog",
> -                       jsonrpc_session_get_backlog(conn->js));
> +        backlog += jsonrpc_session_get_backlog(conn->js);
>          cnt++;
>      }
> +    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
>      simap_increase(usage, "raft-connections", cnt);
>  }
>  
> 

Hi Ilya,

This change looks OK to me.  However, since "raft-backlog" was already
displayed in v2.14.0, should we consider not breaking backwards compatibility
and reporting both "raft-backlog" (potentially overflowed) and
"raft-backlog-kB" at "ovs-appctl .. memory/show"?

On the other hand, ovn-k8s parses the output of "memory/show" but ignores
"raft-backlog" for now [1] so at least from that perspective it should be fine.

Is this considered a user visible change?

Regards,
Dumitru

[1]
https://github.com/ovn-org/ovn-kubernetes/blob/dd1289c3ff3543798786baac30c9e9ad51aa27a4/go-controller/pkg/metrics/ovn_db.go#L292
Ilya Maximets Oct. 20, 2020, 4:20 p.m. UTC | #2
On 10/20/20 1:51 PM, Dumitru Ceara wrote:
> On 10/20/20 1:16 PM, Ilya Maximets wrote:
>> While sending snapshots backlog on raft connections could quickly
>> grow over 4GB and this will overflow raft-backlog counter.
>>
>> Let's report it in kB instead. (Using kB and not KB to match with
>> ru_maxrss counter reported by kernel)
>>
>> Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>> ---
>>  ovsdb/raft.c | 5 +++--
>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>
>> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
>> index 708b0624c..3411323aa 100644
>> --- a/ovsdb/raft.c
>> +++ b/ovsdb/raft.c
>> @@ -1020,13 +1020,14 @@ void
>>  raft_get_memory_usage(const struct raft *raft, struct simap *usage)
>>  {
>>      struct raft_conn *conn;
>> +    uint64_t backlog = 0;
>>      int cnt = 0;
>>  
>>      LIST_FOR_EACH (conn, list_node, &raft->conns) {
>> -        simap_increase(usage, "raft-backlog",
>> -                       jsonrpc_session_get_backlog(conn->js));
>> +        backlog += jsonrpc_session_get_backlog(conn->js);
>>          cnt++;
>>      }
>> +    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
>>      simap_increase(usage, "raft-connections", cnt);
>>  }
>>  
>>
> 
> Hi Ilya,
> 
> This change looks OK to me.  However, since "raft-backlog" was already
> displayed in v2.14.0, should we consider not breaking backwards compatibility
> and reporting both "raft-backlog" (potentially overflowed) and
> "raft-backlog-kB" at "ovs-appctl .. memory/show"?

I don't think that reporting overflowed value makes any sense.  I thought
about reporting just 'raft-backlog' if there is no overflow.  It could look
like this (not so pretty):

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index 3411323aa..b967fff3d 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -1021,13 +1021,23 @@ raft_get_memory_usage(const struct raft *raft, struct simap *usage)
 {
     struct raft_conn *conn;
     uint64_t backlog = 0;
+    uint64_t old_backlog;
     int cnt = 0;
 
     LIST_FOR_EACH (conn, list_node, &raft->conns) {
         backlog += jsonrpc_session_get_backlog(conn->js);
         cnt++;
     }
-    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
+
+    old_backlog = simap_get(usage, "raft-backlog");
+    if (simap_contains(usage, "raft-backlog-kB")) {
+        simap_increase(usage, "raft-backlog-kB", backlog / 1000);
+    } else if (backlog + old_backlog > UINT_MAX) {
+        simap_put(usage, "raft-backlog-kB", (backlog + old_backlog) / 1000);
+        simap_find_and_delete(usage, "raft-backlog");
+    } else {
+        simap_increase(usage, "raft-backlog", backlog);
+    }
     simap_increase(usage, "raft-connections", cnt);
 }
 
---

Do you think this is better?

> 
> On the other hand, ovn-k8s parses the output of "memory/show" but ignores
> "raft-backlog" for now [1] so at least from that perspective it should be fine.
> 
> Is this considered a user visible change?

I'd consider this as a debug information.  The output format is not documented,
so I'd like to not care much about this interface.

> 
> Regards,
> Dumitru
> 
> [1]
> https://github.com/ovn-org/ovn-kubernetes/blob/dd1289c3ff3543798786baac30c9e9ad51aa27a4/go-controller/pkg/metrics/ovn_db.go#L292
>
Dumitru Ceara Oct. 21, 2020, 7:57 a.m. UTC | #3
On 10/20/20 6:20 PM, Ilya Maximets wrote:
> On 10/20/20 1:51 PM, Dumitru Ceara wrote:
>> On 10/20/20 1:16 PM, Ilya Maximets wrote:
>>> While sending snapshots backlog on raft connections could quickly
>>> grow over 4GB and this will overflow raft-backlog counter.
>>>
>>> Let's report it in kB instead. (Using kB and not KB to match with
>>> ru_maxrss counter reported by kernel)
>>>
>>> Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>> ---
>>>  ovsdb/raft.c | 5 +++--
>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
>>> index 708b0624c..3411323aa 100644
>>> --- a/ovsdb/raft.c
>>> +++ b/ovsdb/raft.c
>>> @@ -1020,13 +1020,14 @@ void
>>>  raft_get_memory_usage(const struct raft *raft, struct simap *usage)
>>>  {
>>>      struct raft_conn *conn;
>>> +    uint64_t backlog = 0;
>>>      int cnt = 0;
>>>  
>>>      LIST_FOR_EACH (conn, list_node, &raft->conns) {
>>> -        simap_increase(usage, "raft-backlog",
>>> -                       jsonrpc_session_get_backlog(conn->js));
>>> +        backlog += jsonrpc_session_get_backlog(conn->js);
>>>          cnt++;
>>>      }
>>> +    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
>>>      simap_increase(usage, "raft-connections", cnt);
>>>  }
>>>  
>>>
>>
>> Hi Ilya,
>>
>> This change looks OK to me.  However, since "raft-backlog" was already
>> displayed in v2.14.0, should we consider not breaking backwards compatibility
>> and reporting both "raft-backlog" (potentially overflowed) and
>> "raft-backlog-kB" at "ovs-appctl .. memory/show"?
> 
> I don't think that reporting overflowed value makes any sense.  I thought
> about reporting just 'raft-backlog' if there is no overflow.  It could look
> like this (not so pretty):
> 
> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
> index 3411323aa..b967fff3d 100644
> --- a/ovsdb/raft.c
> +++ b/ovsdb/raft.c
> @@ -1021,13 +1021,23 @@ raft_get_memory_usage(const struct raft *raft, struct simap *usage)
>  {
>      struct raft_conn *conn;
>      uint64_t backlog = 0;
> +    uint64_t old_backlog;
>      int cnt = 0;
>  
>      LIST_FOR_EACH (conn, list_node, &raft->conns) {
>          backlog += jsonrpc_session_get_backlog(conn->js);
>          cnt++;
>      }
> -    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
> +
> +    old_backlog = simap_get(usage, "raft-backlog");
> +    if (simap_contains(usage, "raft-backlog-kB")) {
> +        simap_increase(usage, "raft-backlog-kB", backlog / 1000);
> +    } else if (backlog + old_backlog > UINT_MAX) {
> +        simap_put(usage, "raft-backlog-kB", (backlog + old_backlog) / 1000);
> +        simap_find_and_delete(usage, "raft-backlog");
> +    } else {
> +        simap_increase(usage, "raft-backlog", backlog);
> +    }
>      simap_increase(usage, "raft-connections", cnt);
>  }
>  
> ---
> 
> Do you think this is better?
> 

I agree with you, it doesn't look so pretty :)

>>
>> On the other hand, ovn-k8s parses the output of "memory/show" but ignores
>> "raft-backlog" for now [1] so at least from that perspective it should be fine.
>>
>> Is this considered a user visible change?
> 
> I'd consider this as a debug information.  The output format is not documented,
> so I'd like to not care much about this interface.
> 

Ok, in this case I think your original patch is fine. For that:

Acked-by: Dumitru Ceara <dceara@redhat.com>

Thanks,
Dumitru
Ilya Maximets Oct. 26, 2020, 1:10 a.m. UTC | #4
On 10/21/20 9:57 AM, Dumitru Ceara wrote:
> On 10/20/20 6:20 PM, Ilya Maximets wrote:
>> On 10/20/20 1:51 PM, Dumitru Ceara wrote:
>>> On 10/20/20 1:16 PM, Ilya Maximets wrote:
>>>> While sending snapshots backlog on raft connections could quickly
>>>> grow over 4GB and this will overflow raft-backlog counter.
>>>>
>>>> Let's report it in kB instead. (Using kB and not KB to match with
>>>> ru_maxrss counter reported by kernel)
>>>>
>>>> Fixes: 3423cd97f88f ("ovsdb: Add raft memory usage to memory report.")
>>>> Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
>>>> ---
>>>>  ovsdb/raft.c | 5 +++--
>>>>  1 file changed, 3 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
>>>> index 708b0624c..3411323aa 100644
>>>> --- a/ovsdb/raft.c
>>>> +++ b/ovsdb/raft.c
>>>> @@ -1020,13 +1020,14 @@ void
>>>>  raft_get_memory_usage(const struct raft *raft, struct simap *usage)
>>>>  {
>>>>      struct raft_conn *conn;
>>>> +    uint64_t backlog = 0;
>>>>      int cnt = 0;
>>>>  
>>>>      LIST_FOR_EACH (conn, list_node, &raft->conns) {
>>>> -        simap_increase(usage, "raft-backlog",
>>>> -                       jsonrpc_session_get_backlog(conn->js));
>>>> +        backlog += jsonrpc_session_get_backlog(conn->js);
>>>>          cnt++;
>>>>      }
>>>> +    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
>>>>      simap_increase(usage, "raft-connections", cnt);
>>>>  }
>>>>  
>>>>
>>>
>>> Hi Ilya,
>>>
>>> This change looks OK to me.  However, since "raft-backlog" was already
>>> displayed in v2.14.0, should we consider not breaking backwards compatibility
>>> and reporting both "raft-backlog" (potentially overflowed) and
>>> "raft-backlog-kB" at "ovs-appctl .. memory/show"?
>>
>> I don't think that reporting overflowed value makes any sense.  I thought
>> about reporting just 'raft-backlog' if there is no overflow.  It could look
>> like this (not so pretty):
>>
>> diff --git a/ovsdb/raft.c b/ovsdb/raft.c
>> index 3411323aa..b967fff3d 100644
>> --- a/ovsdb/raft.c
>> +++ b/ovsdb/raft.c
>> @@ -1021,13 +1021,23 @@ raft_get_memory_usage(const struct raft *raft, struct simap *usage)
>>  {
>>      struct raft_conn *conn;
>>      uint64_t backlog = 0;
>> +    uint64_t old_backlog;
>>      int cnt = 0;
>>  
>>      LIST_FOR_EACH (conn, list_node, &raft->conns) {
>>          backlog += jsonrpc_session_get_backlog(conn->js);
>>          cnt++;
>>      }
>> -    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
>> +
>> +    old_backlog = simap_get(usage, "raft-backlog");
>> +    if (simap_contains(usage, "raft-backlog-kB")) {
>> +        simap_increase(usage, "raft-backlog-kB", backlog / 1000);
>> +    } else if (backlog + old_backlog > UINT_MAX) {
>> +        simap_put(usage, "raft-backlog-kB", (backlog + old_backlog) / 1000);
>> +        simap_find_and_delete(usage, "raft-backlog");
>> +    } else {
>> +        simap_increase(usage, "raft-backlog", backlog);
>> +    }
>>      simap_increase(usage, "raft-connections", cnt);
>>  }
>>  
>> ---
>>
>> Do you think this is better?
>>
> 
> I agree with you, it doesn't look so pretty :)
> 
>>>
>>> On the other hand, ovn-k8s parses the output of "memory/show" but ignores
>>> "raft-backlog" for now [1] so at least from that perspective it should be fine.
>>>
>>> Is this considered a user visible change?
>>
>> I'd consider this as a debug information.  The output format is not documented,
>> so I'd like to not care much about this interface.
>>
> 
> Ok, in this case I think your original patch is fine. For that:
> 
> Acked-by: Dumitru Ceara <dceara@redhat.com>

Thanks!
Applied to master and branch-2.14.

Best regards, Ilya Maximets.
diff mbox series

Patch

diff --git a/ovsdb/raft.c b/ovsdb/raft.c
index 708b0624c..3411323aa 100644
--- a/ovsdb/raft.c
+++ b/ovsdb/raft.c
@@ -1020,13 +1020,14 @@  void
 raft_get_memory_usage(const struct raft *raft, struct simap *usage)
 {
     struct raft_conn *conn;
+    uint64_t backlog = 0;
     int cnt = 0;
 
     LIST_FOR_EACH (conn, list_node, &raft->conns) {
-        simap_increase(usage, "raft-backlog",
-                       jsonrpc_session_get_backlog(conn->js));
+        backlog += jsonrpc_session_get_backlog(conn->js);
         cnt++;
     }
+    simap_increase(usage, "raft-backlog-kB", backlog / 1000);
     simap_increase(usage, "raft-connections", cnt);
 }