diff mbox

[net-next,v3] block/drbd: align properly u64 in nl messages

Message ID 1462786820-15519-1-git-send-email-nicolas.dichtel@6wind.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Nicolas Dichtel May 9, 2016, 9:40 a.m. UTC
The attribute 0 is never used in drbd, so let's use it as pad attribute
in netlink messages. This minimizes the patch.

Note that this patch is only compile-tested.

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
---

v2 -> v3:
  use 0 as padattr instead of adding new attributes

v1 -> v2:
 rework the patch to handle all cases

Maybe prefixing genl_magic_func.h and genl_magic_struct.h by 'drbd_'
could be interesting so that new module won't use it. What is your
opinion?

 drivers/block/drbd/drbd_nl.c      | 28 ++++++++++++++++------------
 include/linux/genl_magic_struct.h |  7 ++++++-
 2 files changed, 22 insertions(+), 13 deletions(-)

Comments

Lars Ellenberg May 9, 2016, 1:15 p.m. UTC | #1
On Mon, May 09, 2016 at 11:40:20AM +0200, Nicolas Dichtel wrote:
> The attribute 0 is never used in drbd, so let's use it as pad attribute
> in netlink messages. This minimizes the patch.
> 
> Note that this patch is only compile-tested.
> 
> Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
> Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
> ---
> 
> v2 -> v3:
>   use 0 as padattr instead of adding new attributes

Thanks.

> v1 -> v2:
>  rework the patch to handle all cases
> 
> Maybe prefixing genl_magic_func.h and genl_magic_struct.h by 'drbd_'
> could be interesting so that new module won't use it. What is your
> opinion?

This was supposed to not be DRBD specific.  But it might even still
need some massaging before it was truly generic. And obviously,
it does not meet the taste of genetlink folks, to say the least :(

I don't care either way.

    Lars Ellenberg
Nicolas Dichtel May 10, 2016, 9:09 a.m. UTC | #2
Le 09/05/2016 15:15, Lars Ellenberg a écrit :
> On Mon, May 09, 2016 at 11:40:20AM +0200, Nicolas Dichtel wrote:
[snip]
>> Maybe prefixing genl_magic_func.h and genl_magic_struct.h by 'drbd_'
>> could be interesting so that new module won't use it. What is your
>> opinion?
> 
> This was supposed to not be DRBD specific.  But it might even still
> need some massaging before it was truly generic. And obviously,
> it does not meet the taste of genetlink folks, to say the least :(
Yes, this file is not generic and netlink APIs are never defined like this.
These tons of macro complexifies the code too much. It's overengineering for
what purpose?
Small examples:
 - the drbd netlink API is not exported via uapi (I wonder how apps using this
   API get it)
 - v2 of the patch is nacked because adding a new attribute may break existing
   apps (in networking code, a lot of new attributes are added in each version)
 - it's not possible to grep to show the definition of an attribute ('git grep
   -w T_bits_total' returns only 1 line)


Regards,
Nicolas
Lars Ellenberg May 10, 2016, 9:40 a.m. UTC | #3
On Tue, May 10, 2016 at 11:09:53AM +0200, Nicolas Dichtel wrote:
> Le 09/05/2016 15:15, Lars Ellenberg a écrit :
> > On Mon, May 09, 2016 at 11:40:20AM +0200, Nicolas Dichtel wrote:
> [snip]
> >> Maybe prefixing genl_magic_func.h and genl_magic_struct.h by 'drbd_'
> >> could be interesting so that new module won't use it. What is your
> >> opinion?
> > 
> > This was supposed to not be DRBD specific.  But it might even still
> > need some massaging before it was truly generic. And obviously,
> > it does not meet the taste of genetlink folks, to say the least :(
> Yes, this file is not generic and netlink APIs are never defined like this.
> These tons of macro complexifies the code too much. It's overengineering for
> what purpose?

If we introduce a new config option,
we have to add it to the config scanner (one line),
define min, max, default and scale (four short defines),
and add it to the netlink definition here (one line).
Done, rest of the code is generated,
both on the kernel side,
and on the drbd-utils side used to talk to the kernel.
We found that to be very convenient.

> Small examples:
>  - the drbd netlink API is not exported via uapi (I wonder how apps using this
>    API get it)

There used to be a time where there was no "uapi".
(I wonder how apps ever worked back then).

>  - v2 of the patch is nacked because adding a new attribute may break existing

No.

But because the "new" attributes you chose have not been new,
but already used (though not yet merged back into mainline yet).
(Which you did not realize, and had no obvious way of knowing.
 Could have been fixed.).

And because your patch introduced useless new members to the structs.
(Could also have been fixed).

And because I did not see any use defining that many new "padding attributes"
for no reason, where the obvious (to me) choice was to use 0, and you
did not even try to explain why that would have been a bad choice.

>    apps (in networking code, a lot of new attributes are added in each version)


>  - it's not possible to grep to show the definition of an attribute ('git grep
>    -w T_bits_total' returns only 1 line)

Opencoded, it would return 2.

 ;-)

Is this going somewhere?

Cheers,

    Lars
Nicolas Dichtel May 10, 2016, 10:06 a.m. UTC | #4
Le 10/05/2016 11:40, Lars Ellenberg a écrit :
> On Tue, May 10, 2016 at 11:09:53AM +0200, Nicolas Dichtel wrote:
>> Le 09/05/2016 15:15, Lars Ellenberg a écrit :
>>> On Mon, May 09, 2016 at 11:40:20AM +0200, Nicolas Dichtel wrote:
>> [snip]
>>>> Maybe prefixing genl_magic_func.h and genl_magic_struct.h by 'drbd_'
>>>> could be interesting so that new module won't use it. What is your
>>>> opinion?
>>>
>>> This was supposed to not be DRBD specific.  But it might even still
>>> need some massaging before it was truly generic. And obviously,
>>> it does not meet the taste of genetlink folks, to say the least :(
>> Yes, this file is not generic and netlink APIs are never defined like this.
>> These tons of macro complexifies the code too much. It's overengineering for
>> what purpose?
> 
> If we introduce a new config option,
> we have to add it to the config scanner (one line),
> define min, max, default and scale (four short defines),
> and add it to the netlink definition here (one line).
> Done, rest of the code is generated,
> both on the kernel side,
> and on the drbd-utils side used to talk to the kernel.
> We found that to be very convenient.
Ok.

> 
>> Small examples:
>>  - the drbd netlink API is not exported via uapi (I wonder how apps using this
>>    API get it)
> 
> There used to be a time where there was no "uapi".
> (I wonder how apps ever worked back then).
At that time, include/linux/ was exported ;-)

> 
>>  - v2 of the patch is nacked because adding a new attribute may break existing
> 
> No.
> 
> But because the "new" attributes you chose have not been new,
> but already used (though not yet merged back into mainline yet).
> (Which you did not realize, and had no obvious way of knowing.
>  Could have been fixed.).
Ok.

> 
> And because your patch introduced useless new members to the structs.
> (Could also have been fixed).
> 
> And because I did not see any use defining that many new "padding attributes"
> for no reason, where the obvious (to me) choice was to use 0, and you
> did not even try to explain why that would have been a bad choice.
Because some nl APIs were wrongly use 0 as a valid attribute we make the choice
of always adding a new attribute for padding to be sure to not break existing API.
And yes, in drdb it does not seem to be the case.

> Is this going somewhere?
I'm just trying to understand things.


Regards,
Nicolas
David Miller May 10, 2016, 3:39 p.m. UTC | #5
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Tue, 10 May 2016 11:40:23 +0200

> If we introduce a new config option,
> we have to add it to the config scanner (one line),
> define min, max, default and scale (four short defines),
> and add it to the netlink definition here (one line).
> Done, rest of the code is generated,
> both on the kernel side,
> and on the drbd-utils side used to talk to the kernel.
> We found that to be very convenient.

But it entirely misses the core design point of netlink.

Sender and receive _DO NOT_ need to coordinate at all.  That's the
whole point.  So tightly coupling such coordination is going to run
you into all kinds of problems.

When implemented properly, the sender can emit whatever attributes it
knows about and can generate, and the receive scans the attributes one
by one and picks out the ones it understands and processes them.

If you go against this model then you have no clean way to extend
things whilst allowing existing software to continue working.

If the drbd stuff had been posting to the networking list, we really
would have screamed loudly about these auto-generates structs and
macros and whatnot.

Anyways, back to the topic, can you please just relent and come to some
kind of agreement about the fix for this alignment bug?  This is taking
a very long time and patches are just rotting in patchwork with no
resolution.

Thanks.
Lars Ellenberg May 10, 2016, 7:09 p.m. UTC | #6
On Tue, May 10, 2016 at 11:39:49AM -0400, David Miller wrote:
> From: Lars Ellenberg <lars.ellenberg@linbit.com>
> Date: Tue, 10 May 2016 11:40:23 +0200

excuse me for reordering the original:

> Anyways, back to the topic, can you please just relent and come to
> some kind of agreement about the fix for this alignment bug?

I thought we did?  I'm fine with the "v3",
it even carries my signed-of-by.

Whether or not Nicholas wants to prefix those headers with drbd_,
I don't really care.

> This is taking a very long time and patches are just rotting in
> patchwork with no resolution.  Why would 

Nicholas asked how to go about DRBD,
I suggested to use 0 as a padding attribute,
and after taking a detour, he did. All good.


Rest of original:

> > If we introduce a new config option,
> > we have to add it to the config scanner (one line),
> > define min, max, default and scale (four short defines),
> > and add it to the netlink definition here (one line).
> > Done, rest of the code is generated,
> > both on the kernel side,
> > and on the drbd-utils side used to talk to the kernel.
> > We found that to be very convenient.
> 
> But it entirely misses the core design point of netlink.
> 
> Sender and receive _DO NOT_ need to coordinate at all.  That's the
> whole point.  So tightly coupling such coordination is going to run
> you into all kinds of problems.
> 
> When implemented properly, the sender can emit whatever attributes it
> knows about and can generate, and the receive scans the attributes one
> by one and picks out the ones it understands and processes them.
> 
> If you go against this model
> then you have no clean way to

We don't.
We extend (not violate) that model, so the sender *may* indicate
to the recipient that for some particular attribute, the sender would
rather have an "I don't understand this" return than a silent ignore.
And that we can indicate in the definition of the attributes which ones
are required to make a message meaningful.

> extend things whilst allowing existing software to continue working.

*that* is exactly why we use netlink,
and why we do things with it the way we do.
Actually I think what we are doing there is, comparatively, "elegant".
You obviously don't have to agree.

I could discuss this in more detail,
but I assume you are not really interested,
at least not here and now.

Thanks,

    Lars
David Miller May 10, 2016, 7:26 p.m. UTC | #7
From: Lars Ellenberg <lars.ellenberg@linbit.com>
Date: Tue, 10 May 2016 21:09:03 +0200

> On Tue, May 10, 2016 at 11:39:49AM -0400, David Miller wrote:
>> From: Lars Ellenberg <lars.ellenberg@linbit.com>
>> Date: Tue, 10 May 2016 11:40:23 +0200
> 
> excuse me for reordering the original:
> 
>> Anyways, back to the topic, can you please just relent and come to
>> some kind of agreement about the fix for this alignment bug?
> 
> I thought we did?  I'm fine with the "v3",
> it even carries my signed-of-by.

My bad, I missed that, I'll apply v3 thanks a lot!
diff mbox

Patch

diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 1fd1dccebb6b..0bac9c8246bc 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -3633,14 +3633,15 @@  static int nla_put_status_info(struct sk_buff *skb, struct drbd_device *device,
 		goto nla_put_failure;
 	if (nla_put_u32(skb, T_sib_reason, sib ? sib->sib_reason : SIB_GET_STATUS_REPLY) ||
 	    nla_put_u32(skb, T_current_state, device->state.i) ||
-	    nla_put_u64(skb, T_ed_uuid, device->ed_uuid) ||
-	    nla_put_u64(skb, T_capacity, drbd_get_capacity(device->this_bdev)) ||
-	    nla_put_u64(skb, T_send_cnt, device->send_cnt) ||
-	    nla_put_u64(skb, T_recv_cnt, device->recv_cnt) ||
-	    nla_put_u64(skb, T_read_cnt, device->read_cnt) ||
-	    nla_put_u64(skb, T_writ_cnt, device->writ_cnt) ||
-	    nla_put_u64(skb, T_al_writ_cnt, device->al_writ_cnt) ||
-	    nla_put_u64(skb, T_bm_writ_cnt, device->bm_writ_cnt) ||
+	    nla_put_u64_0pad(skb, T_ed_uuid, device->ed_uuid) ||
+	    nla_put_u64_0pad(skb, T_capacity,
+			     drbd_get_capacity(device->this_bdev)) ||
+	    nla_put_u64_0pad(skb, T_send_cnt, device->send_cnt) ||
+	    nla_put_u64_0pad(skb, T_recv_cnt, device->recv_cnt) ||
+	    nla_put_u64_0pad(skb, T_read_cnt, device->read_cnt) ||
+	    nla_put_u64_0pad(skb, T_writ_cnt, device->writ_cnt) ||
+	    nla_put_u64_0pad(skb, T_al_writ_cnt, device->al_writ_cnt) ||
+	    nla_put_u64_0pad(skb, T_bm_writ_cnt, device->bm_writ_cnt) ||
 	    nla_put_u32(skb, T_ap_bio_cnt, atomic_read(&device->ap_bio_cnt)) ||
 	    nla_put_u32(skb, T_ap_pending_cnt, atomic_read(&device->ap_pending_cnt)) ||
 	    nla_put_u32(skb, T_rs_pending_cnt, atomic_read(&device->rs_pending_cnt)))
@@ -3657,13 +3658,16 @@  static int nla_put_status_info(struct sk_buff *skb, struct drbd_device *device,
 			goto nla_put_failure;
 
 		if (nla_put_u32(skb, T_disk_flags, device->ldev->md.flags) ||
-		    nla_put_u64(skb, T_bits_total, drbd_bm_bits(device)) ||
-		    nla_put_u64(skb, T_bits_oos, drbd_bm_total_weight(device)))
+		    nla_put_u64_0pad(skb, T_bits_total, drbd_bm_bits(device)) ||
+		    nla_put_u64_0pad(skb, T_bits_oos,
+				     drbd_bm_total_weight(device)))
 			goto nla_put_failure;
 		if (C_SYNC_SOURCE <= device->state.conn &&
 		    C_PAUSED_SYNC_T >= device->state.conn) {
-			if (nla_put_u64(skb, T_bits_rs_total, device->rs_total) ||
-			    nla_put_u64(skb, T_bits_rs_failed, device->rs_failed))
+			if (nla_put_u64_0pad(skb, T_bits_rs_total,
+					     device->rs_total) ||
+			    nla_put_u64_0pad(skb, T_bits_rs_failed,
+					     device->rs_failed))
 				goto nla_put_failure;
 		}
 	}
diff --git a/include/linux/genl_magic_struct.h b/include/linux/genl_magic_struct.h
index eecd19b37001..6270a56e5edc 100644
--- a/include/linux/genl_magic_struct.h
+++ b/include/linux/genl_magic_struct.h
@@ -62,6 +62,11 @@  extern void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void);
 
 /* MAGIC helpers							{{{2 */
 
+static inline int nla_put_u64_0pad(struct sk_buff *skb, int attrtype, u64 value)
+{
+	return nla_put_64bit(skb, attrtype, sizeof(u64), &value, 0);
+}
+
 /* possible field types */
 #define __flg_field(attr_nr, attr_flag, name) \
 	__field(attr_nr, attr_flag, name, NLA_U8, char, \
@@ -80,7 +85,7 @@  extern void CONCAT_(GENL_MAGIC_FAMILY, _genl_unregister)(void);
 			nla_get_u32, nla_put_u32, true)
 #define __u64_field(attr_nr, attr_flag, name)	\
 	__field(attr_nr, attr_flag, name, NLA_U64, __u64, \
-			nla_get_u64, nla_put_u64, false)
+			nla_get_u64, nla_put_u64_0pad, false)
 #define __str_field(attr_nr, attr_flag, name, maxlen) \
 	__array(attr_nr, attr_flag, name, NLA_NUL_STRING, char, maxlen, \
 			nla_strlcpy, nla_put, false)