diff mbox

Allow userspace code to use flag IFA_F_SECONDARY to specify an ip address to be primary or secondary ip on an interface

Message ID CAK3+h2y8eAKFC7BbnDwfZLzTw=t-G9qWSqznp4ggwDy7R_O6Ag@mail.gmail.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Vincent Li Sept. 24, 2013, 9:34 p.m. UTC
Thanks Julian for the comments, I imagined it would not be so simple
as it changed old behavior with ip binary and some actions in
__inet_del_ifa() that I am not fully aware of. my intention is to
preserve the old behavior and extend the flexibility, I am unable to
come up with a good patch to achieve the intended behavior.

I had to patch the ip binary to sort of preserve original ip binary
behavior with the kernel patch I provided., the ip command patch
below:

                if (strcmp(*argv, "peer") == 0 ||
@@ -1307,6 +1308,11 @@ static int ipaddr_modify(int cmd, int flags,
int argc, char **argv)
                                invarg("invalid scope value.", *argv);
                        req.ifa.ifa_scope = scope;
                        scoped = 1;
+                } else if (strcmp(*argv, "secondary") == 0 ||
+                           strcmp(*argv, "temporary") == 0) {
+                        req.ifa.ifa_flags |= IFA_F_SECONDARY;
+                } else if (strcmp(*argv, "primary") == 0) {
+                        req.ifa.ifa_flags &= ~IFA_F_SECONDARY;
                } else if (strcmp(*argv, "dev") == 0) {
                        NEXT_ARG();
                        d = *argv;

if someone can point me to the right patch directions or coming up
with better patches, it is very much appreciated.


On Tue, Sep 24, 2013 at 2:13 PM, Julian Anastasov <ja@ssi.bg> wrote:
>
>         Hello,
>
> On Tue, 24 Sep 2013, Vincent Li wrote:
>
>> the current behavior is when an IP is added to an interface, the primary
>> or secondary attributes is depending on the order of ip added to the interface
>> the first IP will be primary and second, third,... or alias IP will be secondary
>> if the IP subnet matches
>>
>> this patch add the flexiblity to allow user to specify an argument 'primary' or 'secondary'
>> (use 'ip addr add ip/mask primary|secondary dev ethX ' from iproute2 for example) to specify
>> an IP address to be  primary or secondary.
>>
>> the reason for this patch is that we have a multi blade cluster platform sharing 'floating management ip'
>> and also that each blade has its own management ip on the management interface, so whichever blade in the
>> cluster becomes primary blade, the 'floating mangaement ip' follows it, and we want any of our traffic
>> originated from the primary blade source from the 'floating management ip' for consistency. but in this
>> case, since the local blade management ip is always the primary ip on the mangaement interface and 'floating
>> management ip' is always secondary, kernel always choose the primary ip as source ip address. thus we would
>> like to add the flexibility in kernel to allow us to specify which ip to be primary or secondary.
>>
>> Signed-off-by: Vincent Li <vincent.mc.li@gmail.com>
>> ---
>>  net/ipv4/devinet.c |    9 +++++++--
>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
>> index a1b5bcb..bfc702a 100644
>> --- a/net/ipv4/devinet.c
>> +++ b/net/ipv4/devinet.c
>> @@ -440,9 +440,11 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
>>               return 0;
>>       }
>>
>> -     ifa->ifa_flags &= ~IFA_F_SECONDARY;
>>       last_primary = &in_dev->ifa_list;
>>
>> +     if((*last_primary) == NULL)
>> +             ifa->ifa_flags &= ~IFA_F_SECONDARY;
>> +
>>       for (ifap = &in_dev->ifa_list; (ifa1 = *ifap) != NULL;
>>            ifap = &ifa1->ifa_next) {
>>               if (!(ifa1->ifa_flags & IFA_F_SECONDARY) &&
>> @@ -458,7 +460,10 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
>>                               inet_free_ifa(ifa);
>>                               return -EINVAL;
>>                       }
>> -                     ifa->ifa_flags |= IFA_F_SECONDARY;
>
>         There is some confusion here, when ifa has
> IFA_F_SECONDARY bit set, in the 'else' we set it again.
> I guess the 'else' part is not needed.
>
>> +                        if (!(ifa->ifa_flags & IFA_F_SECONDARY))
>> +                                ifa1->ifa_flags |= IFA_F_SECONDARY;
>> +                        else
>> +                                ifa->ifa_flags |= IFA_F_SECONDARY;
>
>         It should not be so simple. You can not
> just change the flag of existing address (ifa1) to secondary.
> See __inet_del_ifa(), there are many more actions that
> follow:
>
> - kernel routes that use this primary address
> should be deleted and recreated with the new primary
> address as source. This includes local routes to secondary
> IPs.
>
> - RTM_NEWADDR should be sent, so that user space see
> the IFA_F_SECONDARY flag
>
>         Some questions:
>
> - should we allow adding of secondary IPs when no primary
> address exists for the subnet, it can happen when primary
> for another subnet already exists
>
> - by default, existing 'ip addr' binaries will provide
> address without IFA_F_SECONDARY flag. Before now we added
> such addresses as last for the subnet, now the
> behaviour changes, we start to add addresses in reverse
> order. Is that true? I.e. before now the operation was
> APPEND, now we need a way to provide PREPEND operation.
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Vincent Li Sept. 24, 2013, 9:54 p.m. UTC | #1
sorry Julian to miss your point after reading the __inet_del_ifa and
see the rtmsg_ifa, fib_del_ifaddr/fib_add_ifaddr, I can try another
patch and actually test if the patches changes works as it is
intended, not just checking from ip binary output.

Vincent

On Tue, Sep 24, 2013 at 2:34 PM, Vincent Li <vincent.mc.li@gmail.com> wrote:
> Thanks Julian for the comments, I imagined it would not be so simple
> as it changed old behavior with ip binary and some actions in
> __inet_del_ifa() that I am not fully aware of. my intention is to
> preserve the old behavior and extend the flexibility, I am unable to
> come up with a good patch to achieve the intended behavior.
>
> I had to patch the ip binary to sort of preserve original ip binary
> behavior with the kernel patch I provided., the ip command patch
> below:
>
> diff --git a/ip/ipaddress.c b/ip/ipaddress.c
> index 1c3e4da..9f2802c 100644
> --- a/ip/ipaddress.c
> +++ b/ip/ipaddress.c
> @@ -1259,6 +1259,7 @@ static int ipaddr_modify(int cmd, int flags, int
> argc, char **argv)
>         req.n.nlmsg_flags = NLM_F_REQUEST | flags;
>         req.n.nlmsg_type = cmd;
>         req.ifa.ifa_family = preferred_family;
> +       req.ifa.ifa_flags |= IFA_F_SECONDARY;
>
>         while (argc > 0) {
>                 if (strcmp(*argv, "peer") == 0 ||
> @@ -1307,6 +1308,11 @@ static int ipaddr_modify(int cmd, int flags,
> int argc, char **argv)
>                                 invarg("invalid scope value.", *argv);
>                         req.ifa.ifa_scope = scope;
>                         scoped = 1;
> +                } else if (strcmp(*argv, "secondary") == 0 ||
> +                           strcmp(*argv, "temporary") == 0) {
> +                        req.ifa.ifa_flags |= IFA_F_SECONDARY;
> +                } else if (strcmp(*argv, "primary") == 0) {
> +                        req.ifa.ifa_flags &= ~IFA_F_SECONDARY;
>                 } else if (strcmp(*argv, "dev") == 0) {
>                         NEXT_ARG();
>                         d = *argv;
>
> if someone can point me to the right patch directions or coming up
> with better patches, it is very much appreciated.
>
>
> On Tue, Sep 24, 2013 at 2:13 PM, Julian Anastasov <ja@ssi.bg> wrote:
>>
>>         Hello,
>>
>> On Tue, 24 Sep 2013, Vincent Li wrote:
>>
>>> the current behavior is when an IP is added to an interface, the primary
>>> or secondary attributes is depending on the order of ip added to the interface
>>> the first IP will be primary and second, third,... or alias IP will be secondary
>>> if the IP subnet matches
>>>
>>> this patch add the flexiblity to allow user to specify an argument 'primary' or 'secondary'
>>> (use 'ip addr add ip/mask primary|secondary dev ethX ' from iproute2 for example) to specify
>>> an IP address to be  primary or secondary.
>>>
>>> the reason for this patch is that we have a multi blade cluster platform sharing 'floating management ip'
>>> and also that each blade has its own management ip on the management interface, so whichever blade in the
>>> cluster becomes primary blade, the 'floating mangaement ip' follows it, and we want any of our traffic
>>> originated from the primary blade source from the 'floating management ip' for consistency. but in this
>>> case, since the local blade management ip is always the primary ip on the mangaement interface and 'floating
>>> management ip' is always secondary, kernel always choose the primary ip as source ip address. thus we would
>>> like to add the flexibility in kernel to allow us to specify which ip to be primary or secondary.
>>>
>>> Signed-off-by: Vincent Li <vincent.mc.li@gmail.com>
>>> ---
>>>  net/ipv4/devinet.c |    9 +++++++--
>>>  1 file changed, 7 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
>>> index a1b5bcb..bfc702a 100644
>>> --- a/net/ipv4/devinet.c
>>> +++ b/net/ipv4/devinet.c
>>> @@ -440,9 +440,11 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
>>>               return 0;
>>>       }
>>>
>>> -     ifa->ifa_flags &= ~IFA_F_SECONDARY;
>>>       last_primary = &in_dev->ifa_list;
>>>
>>> +     if((*last_primary) == NULL)
>>> +             ifa->ifa_flags &= ~IFA_F_SECONDARY;
>>> +
>>>       for (ifap = &in_dev->ifa_list; (ifa1 = *ifap) != NULL;
>>>            ifap = &ifa1->ifa_next) {
>>>               if (!(ifa1->ifa_flags & IFA_F_SECONDARY) &&
>>> @@ -458,7 +460,10 @@ static int __inet_insert_ifa(struct in_ifaddr *ifa, struct nlmsghdr *nlh,
>>>                               inet_free_ifa(ifa);
>>>                               return -EINVAL;
>>>                       }
>>> -                     ifa->ifa_flags |= IFA_F_SECONDARY;
>>
>>         There is some confusion here, when ifa has
>> IFA_F_SECONDARY bit set, in the 'else' we set it again.
>> I guess the 'else' part is not needed.
>>
>>> +                        if (!(ifa->ifa_flags & IFA_F_SECONDARY))
>>> +                                ifa1->ifa_flags |= IFA_F_SECONDARY;
>>> +                        else
>>> +                                ifa->ifa_flags |= IFA_F_SECONDARY;
>>
>>         It should not be so simple. You can not
>> just change the flag of existing address (ifa1) to secondary.
>> See __inet_del_ifa(), there are many more actions that
>> follow:
>>
>> - kernel routes that use this primary address
>> should be deleted and recreated with the new primary
>> address as source. This includes local routes to secondary
>> IPs.
>>
>> - RTM_NEWADDR should be sent, so that user space see
>> the IFA_F_SECONDARY flag
>>
>>         Some questions:
>>
>> - should we allow adding of secondary IPs when no primary
>> address exists for the subnet, it can happen when primary
>> for another subnet already exists
>>
>> - by default, existing 'ip addr' binaries will provide
>> address without IFA_F_SECONDARY flag. Before now we added
>> such addresses as last for the subnet, now the
>> behaviour changes, we start to add addresses in reverse
>> order. Is that true? I.e. before now the operation was
>> APPEND, now we need a way to provide PREPEND operation.
>>
>> Regards
>>
>> --
>> Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov Sept. 25, 2013, 7:08 a.m. UTC | #2
Hello,

On Tue, 24 Sep 2013, Vincent Li wrote:

> Thanks Julian for the comments, I imagined it would not be so simple
> as it changed old behavior with ip binary and some actions in
> __inet_del_ifa() that I am not fully aware of. my intention is to
> preserve the old behavior and extend the flexibility, I am unable to
> come up with a good patch to achieve the intended behavior.

...

> if someone can point me to the right patch directions or coming up
> with better patches, it is very much appreciated.

	My first idea was to use NLM_F_APPEND to implement
'ip addr prepend' and 'ip addr append' but the default
operation is 'append' without providing NLM_F_APPEND, so it
does not work.

	Another idea is to add new attribute IFA_PREFERENCE in
include/uapi/linux/if_addr.h just before __IFA_MAX, integer,
3 of the values are known. A preference for the used scope.

/* Add as last, default */
IFA_PREFERENCE_APPEND = 0,

/* Add as last primary, before any present primary in subnet */
IFA_PREFERENCE_PRIMARY = 128,

/* First for scope */
IFA_PREFERENCE_FIRST = 255,

	We should keep it in ifa as priority, for
sorting purposes. It can be 4-byte value, if user wants
to copy user-defined order into preference.

	Sorting order should be:

- all primaries sorted by decreasing scope, decreasing
priority and adding order

- then all secondaries (IFA_F_SECONDARY) sorted by decreasing
priority and adding order

	Usage:

ip addr add ... pref[erence] type_or_priority

# Add floating IP (append at priority 128)
# The primary mode is not guaranteed if another address from
# the same subnet is already using the same or higher priority.
ip addr add ... pref primary
# More preferred primary
ip addr add ... pref 129

# Add first IP for scope
ip addr add ... pref first

	The scope has similar 'sorting' property but not
for IPs in same subnet and it would be difficult to use
it for global routes.

	Thoughts?

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Vincent Li Sept. 25, 2013, 5:30 p.m. UTC | #3
I think it is good idea to add these preferences flags and sorted
them, but my code knowledge is limited to implement it  as I am still
learning, I can help testing :)

On Wed, Sep 25, 2013 at 12:08 AM, Julian Anastasov <ja@ssi.bg> wrote:
>
>         Hello,
>
> On Tue, 24 Sep 2013, Vincent Li wrote:
>
>> Thanks Julian for the comments, I imagined it would not be so simple
>> as it changed old behavior with ip binary and some actions in
>> __inet_del_ifa() that I am not fully aware of. my intention is to
>> preserve the old behavior and extend the flexibility, I am unable to
>> come up with a good patch to achieve the intended behavior.
>
> ...
>
>> if someone can point me to the right patch directions or coming up
>> with better patches, it is very much appreciated.
>
>         My first idea was to use NLM_F_APPEND to implement
> 'ip addr prepend' and 'ip addr append' but the default
> operation is 'append' without providing NLM_F_APPEND, so it
> does not work.
>
>         Another idea is to add new attribute IFA_PREFERENCE in
> include/uapi/linux/if_addr.h just before __IFA_MAX, integer,
> 3 of the values are known. A preference for the used scope.
>
> /* Add as last, default */
> IFA_PREFERENCE_APPEND = 0,
>
> /* Add as last primary, before any present primary in subnet */
> IFA_PREFERENCE_PRIMARY = 128,
>
> /* First for scope */
> IFA_PREFERENCE_FIRST = 255,
>
>         We should keep it in ifa as priority, for
> sorting purposes. It can be 4-byte value, if user wants
> to copy user-defined order into preference.
>
>         Sorting order should be:
>
> - all primaries sorted by decreasing scope, decreasing
> priority and adding order
>
> - then all secondaries (IFA_F_SECONDARY) sorted by decreasing
> priority and adding order
>
>         Usage:
>
> ip addr add ... pref[erence] type_or_priority
>
> # Add floating IP (append at priority 128)
> # The primary mode is not guaranteed if another address from
> # the same subnet is already using the same or higher priority.
> ip addr add ... pref primary
> # More preferred primary
> ip addr add ... pref 129
>
> # Add first IP for scope
> ip addr add ... pref first
>
>         The scope has similar 'sorting' property but not
> for IPs in same subnet and it would be difficult to use
> it for global routes.
>
>         Thoughts?
>
> Regards
>
> --
> Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Julian Anastasov Sept. 27, 2013, 7:26 p.m. UTC | #4
Hello,

On Wed, 25 Sep 2013, Vincent Li wrote:

> I think it is good idea to add these preferences flags and sorted
> them, but my code knowledge is limited to implement it  as I am still
> learning, I can help testing :)

	I can try it, if such idea looks good enough to
others.

> On Wed, Sep 25, 2013 at 12:08 AM, Julian Anastasov <ja@ssi.bg> wrote:
>
> >         My first idea was to use NLM_F_APPEND to implement
> > 'ip addr prepend' and 'ip addr append' but the default
> > operation is 'append' without providing NLM_F_APPEND, so it
> > does not work.
> >
> >         Another idea is to add new attribute IFA_PREFERENCE in
> > include/uapi/linux/if_addr.h just before __IFA_MAX, integer,
> > 3 of the values are known. A preference for the used scope.
> >
> > /* Add as last, default */
> > IFA_PREFERENCE_APPEND = 0,
> >
> > /* Add as last primary, before any present primary in subnet */
> > IFA_PREFERENCE_PRIMARY = 128,
> >
> > /* First for scope */
> > IFA_PREFERENCE_FIRST = 255,
> >
> >         We should keep it in ifa as priority, for
> > sorting purposes. It can be 4-byte value, if user wants
> > to copy user-defined order into preference.
> >
> >         Sorting order should be:
> >
> > - all primaries sorted by decreasing scope, decreasing
> > priority and adding order
> >
> > - then all secondaries (IFA_F_SECONDARY) sorted by decreasing
> > priority and adding order
> >
> >         Usage:
> >
> > ip addr add ... pref[erence] type_or_priority
> >
> > # Add floating IP (append at priority 128)
> > # The primary mode is not guaranteed if another address from
> > # the same subnet is already using the same or higher priority.
> > ip addr add ... pref primary
> > # More preferred primary
> > ip addr add ... pref 129
> >
> > # Add first IP for scope
> > ip addr add ... pref first
> >
> >         The scope has similar 'sorting' property but not
> > for IPs in same subnet and it would be difficult to use
> > it for global routes.
> >
> >         Thoughts?

Regards

--
Julian Anastasov <ja@ssi.bg>
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/ip/ipaddress.c b/ip/ipaddress.c
index 1c3e4da..9f2802c 100644
--- a/ip/ipaddress.c
+++ b/ip/ipaddress.c
@@ -1259,6 +1259,7 @@  static int ipaddr_modify(int cmd, int flags, int
argc, char **argv)
        req.n.nlmsg_flags = NLM_F_REQUEST | flags;
        req.n.nlmsg_type = cmd;
        req.ifa.ifa_family = preferred_family;
+       req.ifa.ifa_flags |= IFA_F_SECONDARY;

        while (argc > 0) {