Message ID | 20190507091118.24324-1-liuhangbin@gmail.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Headers | show |
Series | [net] fib_rules: return 0 directly if an exactly same rule exists when NLM_F_EXCL not supplied | expand |
From: Hangbin Liu <liuhangbin@gmail.com> Date: Tue, 7 May 2019 17:11:18 +0800 > With commit 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to > fib_nl_newrule") we now able to check if a rule already exists. But this > only works with iproute2. For other tools like libnl, NetworkManager, > it still could add duplicate rules with only NLM_F_CREATE flag, like > > [localhost ~ ]# ip rule > 0: from all lookup local > 32766: from all lookup main > 32767: from all lookup default > 100000: from 192.168.7.5 lookup 5 > 100000: from 192.168.7.5 lookup 5 > > As it doesn't make sense to create two duplicate rules, let's just return > 0 if the rule exists. > > Fixes: 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") > Reported-by: Thomas Haller <thaller@redhat.com> > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Applied and queued up for -stable, thanks.
FYI, this userspace visible change in behaviour breaks Android. We rely on being able to add a rule and either have a dup be created (in which case we'll remove it later) or have it fail with EEXIST (in which case we won't remove it later). Returning 0 makes atomically changing a rule difficult. Please revert. On Wed, May 8, 2019 at 9:39 AM David Miller <davem@davemloft.net> wrote: > > From: Hangbin Liu <liuhangbin@gmail.com> > Date: Tue, 7 May 2019 17:11:18 +0800 > > > With commit 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to > > fib_nl_newrule") we now able to check if a rule already exists. But this > > only works with iproute2. For other tools like libnl, NetworkManager, > > it still could add duplicate rules with only NLM_F_CREATE flag, like > > > > [localhost ~ ]# ip rule > > 0: from all lookup local > > 32766: from all lookup main > > 32767: from all lookup default > > 100000: from 192.168.7.5 lookup 5 > > 100000: from 192.168.7.5 lookup 5 > > > > As it doesn't make sense to create two duplicate rules, let's just return > > 0 if the rule exists. > > > > Fixes: 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") > > Reported-by: Thomas Haller <thaller@redhat.com> > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> > > Applied and queued up for -stable, thanks.
On Fri, May 31, 2019 at 06:43:42PM -0700, Maciej Żenczykowski wrote: > FYI, this userspace visible change in behaviour breaks Android. > > We rely on being able to add a rule and either have a dup be created > (in which case we'll remove it later) or have it fail with EEXIST (in > which case we won't remove it later). > > Returning 0 makes atomically changing a rule difficult. > > Please revert. That's crazy, but makes sense in an odd way :) And it explains why my "fix up of the patch" also breaks things, both patches need to be reverted in the stable trees. Do you need me to make a patch to revert this in Linus's tree now, or can you do that? Just asking for an existing commit to be reverted is usually a bit harder as that's not most maintainer's workflow (I know it's not mine.) thanks, gre gk-h
Hi David Ahern, On Fri, May 31, 2019 at 06:43:42PM -0700, Maciej Żenczykowski wrote: > FYI, this userspace visible change in behaviour breaks Android. > > We rely on being able to add a rule and either have a dup be created > (in which case we'll remove it later) or have it fail with EEXIST (in > which case we won't remove it later). > > Returning 0 makes atomically changing a rule difficult. > > Please revert. What do you think? Should I rever this commit? Although I'm still not clear what's the difference between a) adding a dup rule and remove it later and b) return 0 directly if the rule exactally the same. Thanks Hangbin > > On Wed, May 8, 2019 at 9:39 AM David Miller <davem@davemloft.net> wrote: > > > > From: Hangbin Liu <liuhangbin@gmail.com> > > Date: Tue, 7 May 2019 17:11:18 +0800 > > > > > With commit 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to > > > fib_nl_newrule") we now able to check if a rule already exists. But this > > > only works with iproute2. For other tools like libnl, NetworkManager, > > > it still could add duplicate rules with only NLM_F_CREATE flag, like > > > > > > [localhost ~ ]# ip rule > > > 0: from all lookup local > > > 32766: from all lookup main > > > 32767: from all lookup default > > > 100000: from 192.168.7.5 lookup 5 > > > 100000: from 192.168.7.5 lookup 5 > > > > > > As it doesn't make sense to create two duplicate rules, let's just return > > > 0 if the rule exists. > > > > > > Fixes: 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") > > > Reported-by: Thomas Haller <thaller@redhat.com> > > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> > > > > Applied and queued up for -stable, thanks.
On Wed, Jun 5, 2019 at 10:43 AM Hangbin Liu <liuhangbin@gmail.com> wrote: > Although I'm still not clear what's the difference between > > a) adding a dup rule and remove it later > and > b) return 0 directly if the rule exactally the same. The Android code updates ip rules by adding the new rule and then deleting the old rule. Before this patch, the result of the operation is that the old rule is deleted and the new rule exists. After this patch, if the new rule is the same as the old rule, then the add does nothing and the delete deletes the old rule. The result of the operation is that the old rule is deleted and the new rule is no longer there, and the rules are broken.
On Wed, Jun 05, 2019 at 10:47:58AM +0900, Lorenzo Colitti wrote: > On Wed, Jun 5, 2019 at 10:43 AM Hangbin Liu <liuhangbin@gmail.com> wrote: > > Although I'm still not clear what's the difference between > > > > a) adding a dup rule and remove it later > > and > > b) return 0 directly if the rule exactally the same. > > The Android code updates ip rules by adding the new rule and then > deleting the old rule. Before this patch, the result of the operation > is that the old rule is deleted and the new rule exists. After this > patch, if the new rule is the same as the old rule, then the add does > nothing and the delete deletes the old rule. The result of the > operation is that the old rule is deleted and the new rule is no > longer there, and the rules are broken. Hi Lorenzo, How do you add the rules? with ip cmd it should has NLM_F_EXCL flag and you will get -EEXIST error out. Thanks Hangbin
On Wed, Jun 5, 2019 at 11:15 AM Hangbin Liu <liuhangbin@gmail.com> wrote: > How do you add the rules? with ip cmd it should has NLM_F_EXCL flag and > you will get -EEXIST error out. The fact that the code worked before this commit implies that it was *not* using NLM_F_EXCL. :-) The code is here if you want to take a look. https://android.googlesource.com/platform/system/netd/+/master/server/RouteController.cpp#576 https://android.googlesource.com/platform/system/netd/+/master/server/NetlinkCommands.h#33
On Wed, Jun 05, 2019 at 11:25:11AM +0900, Lorenzo Colitti wrote: > On Wed, Jun 5, 2019 at 11:15 AM Hangbin Liu <liuhangbin@gmail.com> wrote: > > How do you add the rules? with ip cmd it should has NLM_F_EXCL flag and > > you will get -EEXIST error out. > > The fact that the code worked before this commit implies that it was > *not* using NLM_F_EXCL. :-) Yes, that's why you got the issue. > We rely on being able to add a rule and either have a dup be created > (in which case we'll remove it later) or have it fail with EEXIST (in > which case we won't remove it later). With Maciej said, how about add NLM_F_EXCL flag when you add a new rule. If it returned EEXIST, which means there is an dup rule, you just do not remove it later. Would that fix your issue? Thanks Hangbin
On Wed, Jun 5, 2019 at 12:29 PM Hangbin Liu <liuhangbin@gmail.com> wrote: > > We rely on being able to add a rule and either have a dup be created > > (in which case we'll remove it later) or have it fail with EEXIST (in > > which case we won't remove it later). > > With Maciej said, how about add NLM_F_EXCL flag when you add a new rule. > If it returned EEXIST, which means there is an dup rule, you just do not > remove it later. > > Would that fix your issue? We can't do that without rewriting our code and making it more complex. The way the code is structured is that an update is "add all new rules; delete all old rules". To do what you suggest we would need to either change that to "for rule in rules; add newrule; delete oldrule" or we'd need to keep state on which rules already existed. The previous behaviour provided semantics that are useful to userspace, and this commit broke those semantics. Please revert.
On 6/4/19 7:43 PM, Hangbin Liu wrote: > Hi David Ahern, > > On Fri, May 31, 2019 at 06:43:42PM -0700, Maciej Żenczykowski wrote: >> FYI, this userspace visible change in behaviour breaks Android. >> >> We rely on being able to add a rule and either have a dup be created >> (in which case we'll remove it later) or have it fail with EEXIST (in >> which case we won't remove it later). >> >> Returning 0 makes atomically changing a rule difficult. >> >> Please revert. > What do you think? Should I rever this commit? I think it is crazy to add multiple identical rules given the linear effect on performance. But, since it breaks Android, it has to be reverted.
On Wed, Jun 05, 2019 at 12:43:44PM +0900, Lorenzo Colitti wrote: > On Wed, Jun 5, 2019 at 12:29 PM Hangbin Liu <liuhangbin@gmail.com> wrote: > > > We rely on being able to add a rule and either have a dup be created > > > (in which case we'll remove it later) or have it fail with EEXIST (in > > > which case we won't remove it later). > > > > With Maciej said, how about add NLM_F_EXCL flag when you add a new rule. > > If it returned EEXIST, which means there is an dup rule, you just do not > > remove it later. > > > > Would that fix your issue? > > We can't do that without rewriting our code and making it more > complex. The way the code is structured is that an update is "add all > new rules; delete all old rules". To do what you suggest we would need > to either change that to "for rule in rules; add newrule; delete > oldrule" or we'd need to keep state on which rules already existed. Hmm...Generally speaking we need to check the cmd's return value, or why it exists. > > The previous behaviour provided semantics that are useful to > userspace, and this commit broke those semantics. Please revert. Keep two exactally same rules in kernel looks strange and doesn't make sense to me. But let's see what does David Ahern think, he is more experienced in this part :) Thanks Hangbin
On Tue, Jun 04, 2019 at 09:57:56PM -0600, David Ahern wrote: > On 6/4/19 7:43 PM, Hangbin Liu wrote: > > Hi David Ahern, > > > > On Fri, May 31, 2019 at 06:43:42PM -0700, Maciej Żenczykowski wrote: > >> FYI, this userspace visible change in behaviour breaks Android. > >> > >> We rely on being able to add a rule and either have a dup be created > >> (in which case we'll remove it later) or have it fail with EEXIST (in > >> which case we won't remove it later). > >> > >> Returning 0 makes atomically changing a rule difficult. > >> > >> Please revert. > > What do you think? Should I rever this commit? > > I think it is crazy to add multiple identical rules given the linear > effect on performance. But, since it breaks Android, it has to be reverted. OK.. Since you also agree to revert it. Thanks Hangbin
On Wed, Jun 5, 2019 at 12:58 PM David Ahern <dsa@cumulusnetworks.com> wrote: > I think it is crazy to add multiple identical rules given the linear > effect on performance. Not sure if this is what you were implying or not, but our code doesn't maintain multiple identical rules in steady state. It only uses them for make-before-break when something changes. > But, since it breaks Android, it has to be reverted. Well... the immediate problem on Android is that we cannot live with this going to LTS, since it is going to break devices in the field. As for making this change in 5.3: we might be able to structure the code differently in a future Android release, assuming the same userspace code can work on kernels back to 4.4 (not sure it can, since the semantics changed in 4.8). But even if we can fix this in Android, this change is still breaking compatibility with existing other userspace code. Are there concrete performance optimizations that you'd like to make that can't be made unless you change the semantics here? Are those optimizations worth breaking the backwards compatibility guarantees for?
On 6/4/19 10:58 PM, Lorenzo Colitti wrote: > As for making this change in 5.3: we might be able to structure the > code differently in a future Android release, assuming the same > userspace code can work on kernels back to 4.4 (not sure it can, since > the semantics changed in 4.8). But even if we can fix this in Android, > this change is still breaking compatibility with existing other > userspace code. Are there concrete performance optimizations that > you'd like to make that can't be made unless you change the semantics > here? Are those optimizations worth breaking the backwards > compatibility guarantees for? The list of fib rules is walked looking for a match. more rules = more overhead. Given the flexibility of the rules, I have not come up with any changes that have a real improvement in that overhead. VRF, which uses policy routing, was changed to have a single l3mdev rule for all VRFs for this reason.
(side note: change was reverted in net stable) On Wed, Jun 5, 2019 at 8:33 AM David Ahern <dsahern@gmail.com> wrote: > On 6/4/19 10:58 PM, Lorenzo Colitti wrote: > > As for making this change in 5.3: we might be able to structure the > > code differently in a future Android release, assuming the same > > userspace code can work on kernels back to 4.4 (not sure it can, since > > the semantics changed in 4.8). But even if we can fix this in Android, > > this change is still breaking compatibility with existing other > > userspace code. Are there concrete performance optimizations that > > you'd like to make that can't be made unless you change the semantics > > here? Are those optimizations worth breaking the backwards > > compatibility guarantees for? > > The list of fib rules is walked looking for a match. more rules = more > overhead. Given the flexibility of the rules, I have not come up with > any changes that have a real improvement in that overhead. VRF, which > uses policy routing, was changed to have a single l3mdev rule for all > VRFs for this reason. Instead of keeping duplicates there could be a counter on the singleton rule. And then adding/removing is just inc/dec on the counter (and only actually delete when counter drops to 0). Would require some extra effort to make it look the same when dumping I guess (to expand it out). --- I'm not sure this is worth optimizing for. In Android these states with dupes are temporary. And really, if you complain about performance it is perfectly reasonable to say "then don't create duplicate rules". --- Note though that from a multithreaded perspective, you'd never want the 'ignore it and return 0' behaviour. It's fundamentally a bad api. It's better to return an error and userspace can decide that EEXIST is acceptable and treat it as 0. Imagine two threads doing 'add ip rule foo; blah blah; remove ip rule foo' You either want the create dupes, or the 'return EEXIST' behaviour (so the second thread can fail, or wait until it succeeds or whatever). This way if two threads both run the same operation, either both of them succeed, or one succeeds and the other gets notified of dup. Otherwise you get a spurious failure in one of the threads and bad behaviour in the other (since rule vanishes before it should).
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c index ffbb827723a2..c49b752ea7eb 100644 --- a/net/core/fib_rules.c +++ b/net/core/fib_rules.c @@ -756,9 +756,9 @@ int fib_nl_newrule(struct sk_buff *skb, struct nlmsghdr *nlh, if (err) goto errout; - if ((nlh->nlmsg_flags & NLM_F_EXCL) && - rule_exists(ops, frh, tb, rule)) { - err = -EEXIST; + if (rule_exists(ops, frh, tb, rule)) { + if (nlh->nlmsg_flags & NLM_F_EXCL) + err = -EEXIST; goto errout_free; }
With commit 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") we now able to check if a rule already exists. But this only works with iproute2. For other tools like libnl, NetworkManager, it still could add duplicate rules with only NLM_F_CREATE flag, like [localhost ~ ]# ip rule 0: from all lookup local 32766: from all lookup main 32767: from all lookup default 100000: from 192.168.7.5 lookup 5 100000: from 192.168.7.5 lookup 5 As it doesn't make sense to create two duplicate rules, let's just return 0 if the rule exists. Fixes: 153380ec4b9 ("fib_rules: Added NLM_F_EXCL support to fib_nl_newrule") Reported-by: Thomas Haller <thaller@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> --- net/core/fib_rules.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)