mbox series

[0/4,SRU,focal:linux] NFSv4.1: Don't rebind to the same source port when reconnecting to the server

Message ID 20221122203914.344894-1-tim.gardner@canonical.com
Headers show
Series NFSv4.1: Don't rebind to the same source port when reconnecting to the server | expand

Message

Tim Gardner Nov. 22, 2022, 8:39 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/1997488

SRU Justification

Microsoft asked for this backport complaining that, 
"NFS 4.1 mounts on AKS are reusing the same source port when a reconnect needs to happen".

Patches 1-3 are scaffolding in order to provide a clean cherrypick for patch 4.
This issue looks applicable to the generic kernel.

[Impact]

NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
at the source port when deciding whether or not an RPC call is a replay
of a previous call. This requires clients to perform strange TCP gymnastics
in order to ensure that when they reconnect to the server, they bind
to the same source port.

NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
that do not look at the source port of the connection. This patch therefore
ensures they can ignore the rebind requirement.

[Where things could go wrong]

NFS reconnects may erroneously fail.

[Other Info]

SF: #00345839

Comments

Khalid Elmously Nov. 22, 2022, 9:11 p.m. UTC | #1
On 2022-11-22 13:39:10 , Tim Gardner wrote:
> BugLink: https://bugs.launchpad.net/bugs/1997488
> 
> SRU Justification
> 
> Microsoft asked for this backport complaining that, 
> "NFS 4.1 mounts on AKS are reusing the same source port when a reconnect needs to happen".
> 
> Patches 1-3 are scaffolding in order to provide a clean cherrypick for patch 4.
> This issue looks applicable to the generic kernel.
> 
> [Impact]
> 
> NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
> at the source port when deciding whether or not an RPC call is a replay
> of a previous call. This requires clients to perform strange TCP gymnastics
> in order to ensure that when they reconnect to the server, they bind
> to the same source port.
> 
> NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
> that do not look at the source port of the connection. This patch therefore
> ensures they can ignore the rebind requirement.
> 
> [Where things could go wrong]
> 
> NFS reconnects may erroneously fail.
> 
> [Other Info]
> 
> SF: #00345839
> 

Would this not cause issues for users that run NFSv < 4.1 who don't have proper replay semantics and still rely on source-port reuse?




> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Tim Gardner Nov. 23, 2022, 1:20 p.m. UTC | #2
On 11/22/22 2:11 PM, Khaled Elmously wrote:
> On 2022-11-22 13:39:10 , Tim Gardner wrote:
>> BugLink: https://bugs.launchpad.net/bugs/1997488
>>
>> SRU Justification
>>
>> Microsoft asked for this backport complaining that,
>> "NFS 4.1 mounts on AKS are reusing the same source port when a reconnect needs to happen".
>>
>> Patches 1-3 are scaffolding in order to provide a clean cherrypick for patch 4.
>> This issue looks applicable to the generic kernel.
>>
>> [Impact]
>>
>> NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
>> at the source port when deciding whether or not an RPC call is a replay
>> of a previous call. This requires clients to perform strange TCP gymnastics
>> in order to ensure that when they reconnect to the server, they bind
>> to the same source port.
>>
>> NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
>> that do not look at the source port of the connection. This patch therefore
>> ensures they can ignore the rebind requirement.
>>
>> [Where things could go wrong]
>>
>> NFS reconnects may erroneously fail.
>>
>> [Other Info]
>>
>> SF: #00345839
>>
> 
> Would this not cause issues for users that run NFSv < 4.1 who don't have proper replay semantics and still rely on source-port reuse?
> 

Aren't NFS version updates backwards compatible ?
Khalid Elmously Nov. 23, 2022, 9:23 p.m. UTC | #3
On 2022-11-23 06:20:28 , Tim Gardner wrote:
> On 11/22/22 2:11 PM, Khaled Elmously wrote:
> > On 2022-11-22 13:39:10 , Tim Gardner wrote:
> > > BugLink: https://bugs.launchpad.net/bugs/1997488
> > > 
> > > SRU Justification
> > > 
> > > Microsoft asked for this backport complaining that,
> > > "NFS 4.1 mounts on AKS are reusing the same source port when a reconnect needs to happen".
> > > 
> > > Patches 1-3 are scaffolding in order to provide a clean cherrypick for patch 4.
> > > This issue looks applicable to the generic kernel.
> > > 
> > > [Impact]
> > > 
> > > NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
> > > at the source port when deciding whether or not an RPC call is a replay
> > > of a previous call. This requires clients to perform strange TCP gymnastics
> > > in order to ensure that when they reconnect to the server, they bind
> > > to the same source port.
> > > 
> > > NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
> > > that do not look at the source port of the connection. This patch therefore
> > > ensures they can ignore the rebind requirement.
> > > 
> > > [Where things could go wrong]
> > > 
> > > NFS reconnects may erroneously fail.
> > > 
> > > [Other Info]
> > > 
> > > SF: #00345839
> > > 
> > 
> > Would this not cause issues for users that run NFSv < 4.1 who don't have proper replay semantics and still rely on source-port reuse?
> > 
> 
> Aren't NFS version updates backwards compatible ?

Yes, you're right.

--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -880,8 +880,11 @@ static int nfs4_set_client(struct nfs_server *server,
        };
        struct nfs_client *clp;

-       if (minorversion > 0 && proto == XPRT_TRANSPORT_TCP)
+       if (minorversion == 0)
+               __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
+       else if (proto == XPRT_TRANSPORT_TCP)
                cl_init.nconnect = nconnect;



Sorry ignore the noise.


> 
> -- 
> -----------
> Tim Gardner
> Canonical, Inc
>
Khalid Elmously Nov. 23, 2022, 9:32 p.m. UTC | #4
I was wrong about changing the behaviour of NFSv4.0 - it does appear to be kept unchanged, as per:

-       if (minorversion > 0 && proto == XPRT_TRANSPORT_TCP)
+       if (minorversion == 0)
+               __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
+       else if (proto == XPRT_TRANSPORT_TCP)
                cl_init.nconnect = nconnect;



But you do appear to making a small change in the behaviour of NFSv3.x though, as per patches 2 and 3. It seems you needed patch 2 only to be able to cleanly cherry-pick patch 3, but that ends up introducing the NFS_CS_DS flag to NFSv3.x clients.

Would it be safer to omit patch 2 entirely and skip the changes to fs/nfs/nfs3client.c from patch 3? It should have no effect on your final intended fix.





On 2022-11-22 13:39:10 , Tim Gardner wrote:
> BugLink: https://bugs.launchpad.net/bugs/1997488
> 
> SRU Justification
> 
> Microsoft asked for this backport complaining that, 
> "NFS 4.1 mounts on AKS are reusing the same source port when a reconnect needs to happen".
> 
> Patches 1-3 are scaffolding in order to provide a clean cherrypick for patch 4.
> This issue looks applicable to the generic kernel.
> 
> [Impact]
> 
> NFSv2, v3 and NFSv4 servers often have duplicate replay caches that look
> at the source port when deciding whether or not an RPC call is a replay
> of a previous call. This requires clients to perform strange TCP gymnastics
> in order to ensure that when they reconnect to the server, they bind
> to the same source port.
> 
> NFSv4.1 and NFSv4.2 have sessions that provide proper replay semantics,
> that do not look at the source port of the connection. This patch therefore
> ensures they can ignore the rebind requirement.
> 
> [Where things could go wrong]
> 
> NFS reconnects may erroneously fail.
> 
> [Other Info]
> 
> SF: #00345839
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Tim Gardner Nov. 28, 2022, 2:39 p.m. UTC | #5
On 11/23/22 2:32 PM, Khaled Elmously wrote:
> I was wrong about changing the behaviour of NFSv4.0 - it does appear
> to be kept unchanged, as per:
> 
> -       if (minorversion > 0 && proto == XPRT_TRANSPORT_TCP) +
> if (minorversion == 0) +               __set_bit(NFS_CS_REUSEPORT,
> &cl_init.init_flags); +       else if (proto == XPRT_TRANSPORT_TCP) 
> cl_init.nconnect = nconnect;
> 
> 
> 
> But you do appear to making a small change in the behaviour of
> NFSv3.x though, as per patches 2 and 3. It seems you needed patch 2
> only to be able to cleanly cherry-pick patch 3, but that ends up
> introducing the NFS_CS_DS flag to NFSv3.x clients.
> 
> Would it be safer to omit patch 2 entirely and skip the changes to
> fs/nfs/nfs3client.c from patch 3? It should have no effect on your
> final intended fix.
> 

You are correct, there is a subtle change in behavior with patch 2. Same 
with patch 3. I'll post a v2 with just patch 4 backported without these 
changes.

rtg