diff mbox series

[SMB,CLIENT] do not reserve too many oplock credits

Message ID CAH2r5mvuuCeQQKN+RRxoELjf9NOfLNOwgOjBbxdUKYiowsbY_w@mail.gmail.com
State New
Headers show
Series [SMB,CLIENT] do not reserve too many oplock credits | expand

Commit Message

Steve French June 20, 2023, 3:41 a.m. UTC
There were cases reported where servers will sometimes return more
credits than requested on oplock break responses, which can lead to
most of the credits being allocated for oplock breaks (instead of
for normal operations like read and write) if number of SMB3 requests
in flight always stays above 0 (the oplock and echo credits are
rebalanced when in flight requests goes down to zero).

If oplock credits gets unexpectedly large (e.g. ten is more than it
would ever be expected to be) and in flight requests are greater than
zero, then rebalance the oplock credits and regular credits (go
back to reserving just one oplock credit.

See attached

Comments

Shyam Prasad N June 20, 2023, 7:48 a.m. UTC | #1
On Tue, Jun 20, 2023 at 9:12 AM Steve French <smfrench@gmail.com> wrote:
>
> There were cases reported where servers will sometimes return more
> credits than requested on oplock break responses, which can lead to
> most of the credits being allocated for oplock breaks (instead of
> for normal operations like read and write) if number of SMB3 requests
> in flight always stays above 0 (the oplock and echo credits are
> rebalanced when in flight requests goes down to zero).
>
> If oplock credits gets unexpectedly large (e.g. ten is more than it
> would ever be expected to be) and in flight requests are greater than
> zero, then rebalance the oplock credits and regular credits (go
> back to reserving just one oplock credit.
>
> See attached
>
> --
> Thanks,
>
> Steve

Hi Steve,

> If oplock credits gets unexpectedly large (e.g. ten is more than it
> would ever be expected to be) and in flight requests are greater than
> zero, then rebalance the oplock credits and regular credits (go
> back to reserving just one oplock crdit).

Why this value of 10? I would go with 1, since we already reserve 1
credit for oplocks.
If the reasoning is to have enough credits to send multiple
lease/oplock acks, we should
change the reserved count altogether.
Steve French June 21, 2023, 3:57 a.m. UTC | #2
> Why this value of 10? I would go with 1, since we already reserve 1
credit for oplocks. If the reasoning is to have enough credits to send multiple
lease/oplock acks, we should change the reserved count altogether.

I think there could be some value in sending multiple lease break
responses (ie allow oplock credits to be a few more than 1), but my
main reasoning for this was to pick some number that was "safe"
(allowing 10 oplock/lease-break credits while in flight count is
non-zero is unlikely to be a problem) and would be unlikely to change
existing behavior.

My thinking was that today's code allows oplock credits to be above 1
(and keep growing in the server scenario you noticed) while multiple
requests continue to be in flight - so there could potentially be a
performance benefit during this period of high activity in having a
few lease breaks in flight at one time and unlikely to hurt anything -
but more importantly if we change the code to never allow oplock/lease
credits to be above one we could (unlikely but possible) have subtle
behavior changes that trigger a bug (since we would then have cases to
at least some servers where we never have two lease breaks in flight).
It seemed harmless to set the threshold to something slightly more
than one (so multiple lease breaks in flight would still be possible
and thus behavior would not change - but risk of credit starvation is
gone).    If you prefer - I could pick a number like 2 or 3 credits
instead of 10.  My intent was just to make it extremely unlikely that
any behavior would change (but would still fix the possible credit
starvation scenario) - so 2 or 3 would also probably be fine.

On Tue, Jun 20, 2023 at 2:48 AM Shyam Prasad N <nspmangalore@gmail.com> wrote:
>
> On Tue, Jun 20, 2023 at 9:12 AM Steve French <smfrench@gmail.com> wrote:
> >
> > There were cases reported where servers will sometimes return more
> > credits than requested on oplock break responses, which can lead to
> > most of the credits being allocated for oplock breaks (instead of
> > for normal operations like read and write) if number of SMB3 requests
> > in flight always stays above 0 (the oplock and echo credits are
> > rebalanced when in flight requests goes down to zero).
> >
> > If oplock credits gets unexpectedly large (e.g. ten is more than it
> > would ever be expected to be) and in flight requests are greater than
> > zero, then rebalance the oplock credits and regular credits (go
> > back to reserving just one oplock credit.
> >
> > See attached
> >
> > --
> > Thanks,
> >
> > Steve
>
> Hi Steve,
>
> > If oplock credits gets unexpectedly large (e.g. ten is more than it
> > would ever be expected to be) and in flight requests are greater than
> > zero, then rebalance the oplock credits and regular credits (go
> > back to reserving just one oplock crdit).
>
> Why this value of 10? I would go with 1, since we already reserve 1
> credit for oplocks.
> If the reasoning is to have enough credits to send multiple
> lease/oplock acks, we should
> change the reserved count altogether.
>
> --
> Regards,
> Shyam
Tom Talpey June 21, 2023, 6:25 p.m. UTC | #3
On 6/20/2023 11:57 PM, Steve French wrote:
>> Why this value of 10? I would go with 1, since we already reserve 1
> credit for oplocks. If the reasoning is to have enough credits to send multiple
> lease/oplock acks, we should change the reserved count altogether.
> 
> I think there could be some value in sending multiple lease break
> responses (ie allow oplock credits to be a few more than 1), but my
> main reasoning for this was to pick some number that was "safe"
> (allowing 10 oplock/lease-break credits while in flight count is
> non-zero is unlikely to be a problem) and would be unlikely to change
> existing behavior.
> 
> My thinking was that today's code allows oplock credits to be above 1
> (and keep growing in the server scenario you noticed) while multiple
> requests continue to be in flight - so there could potentially be a
> performance benefit during this period of high activity in having a
> few lease breaks in flight at one time and unlikely to hurt anything -
> but more importantly if we change the code to never allow oplock/lease
> credits to be above one we could (unlikely but possible) have subtle
> behavior changes that trigger a bug (since we would then have cases to
> at least some servers where we never have two lease breaks in flight).
> It seemed harmless to set the threshold to something slightly more
> than one (so multiple lease breaks in flight would still be possible
> and thus behavior would not change - but risk of credit starvation is
> gone).    If you prefer - I could pick a number like 2 or 3 credits
> instead of 10.  My intent was just to make it extremely unlikely that
> any behavior would change (but would still fix the possible credit
> starvation scenario) - so 2 or 3 would also probably be fine.

What do you mean by "oplock credits"? There's no such field in the
protocol. Is this some sort of reserved pool for the server to send
unsolicited messages, such as recalls or STATUS_PENDING async stuff?

If so, I really don't think any constant is appropriate. If the client
can't calculate an expected number, we should keep it quite small.

Tom.

> On Tue, Jun 20, 2023 at 2:48 AM Shyam Prasad N <nspmangalore@gmail.com> wrote:
>>
>> On Tue, Jun 20, 2023 at 9:12 AM Steve French <smfrench@gmail.com> wrote:
>>>
>>> There were cases reported where servers will sometimes return more
>>> credits than requested on oplock break responses, which can lead to
>>> most of the credits being allocated for oplock breaks (instead of
>>> for normal operations like read and write) if number of SMB3 requests
>>> in flight always stays above 0 (the oplock and echo credits are
>>> rebalanced when in flight requests goes down to zero).
>>>
>>> If oplock credits gets unexpectedly large (e.g. ten is more than it
>>> would ever be expected to be) and in flight requests are greater than
>>> zero, then rebalance the oplock credits and regular credits (go
>>> back to reserving just one oplock credit.
>>>
>>> See attached
>>>
>>> --
>>> Thanks,
>>>
>>> Steve
>>
>> Hi Steve,
>>
>>> If oplock credits gets unexpectedly large (e.g. ten is more than it
>>> would ever be expected to be) and in flight requests are greater than
>>> zero, then rebalance the oplock credits and regular credits (go
>>> back to reserving just one oplock crdit).
>>
>> Why this value of 10? I would go with 1, since we already reserve 1
>> credit for oplocks.
>> If the reasoning is to have enough credits to send multiple
>> lease/oplock acks, we should
>> change the reserved count altogether.
>>
>> --
>> Regards,
>> Shyam
> 
> 
>
Steve French June 22, 2023, 4:56 a.m. UTC | #4
On Wed, Jun 21, 2023 at 1:25 PM Tom Talpey <tom@talpey.com> wrote:
>
> On 6/20/2023 11:57 PM, Steve French wrote:
> >> Why this value of 10? I would go with 1, since we already reserve 1
> > credit for oplocks. If the reasoning is to have enough credits to send multiple
> > lease/oplock acks, we should change the reserved count altogether.
> >
> > I think there could be some value in sending multiple lease break
> > responses (ie allow oplock credits to be a few more than 1), but my
> > main reasoning for this was to pick some number that was "safe"
> > (allowing 10 oplock/lease-break credits while in flight count is
> > non-zero is unlikely to be a problem) and would be unlikely to change
> > existing behavior.
> >
> > My thinking was that today's code allows oplock credits to be above 1
> > (and keep growing in the server scenario you noticed) while multiple
> > requests continue to be in flight - so there could potentially be a
> > performance benefit during this period of high activity in having a
> > few lease breaks in flight at one time and unlikely to hurt anything -
> > but more importantly if we change the code to never allow oplock/lease
> > credits to be above one we could (unlikely but possible) have subtle
> > behavior changes that trigger a bug (since we would then have cases to
> > at least some servers where we never have two lease breaks in flight).
> > It seemed harmless to set the threshold to something slightly more
> > than one (so multiple lease breaks in flight would still be possible
> > and thus behavior would not change - but risk of credit starvation is
> > gone).    If you prefer - I could pick a number like 2 or 3 credits
> > instead of 10.  My intent was just to make it extremely unlikely that
> > any behavior would change (but would still fix the possible credit
> > starvation scenario) - so 2 or 3 would also probably be fine.
>
> What do you mean by "oplock credits"? There's no such field in the
> protocol. Is this some sort of reserved pool

The client divides the total credits granted by the server into three
buckets (see struct TCP_Server_Info)
        int echo_credits;  /* echo reserved slots */
        int oplock_credits;  /* lease/oplock break reserved slots */
        int credits;  /* credits reserved for all other operation types */

If we run low on credits we can disable (temporarily) leases and
sending echo requests so we can continue to send other requests (open,
read, write, close etc.).    As an example, if the server has granted
us 512 credits (total) if there were 4 large writes that were
responded to very slowly (and used up all of our credits), we could
time out if the write responses were very slow - since we would have
no way of sending an echo request periodically to see if the server
were still alive) - since we have 1 credit reserved for echo requests
though, even if the responses to the writes were slow, we would be
able to ping the server with an echo request to make sure it is still
alive.   Similarly (and this can be very important with some servers
who could hold up granting credits if a lease break is pending) - we
have to be able to respond to a lease break even if all of our credits
are used up with large reads or writes so we reserve at least one
credit for sending lease break responses.

The easiest way to think about this is that we reserve 1 credit for
echo and 1 credit for leases (although it can grow larger as we saw in
some server scenarios when they grant us more credits on lease break
responses than we asked for), but all the reset (all other remaining
credits) are available to send read or write or open or close (etc.).
  When requests in flight goes back to zero we rebalance credits to
make sure we still have 1 reserved for echo and 1 reserved for
oplock/lease break responses (and everything else for the other
operation types).

The scenario we were having a problem with though was one in which
requests inflight stayed above 0 for a long time and the server often
granted more credits than we asked for on lease break responses - this
caused the number of oplock/lease credits reserved to be larger than
the amount of credits for everything else and eventually starved the
client credits needed for normal operations (this would normally not
be an issue but the number of requests in flight stayed above zero for
a long time which kept us from rebalancing credits, and moving credits
back into the main credit pool from the oplock/lease break reserved
category - which could significantly hurt performance).

There is normally not a problem with having more than one credit in
the lease break pool, but when it grows particularly large it could
hurt performance (or even hang).





> If so, I really don't think any constant is appropriate. If the client
> can't calculate an expected number, we should keep it quite small.
Shyam Prasad N June 22, 2023, 6:42 a.m. UTC | #5
On Thu, Jun 22, 2023 at 10:26 AM Steve French <smfrench@gmail.com> wrote:
>
> On Wed, Jun 21, 2023 at 1:25 PM Tom Talpey <tom@talpey.com> wrote:
> >
> > On 6/20/2023 11:57 PM, Steve French wrote:
> > >> Why this value of 10? I would go with 1, since we already reserve 1
> > > credit for oplocks. If the reasoning is to have enough credits to send multiple
> > > lease/oplock acks, we should change the reserved count altogether.
> > >
> > > I think there could be some value in sending multiple lease break
> > > responses (ie allow oplock credits to be a few more than 1), but my
> > > main reasoning for this was to pick some number that was "safe"
> > > (allowing 10 oplock/lease-break credits while in flight count is
> > > non-zero is unlikely to be a problem) and would be unlikely to change
> > > existing behavior.
> > >
> > > My thinking was that today's code allows oplock credits to be above 1
> > > (and keep growing in the server scenario you noticed) while multiple
> > > requests continue to be in flight - so there could potentially be a
> > > performance benefit during this period of high activity in having a
> > > few lease breaks in flight at one time and unlikely to hurt anything -
> > > but more importantly if we change the code to never allow oplock/lease
> > > credits to be above one we could (unlikely but possible) have subtle
> > > behavior changes that trigger a bug (since we would then have cases to
> > > at least some servers where we never have two lease breaks in flight).
> > > It seemed harmless to set the threshold to something slightly more
> > > than one (so multiple lease breaks in flight would still be possible
> > > and thus behavior would not change - but risk of credit starvation is
> > > gone).    If you prefer - I could pick a number like 2 or 3 credits
> > > instead of 10.  My intent was just to make it extremely unlikely that
> > > any behavior would change (but would still fix the possible credit
> > > starvation scenario) - so 2 or 3 would also probably be fine.
> >
> > What do you mean by "oplock credits"? There's no such field in the
> > protocol. Is this some sort of reserved pool
>
> The client divides the total credits granted by the server into three
> buckets (see struct TCP_Server_Info)
>         int echo_credits;  /* echo reserved slots */
>         int oplock_credits;  /* lease/oplock break reserved slots */
>         int credits;  /* credits reserved for all other operation types */
>
> If we run low on credits we can disable (temporarily) leases and
> sending echo requests so we can continue to send other requests (open,
> read, write, close etc.).    As an example, if the server has granted
> us 512 credits (total) if there were 4 large writes that were
> responded to very slowly (and used up all of our credits), we could
> time out if the write responses were very slow - since we would have
> no way of sending an echo request periodically to see if the server
> were still alive) - since we have 1 credit reserved for echo requests
> though, even if the responses to the writes were slow, we would be
> able to ping the server with an echo request to make sure it is still
> alive.   Similarly (and this can be very important with some servers
> who could hold up granting credits if a lease break is pending) - we
> have to be able to respond to a lease break even if all of our credits
> are used up with large reads or writes so we reserve at least one
> credit for sending lease break responses.
>
> The easiest way to think about this is that we reserve 1 credit for
> echo and 1 credit for leases (although it can grow larger as we saw in
> some server scenarios when they grant us more credits on lease break
> responses than we asked for), but all the reset (all other remaining
> credits) are available to send read or write or open or close (etc.).
>   When requests in flight goes back to zero we rebalance credits to
> make sure we still have 1 reserved for echo and 1 reserved for
> oplock/lease break responses (and everything else for the other
> operation types).
>
> The scenario we were having a problem with though was one in which
> requests inflight stayed above 0 for a long time and the server often
> granted more credits than we asked for on lease break responses - this
> caused the number of oplock/lease credits reserved to be larger than
> the amount of credits for everything else and eventually starved the
> client credits needed for normal operations (this would normally not
> be an issue but the number of requests in flight stayed above zero for
> a long time which kept us from rebalancing credits, and moving credits
> back into the main credit pool from the oplock/lease break reserved
> category - which could significantly hurt performance).
>
> There is normally not a problem with having more than one credit in
> the lease break pool, but when it grows particularly large it could
> hurt performance (or even hang).
>
>
>
>
>
> > If so, I really don't think any constant is appropriate. If the client
> > can't calculate an expected number, we should keep it quite small.

Hi Steve,

During response handling, in case oplock_credits reach 0, we anyhow
have it stealing one credit from regular credits today...
        else if (server->in_flight > 0 && server->oplock_credits == 0 &&
                 server->oplocks) {
                if (server->credits > 1) {
                        server->credits--;
                        server->oplock_credits++;
                }
        }

So I don't think we need to reserve more than 1 credits for
oplock_credits at all.
I think the behaviour would not be affected by restricting oplock_credits to 1.
Tom Talpey June 22, 2023, 9:07 p.m. UTC | #6
On 6/22/2023 2:42 AM, Shyam Prasad N wrote:
> On Thu, Jun 22, 2023 at 10:26 AM Steve French <smfrench@gmail.com> wrote:
>>
>> On Wed, Jun 21, 2023 at 1:25 PM Tom Talpey <tom@talpey.com> wrote:
>>>
>>> On 6/20/2023 11:57 PM, Steve French wrote:
>>>>> Why this value of 10? I would go with 1, since we already reserve 1
>>>> credit for oplocks. If the reasoning is to have enough credits to send multiple
>>>> lease/oplock acks, we should change the reserved count altogether.
>>>>
>>>> I think there could be some value in sending multiple lease break
>>>> responses (ie allow oplock credits to be a few more than 1), but my
>>>> main reasoning for this was to pick some number that was "safe"
>>>> (allowing 10 oplock/lease-break credits while in flight count is
>>>> non-zero is unlikely to be a problem) and would be unlikely to change
>>>> existing behavior.
>>>>
>>>> My thinking was that today's code allows oplock credits to be above 1
>>>> (and keep growing in the server scenario you noticed) while multiple
>>>> requests continue to be in flight - so there could potentially be a
>>>> performance benefit during this period of high activity in having a
>>>> few lease breaks in flight at one time and unlikely to hurt anything -
>>>> but more importantly if we change the code to never allow oplock/lease
>>>> credits to be above one we could (unlikely but possible) have subtle
>>>> behavior changes that trigger a bug (since we would then have cases to
>>>> at least some servers where we never have two lease breaks in flight).
>>>> It seemed harmless to set the threshold to something slightly more
>>>> than one (so multiple lease breaks in flight would still be possible
>>>> and thus behavior would not change - but risk of credit starvation is
>>>> gone).    If you prefer - I could pick a number like 2 or 3 credits
>>>> instead of 10.  My intent was just to make it extremely unlikely that
>>>> any behavior would change (but would still fix the possible credit
>>>> starvation scenario) - so 2 or 3 would also probably be fine.
>>>
>>> What do you mean by "oplock credits"? There's no such field in the
>>> protocol. Is this some sort of reserved pool
>>
>> The client divides the total credits granted by the server into three
>> buckets (see struct TCP_Server_Info)
>>          int echo_credits;  /* echo reserved slots */
>>          int oplock_credits;  /* lease/oplock break reserved slots */
>>          int credits;  /* credits reserved for all other operation types */
>>
>> If we run low on credits we can disable (temporarily) leases and
>> sending echo requests so we can continue to send other requests (open,
>> read, write, close etc.).    As an example, if the server has granted
>> us 512 credits (total) if there were 4 large writes that were
>> responded to very slowly (and used up all of our credits), we could
>> time out if the write responses were very slow - since we would have
>> no way of sending an echo request periodically to see if the server
>> were still alive) - since we have 1 credit reserved for echo requests
>> though, even if the responses to the writes were slow, we would be
>> able to ping the server with an echo request to make sure it is still
>> alive.   Similarly (and this can be very important with some servers
>> who could hold up granting credits if a lease break is pending) - we
>> have to be able to respond to a lease break even if all of our credits
>> are used up with large reads or writes so we reserve at least one
>> credit for sending lease break responses.
>>
>> The easiest way to think about this is that we reserve 1 credit for
>> echo and 1 credit for leases (although it can grow larger as we saw in
>> some server scenarios when they grant us more credits on lease break
>> responses than we asked for), but all the reset (all other remaining
>> credits) are available to send read or write or open or close (etc.).
>>    When requests in flight goes back to zero we rebalance credits to
>> make sure we still have 1 reserved for echo and 1 reserved for
>> oplock/lease break responses (and everything else for the other
>> operation types).
>>
>> The scenario we were having a problem with though was one in which
>> requests inflight stayed above 0 for a long time and the server often
>> granted more credits than we asked for on lease break responses - this
>> caused the number of oplock/lease credits reserved to be larger than
>> the amount of credits for everything else and eventually starved the
>> client credits needed for normal operations (this would normally not
>> be an issue but the number of requests in flight stayed above zero for
>> a long time which kept us from rebalancing credits, and moving credits
>> back into the main credit pool from the oplock/lease break reserved
>> category - which could significantly hurt performance).
>>
>> There is normally not a problem with having more than one credit in
>> the lease break pool, but when it grows particularly large it could
>> hurt performance (or even hang).
>>
>>
>>
>>
>>
>>> If so, I really don't think any constant is appropriate. If the client
>>> can't calculate an expected number, we should keep it quite small.
> 
> Hi Steve,
> 
> During response handling, in case oplock_credits reach 0, we anyhow
> have it stealing one credit from regular credits today...
>          else if (server->in_flight > 0 && server->oplock_credits == 0 &&
>                   server->oplocks) {
>                  if (server->credits > 1) {
>                          server->credits--;
>                          server->oplock_credits++;
>                  }
>          }
> 
> So I don't think we need to reserve more than 1 credits for
> oplock_credits at all.
> I think the behaviour would not be affected by restricting oplock_credits to 1.

Good! And I really really hope we don't ever set echo_credits higher
than 1 either. I see no point in parallelizing the nearly-useless echo
procedure.

Tom.
Steve French June 22, 2023, 9:17 p.m. UTC | #7
On Thu, Jun 22, 2023 at 4:07 PM Tom Talpey <tom@talpey.com> wrote:
>
> On 6/22/2023 2:42 AM, Shyam Prasad N wrote:
> > On Thu, Jun 22, 2023 at 10:26 AM Steve French <smfrench@gmail.com> wrote:
> >>
> >> On Wed, Jun 21, 2023 at 1:25 PM Tom Talpey <tom@talpey.com> wrote:
> >>>
> >>> On 6/20/2023 11:57 PM, Steve French wrote:
> >>>>> Why this value of 10? I would go with 1, since we already reserve 1
> >>>> credit for oplocks. If the reasoning is to have enough credits to send multiple
> >>>> lease/oplock acks, we should change the reserved count altogether.
> >>>>
> >>>> I think there could be some value in sending multiple lease break
> >>>> responses (ie allow oplock credits to be a few more than 1), but my
> >>>> main reasoning for this was to pick some number that was "safe"
> >>>> (allowing 10 oplock/lease-break credits while in flight count is
> >>>> non-zero is unlikely to be a problem) and would be unlikely to change
> >>>> existing behavior.
> >>>>
> >>>> My thinking was that today's code allows oplock credits to be above 1
> >>>> (and keep growing in the server scenario you noticed) while multiple
> >>>> requests continue to be in flight - so there could potentially be a
> >>>> performance benefit during this period of high activity in having a
> >>>> few lease breaks in flight at one time and unlikely to hurt anything -
> >>>> but more importantly if we change the code to never allow oplock/lease
> >>>> credits to be above one we could (unlikely but possible) have subtle
> >>>> behavior changes that trigger a bug (since we would then have cases to
> >>>> at least some servers where we never have two lease breaks in flight).
> >>>> It seemed harmless to set the threshold to something slightly more
> >>>> than one (so multiple lease breaks in flight would still be possible
> >>>> and thus behavior would not change - but risk of credit starvation is
> >>>> gone).    If you prefer - I could pick a number like 2 or 3 credits
> >>>> instead of 10.  My intent was just to make it extremely unlikely that
> >>>> any behavior would change (but would still fix the possible credit
> >>>> starvation scenario) - so 2 or 3 would also probably be fine.
> >>>
> >>> What do you mean by "oplock credits"? There's no such field in the
> >>> protocol. Is this some sort of reserved pool
> >>
> >> The client divides the total credits granted by the server into three
> >> buckets (see struct TCP_Server_Info)
> >>          int echo_credits;  /* echo reserved slots */
> >>          int oplock_credits;  /* lease/oplock break reserved slots */
> >>          int credits;  /* credits reserved for all other operation types */
> >>
> >> If we run low on credits we can disable (temporarily) leases and
> >> sending echo requests so we can continue to send other requests (open,
> >> read, write, close etc.).    As an example, if the server has granted
> >> us 512 credits (total) if there were 4 large writes that were
> >> responded to very slowly (and used up all of our credits), we could
> >> time out if the write responses were very slow - since we would have
> >> no way of sending an echo request periodically to see if the server
> >> were still alive) - since we have 1 credit reserved for echo requests
> >> though, even if the responses to the writes were slow, we would be
> >> able to ping the server with an echo request to make sure it is still
> >> alive.   Similarly (and this can be very important with some servers
> >> who could hold up granting credits if a lease break is pending) - we
> >> have to be able to respond to a lease break even if all of our credits
> >> are used up with large reads or writes so we reserve at least one
> >> credit for sending lease break responses.
> >>
> >> The easiest way to think about this is that we reserve 1 credit for
> >> echo and 1 credit for leases (although it can grow larger as we saw in
> >> some server scenarios when they grant us more credits on lease break
> >> responses than we asked for), but all the reset (all other remaining
> >> credits) are available to send read or write or open or close (etc.).
> >>    When requests in flight goes back to zero we rebalance credits to
> >> make sure we still have 1 reserved for echo and 1 reserved for
> >> oplock/lease break responses (and everything else for the other
> >> operation types).
> >>
> >> The scenario we were having a problem with though was one in which
> >> requests inflight stayed above 0 for a long time and the server often
> >> granted more credits than we asked for on lease break responses - this
> >> caused the number of oplock/lease credits reserved to be larger than
> >> the amount of credits for everything else and eventually starved the
> >> client credits needed for normal operations (this would normally not
> >> be an issue but the number of requests in flight stayed above zero for
> >> a long time which kept us from rebalancing credits, and moving credits
> >> back into the main credit pool from the oplock/lease break reserved
> >> category - which could significantly hurt performance).
> >>
> >> There is normally not a problem with having more than one credit in
> >> the lease break pool, but when it grows particularly large it could
> >> hurt performance (or even hang).
> >>
> >>
> >>
> >>
> >>
> >>> If so, I really don't think any constant is appropriate. If the client
> >>> can't calculate an expected number, we should keep it quite small.
> >
> > Hi Steve,
> >
> > During response handling, in case oplock_credits reach 0, we anyhow
> > have it stealing one credit from regular credits today...
> >          else if (server->in_flight > 0 && server->oplock_credits == 0 &&
> >                   server->oplocks) {
> >                  if (server->credits > 1) {
> >                          server->credits--;
> >                          server->oplock_credits++;
> >                  }
> >          }
> >
> > So I don't think we need to reserve more than 1 credits for
> > oplock_credits at all.
> > I think the behaviour would not be affected by restricting oplock_credits to 1.

Current version of the patch in for-next has the threshold at 3 not 1
for the maximum
oplock/lease credits - presumably that is low enough, and current
behavior won't change much
but we avoid the credit starvation.

> Good! And I really really hope we don't ever set echo_credits higher
> than 1 either. I see no point in parallelizing the nearly-useless echo
> procedure.

I am not aware of any cases (or reported bugs) where echo credits
would go higher.
echo is just to periodically check if server is unresponsive/hung and
to reduce chance of reconnect issues.
It is rarely sent (every 60 seconds on inactive connections, and echo
interval can be set higher on mount if you prefer)
diff mbox series

Patch

From c50cae15903fc704c3df1170183c8505cd2eb0b9 Mon Sep 17 00:00:00 2001
From: Steve French <stfrench@microsoft.com>
Date: Mon, 19 Jun 2023 22:32:38 -0500
Subject: [PATCH] smb3: do not reserve too many oplock credits

There were cases reported where servers will sometimes return more
credits than requested on oplock break responses, which can lead to
most of the credits being allocated for oplock breaks (instead of
for normal operations like read and write) if number of SMB3 requests
in flight always stays above 0 (the oplock and echo credits are
rebalanced when in flight requests goes down to zero).

If oplock credits gets unexpectedly large (e.g. ten is more than it
would ever be expected to be) and in flight requests are greater than
zero, then rebalance the oplock credits and regular credits (go
back to reserving just one oplock crdit).

Signed-off-by: Steve French <stfrench@microsoft.com>
---
 fs/smb/client/smb2ops.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c
index a8bb9d00d33a..02780f175571 100644
--- a/fs/smb/client/smb2ops.c
+++ b/fs/smb/client/smb2ops.c
@@ -109,7 +109,11 @@  smb2_add_credits(struct TCP_Server_Info *server,
 			server->credits--;
 			server->oplock_credits++;
 		}
-	}
+	} else if ((server->in_flight > 0) && (server->oplock_credits > 10) &&
+		   ((optype & CIFS_OP_MASK) == CIFS_OBREAK_OP))
+		/* if now have too many oplock credits, rebalance so don't starve normal ops */
+		change_conf(server);
+
 	scredits = *val;
 	in_flight = server->in_flight;
 	spin_unlock(&server->req_lock);
-- 
2.34.1