Message ID | 20120106021707.GA6276@google.com |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
Hello, again. On Thu, Jan 05, 2012 at 06:17:07PM -0800, Tejun Heo wrote: > When two requests are merged, if the absorbed request is older than > the absorbing one, cfq_merged_requests() tries to reposition it in the > cfqq->fifo list by list_move()'ing the absorbing request to the > absorbed one before removing it. > > This works if both requests are on the same cfqq but nothing > guarantees that and the code ends up moving the merged request to a > different cfqq's fifo list without adjusting the rest. This leads to > the following failures. > > * A request may be on the fifo list of a cfqq without holding > reference to it and the cfqq can be freed before requst is finished. > Among other things, this triggers list debug warning and slab debug > use-after-free warning. > > * As a request can be on the wrong fifo queue, it may be issued and > completed before its cfqq is scheduled. If the cfqq didn't have > other requests on it, it would be empty by the time it's dispatched > triggering BUG_ON() in cfq_dispatch_request(). > > Fix it by making cfq_merged_requests() scan the absorbing request's > fifo list for the correct slot and move there instead. Hmmm... while the patch would fix the problem. It isn't entirely correct. The root cause is, 1. q->last_merge and rqhash used to be used only for merging bios into requests and that queries elevator whether the merge should be allowed. cfq disallows merging if they belong to different cfqqs. 2. request-request merging didn't use to use q->last_merge or rqhash to find request candidates. It used elv_former/latter_request() and cfq never returned request from a different cfqq. 3. Plug merging started using q->last_merge and rqhash and now elevator can't prevent cross cfqq merges. So, yeah, the right fix would be using elv_former/latter_request() instead. Maybe we should strip out rqhash altogether and change elevator handle everything? I don't know. I'll prepare a different fix patch soon. Thanks.
On Thu, 2012-01-05 at 18:17 -0800, Tejun Heo wrote: > When two requests are merged, if the absorbed request is older than > the absorbing one, cfq_merged_requests() tries to reposition it in the > cfqq->fifo list by list_move()'ing the absorbing request to the > absorbed one before removing it. > > This works if both requests are on the same cfqq but nothing > guarantees that and the code ends up moving the merged request to a > different cfqq's fifo list without adjusting the rest. This leads to > the following failures. > > * A request may be on the fifo list of a cfqq without holding > reference to it and the cfqq can be freed before requst is finished. > Among other things, this triggers list debug warning and slab debug > use-after-free warning. > > * As a request can be on the wrong fifo queue, it may be issued and > completed before its cfqq is scheduled. If the cfqq didn't have > other requests on it, it would be empty by the time it's dispatched > triggering BUG_ON() in cfq_dispatch_request(). > > Fix it by making cfq_merged_requests() scan the absorbing request's > fifo list for the correct slot and move there instead. > > Signed-off-by: Tejun Heo <tj@kernel.org> > Reported-by: Hugh Dickins <hughd@google.com> > Cc: stable@vger.kernel.org > --- > It survived my testing long enough and I'm relatively confident this > should fix the crash but I might have gotten the scanning wrong, so > please pay extra attention there. > > I suspect we just didn't have enough backward request-request merges > before the recent plug merge updates to trigger this bug. The patch itself looks good to me, but I'm wondering if we really need do reposition of the fifo list for merged request. it's rare case and not worthy such complexity to me. Thanks, Shaohua -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Fri, Jan 06, 2012 at 11:14:15AM +0800, Shaohua Li wrote: > > So, yeah, the right fix would be using elv_former/latter_request() > > instead. Maybe we should strip out rqhash altogether and change > > elevator handle everything? I don't know. I'll prepare a different > > fix patch soon. > > So not allow merge from two cfq queues strictly? This will impact > performance. I don't know how important the strict isolation is. we even > allow two cfq queues merge to improve performance. That's how cfq has behaved before this recent plug merge breakage and IIRC why the cooperating queue thing is there. If you want to change the behavior, that should be an explicit separate patch. Thanks.
On Thu, 2012-01-05 at 18:36 -0800, Tejun Heo wrote: > Hello, again. > > On Thu, Jan 05, 2012 at 06:17:07PM -0800, Tejun Heo wrote: > > When two requests are merged, if the absorbed request is older than > > the absorbing one, cfq_merged_requests() tries to reposition it in the > > cfqq->fifo list by list_move()'ing the absorbing request to the > > absorbed one before removing it. > > > > This works if both requests are on the same cfqq but nothing > > guarantees that and the code ends up moving the merged request to a > > different cfqq's fifo list without adjusting the rest. This leads to > > the following failures. > > > > * A request may be on the fifo list of a cfqq without holding > > reference to it and the cfqq can be freed before requst is finished. > > Among other things, this triggers list debug warning and slab debug > > use-after-free warning. > > > > * As a request can be on the wrong fifo queue, it may be issued and > > completed before its cfqq is scheduled. If the cfqq didn't have > > other requests on it, it would be empty by the time it's dispatched > > triggering BUG_ON() in cfq_dispatch_request(). > > > > Fix it by making cfq_merged_requests() scan the absorbing request's > > fifo list for the correct slot and move there instead. > > Hmmm... while the patch would fix the problem. It isn't entirely > correct. The root cause is, > > 1. q->last_merge and rqhash used to be used only for merging bios into > requests and that queries elevator whether the merge should be > allowed. cfq disallows merging if they belong to different cfqqs. > > 2. request-request merging didn't use to use q->last_merge or rqhash to > find request candidates. It used elv_former/latter_request() and > cfq never returned request from a different cfqq. > > 3. Plug merging started using q->last_merge and rqhash and now > elevator can't prevent cross cfqq merges. > > So, yeah, the right fix would be using elv_former/latter_request() > instead. Maybe we should strip out rqhash altogether and change > elevator handle everything? I don't know. I'll prepare a different > fix patch soon. So not allow merge from two cfq queues strictly? This will impact performance. I don't know how important the strict isolation is. we even allow two cfq queues merge to improve performance. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Jan 5, 2012 at 7:34 PM, Shaohua Li <shaohua.li@intel.com> wrote: >> That's how cfq has behaved before this recent plug merge breakage and >> IIRC why the cooperating queue thing is there. If you want to change >> the behavior, that should be an explicit separate patch. > My point is both cooperating merge and the plug merge of different cfq > are merge, no reason we allow one but disallow the other. plug merge > isn't a breakage to me. Isolation is pretty big deal for cfq and cross cfqq merging happening without cfq noticing it isn't gonna be helpful to the cause. Why don't we merge bio's across different cfqq's then?
Ummmm... I've been looking at the code and currently I think the best option is yank out plug merging for this merge window. Bypassing rqhash for request merging doesn't seem like a good idea. rqhash itself is, at this point, redundant and the limitation that it can only find requests by the ending sectors lead to weird designs. The single ->last_merge was okay but the recursive back merging is just ugly. What it should be doing is trying back merge and then front merge once for each request insertion as the usual merge path does. We can't do attempt_back/front_merge() for INSERT_MERGE at this point because elv_latter/former_request() only works for requests which are already on elevator and, unfortunately, putting a request onto cfq may kick the queue directly and the request might already be gone by the time we try to merge it. For this merge window, I think we better just disable INSERT_MERGE. A mid-term solution could be changing elevator interface such that elevator_add_req_fn() doesn't kick the queue directly but notify elevator core that the queue needs kicking via return value, so that merging can happen before kicking the queue. Note that there's a caveat here. Merging might make kicking unnecessary or wrong. Better solution would be changing elevator merge logic so that it has "give me request closest to this offset" interface and then use prev/next from there to find out merge candidates for both bio-rq and rq-rq merges and just kill rqhash. Thanks.
On Thu, 2012-01-05 at 19:04 -0800, Tejun Heo wrote: > Hello, > > On Fri, Jan 06, 2012 at 11:14:15AM +0800, Shaohua Li wrote: > > > So, yeah, the right fix would be using elv_former/latter_request() > > > instead. Maybe we should strip out rqhash altogether and change > > > elevator handle everything? I don't know. I'll prepare a different > > > fix patch soon. > > > > So not allow merge from two cfq queues strictly? This will impact > > performance. I don't know how important the strict isolation is. we even > > allow two cfq queues merge to improve performance. > > That's how cfq has behaved before this recent plug merge breakage and > IIRC why the cooperating queue thing is there. If you want to change > the behavior, that should be an explicit separate patch. My point is both cooperating merge and the plug merge of different cfq are merge, no reason we allow one but disallow the other. plug merge isn't a breakage to me. Thanks, Shaohua -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2012-01-05 at 19:22 -0800, Tejun Heo wrote: > On Thu, Jan 5, 2012 at 7:34 PM, Shaohua Li <shaohua.li@intel.com> wrote: > >> That's how cfq has behaved before this recent plug merge breakage and > >> IIRC why the cooperating queue thing is there. If you want to change > >> the behavior, that should be an explicit separate patch. > > My point is both cooperating merge and the plug merge of different cfq > > are merge, no reason we allow one but disallow the other. plug merge > > isn't a breakage to me. > > Isolation is pretty big deal for cfq and cross cfqq merging happening > without cfq noticing it isn't gonna be helpful to the cause. Why > don't we merge bio's across different cfqq's then? don't know. I don't think a tweak for merge impacts isolation so much. For rotate disk, request size hasn't impact to request cost, so this doesn't impact isolation. Even for ssd, big size request is more efficient to dispatch. And we already have breakage of fairness for SSD, such as no idle. -- To unsubscribe from this list: send the line "unsubscribe linux-ide" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hello, On Thu, Jan 5, 2012 at 8:15 PM, Shaohua Li <shaohua.li@intel.com> wrote: > don't know. I don't think a tweak for merge impacts isolation so much. > For rotate disk, request size hasn't impact to request cost, so this > doesn't impact isolation. Even for ssd, big size request is more > efficient to dispatch. And we already have breakage of fairness for SSD, > such as no idle. I'm not saying they shouldn't be merged but the decision should be elevator's. Block core shouldn't decide it for the elevator. So, whether cross cfqq merge is a good idea or not is mostly irrelevant in this thread. Thanks.
different cfqq's fifo list without adjusting the rest. This leads to the following failures. * A request may be on the fifo list of a cfqq without holding reference to it and the cfqq can be freed before requst is finished. Among other things, this triggers list debug warning and slab debug use-after-free warning. * As a request can be on the wrong fifo queue, it may be issued and completed before its cfqq is scheduled. If the cfqq didn't have other requests on it, it would be empty by the time it's dispatched triggering BUG_ON() in cfq_dispatch_request(). Fix it by making cfq_merged_requests() scan the absorbing request's fifo list for the correct slot and move there instead. Signed-off-by: Tejun Heo <tj@kernel.org> Reported-by: Hugh Dickins <hughd@google.com> Cc: stable@vger.kernel.org --- It survived my testing long enough and I'm relatively confident this should fix the crash but I might have gotten the scanning wrong, so please pay extra attention there. I suspect we just didn't have enough backward request-request merges before the recent plug merge updates to trigger this bug. Thanks. block/cfq-iosched.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c index 163263d..a235cf3 100644 --- a/block/cfq-iosched.c +++ b/block/cfq-iosched.c @@ -1663,8 +1663,15 @@ cfq_merged_requests(struct request_queue *q, struct request *rq, */ if (!list_empty(&rq->queuelist) && !list_empty(&next->queuelist) && time_before(rq_fifo_time(next), rq_fifo_time(rq))) { - list_move(&rq->queuelist, &next->queuelist); + struct request *pos = rq; + rq_set_fifo_time(rq, rq_fifo_time(next)); + + list_for_each_entry_continue_reverse(pos, &cfqq->fifo, queuelist) + if (time_before_eq(rq_fifo_time(pos), rq_fifo_time(rq))) + break; + if (rq != pos) + list_move(&rq->queuelist, &pos->queuelist); } if (cfqq->next_rq == next)