From patchwork Thu Mar 12 02:31:48 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shaohua Li
X-Patchwork-Id: 449281
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by ozlabs.org (Postfix) with ESMTP id CAA9B140187
 for ; Thu, 12 Mar 2015 13:32:30 +1100 (AEDT)
Authentication-Results: ozlabs.org;
 dkim=fail reason="verification failed; unprotected key"
 header.d=fb.com header.i=@fb.com header.b=ZLDz0D8t;
 dkim-adsp=none (unprotected policy); dkim-atps=neutral
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751687AbbCLCc3 (ORCPT );
 Wed, 11 Mar 2015 22:32:29 -0400
Received: from mx0b-00082601.pphosted.com ([67.231.153.30]:24375 "EHLO
 mx0b-00082601.pphosted.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S1751587AbbCLCc2 (ORCPT );
 Wed, 11 Mar 2015 22:32:28 -0400
Received: from pps.filterd (m0004077 [127.0.0.1])
 by mx0b-00082601.pphosted.com (8.14.5/8.14.5) with SMTP id t2C2S2DF003958;
 Wed, 11 Mar 2015 19:31:59 -0700
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=fb.com;
 h=date : from : to : cc : subject : message-id : references : mime-version : content-type : in-reply-to;
 s=facebook; bh=0U3Ihl1DNbm1FDxG7dnD9qt2TCaDTSA5Pel6e8entyE=;
 b=ZLDz0D8t2ckO+SmJnOQvSygS+7Tzen+9GJLYbQfVjJo2Zw407WTf0jF2WWnNI0LJhMPX
 JSNqTlnVI8bdNvDJRLoNks2H2R5Y7AP9JV/7a4dKIjsAmWKvXGirLR3F4yPr6a7xgbV0
 x84VtGOh8s/LVcuSvl2TX3OQc8TYWCjsChM=
Received: from mail.thefacebook.com ([199.201.64.23])
 by mx0b-00082601.pphosted.com with ESMTP id 1t30nb8ad2-2
 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NOT);
 Wed, 11 Mar 2015 19:31:59 -0700
Received: from devbig257.prn2.facebook.com (192.168.16.4) by
 mail.thefacebook.com (192.168.16.12) with Microsoft SMTP Server (TLS) id
 14.3.195.1; Wed, 11 Mar 2015 19:31:58 -0700
Date: Wed, 11 Mar 2015 19:31:48 -0700
From: Shaohua Li
To: Tony Battersby
CC: Jens Axboe , Tejun Heo , , Christoph Hellwig , Dan Williams ,
Subject: Re: [PATCH] libata: revert "libata: use blk taging" et al.
Message-ID: <20150312023148.GA1960870@devbig257.prn2.facebook.com>
References: <5500863D.4070807@cybernetics.com> <5500B783.20106@fb.com>
 <5500BF6F.6080000@cybernetics.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <5500BF6F.6080000@cybernetics.com>
User-Agent: Mutt/1.5.20 (2009-12-10)
X-Originating-IP: [192.168.16.4]
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.13.68, 1.0.33, 0.0.0000
 definitions=2015-03-12_01:2015-03-11, 2015-03-12, 1970-01-01 signatures=0
X-Proofpoint-Spam-Details: rule=fb_default_notspam policy=fb_default score=0
 kscore.is_bulkscore=0 kscore.compositescore=0 circleOfTrustscore=0
 compositescore=0.985052395226116 suspectscore=0
 recipient_domain_to_sender_totalscore=0 phishscore=0 bulkscore=0
 kscore.is_spamscore=0 rbsscore=0.985052395226116
 recipient_to_sender_totalscore=0
 recipient_domain_to_sender_domain_totalscore=0 spamscore=0
 recipient_to_sender_domain_totalscore=0
 urlsuspectscore=0.985052395226116 adultscore=0 classifier=spam adjust=0
 reason=mlx scancount=1 engine=7.0.1-1402240000
 definitions=main-1503120026
X-FB-Internal: deliver
Sender: linux-ide-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-ide@vger.kernel.org

On Wed, Mar 11, 2015 at 06:19:27PM -0400, Tony Battersby wrote:
> On 03/11/2015 05:45 PM, Jens Axboe wrote:
> > On 03/11/2015 02:15 PM, Tony Battersby wrote:
> >> This reverts commits 12cb5ce101abfaf74421f8cc9f196e708209eb79 and
> >> 98bd4be1ba95f2fe7f543910792b7163a5de06eb.
> >>
> >> Commit 12cb5ce101ab ("libata: use blk taging") causes the following oops
> >> with scsi-mq enabled:
> >>
> >> BUG: unable to handle kernel NULL pointer dereference at 0000000000000058
> >> IP: [] ata_qc_new_init+0x3e/0x120
> >> PGD 32adf0067 PUD 32adf1067 PMD 0
> >> Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
> >> Modules linked in: iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi igb
> >> i2c_algo_bit ptp pps_core pm80xx libsas scsi_transport_sas sg coretemp
> >> eeprom w83795 i2c_i801
> >> CPU: 4 PID: 1450 Comm: cydiskbench Not tainted 4.0.0-rc3 #1
> >> Hardware name: Supermicro X8DTH-i/6/iF/6F/X8DTH, BIOS 2.1b 05/04/12
> >> task: ffff8800ba86d500 ti: ffff88032a064000 task.ti: ffff88032a064000
> >> RIP: 0010:[] [] ata_qc_new_init+0x3e/0x120
> >> RSP: 0018:ffff88032a067858 EFLAGS: 00010046
> >> RAX: 0000000000000000 RBX: ffff8800ba0d2230 RCX: 000000000000002a
> >> RDX: ffffffff80505ae0 RSI: 0000000000000020 RDI: ffff8800ba0d2230
> >> RBP: ffff88032a067868 R08: 0000000000000201 R09: 0000000000000001
> >> R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800ba0d0000
> >> R13: ffff8800ba0d2230 R14: ffffffff80505ae0 R15: ffff8800ba0d0000
> >> FS: 0000000041223950(0063) GS:ffff88033e480000(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> CR2: 0000000000000058 CR3: 000000032a0a3000 CR4: 00000000000006e0
> >> Stack:
> >> ffff880329eee758 ffff880329eee758 ffff88032a0678a8 ffffffff80502dad
> >> ffff8800ba167978 ffff880329eee758 ffff88032bf9c520 ffff8800ba167978
> >> ffff88032bf9c520 ffff88032bf9a290 ffff88032a0678b8 ffffffff80506909
> >> Call Trace:
> >> [] ata_scsi_translate+0x3d/0x1b0
> >> [] ata_sas_queuecmd+0x149/0x2a0
> >> [] sas_queuecommand+0xa0/0x1f0 [libsas]
> >> [] scsi_dispatch_cmd+0xd4/0x1a0
> >> [] scsi_queue_rq+0x66f/0x7f0
> >> [] __blk_mq_run_hw_queue+0x208/0x3f0
> >> [] blk_mq_run_hw_queue+0x88/0xc0
> >> [] blk_mq_insert_request+0xc4/0x130
> >> [] blk_execute_rq_nowait+0x73/0x160
> >> [] sg_common_write+0x3da/0x720 [sg]
> >> [] ? might_fault+0x5e/0xb0
> >> [] sg_new_write+0x250/0x360 [sg]
> >> [] ? __lock_acquire+0x50c/0xc10
> >> [] ? lock_release_non_nested+0xa7/0x360
> >> [] ? _raw_spin_unlock_irqrestore+0x3b/0x60
> >> [] ? might_fault+0x5e/0xb0
> >> [] ? might_fault+0x5e/0xb0
> >> [] sg_write+0x13b/0x450 [sg]
> >> [] ? __lock_acquire+0x50c/0xc10
> >> [] ? do_futex+0x109/0xbf0
> >> [] ? might_fault+0x5e/0xb0
> >> [] vfs_write+0xd1/0x1b0
> >> [] SyS_write+0x54/0xc0
> >> [] system_call_fastpath+0x12/0x17
> >> Code: 24 20 04 0f 85 ec 00 00 00 49 83 3c 24 00 0f 84 cf 00 00 00 83 fe 1f
> >> 0f 87 dc 00 00 00 89 f0 48 69 c0 f0 00 00 00 49 8d 44 04 40 <89> 70 58 48
> >> c7 40 10 00 00 00 00 4c 89 20 48 89 58 08 c7 40 64
> >> RIP [] ata_qc_new_init+0x3e/0x120
> >> RSP
> >> CR2: 0000000000000058
> >> ---[ end trace 43f5eefb64627eff ]---
> >>
> >>
> >> scsi-mq uses a host-wide tag map shared among all devices with some
> >> integer tag values >= ATA_MAX_QUEUE. These unexpectedly high tag values
> >> cause __ata_qc_from_tag() to return NULL, which is then dereferenced in
> >> ata_qc_new_init(), causing the oops above.
> > Wait, something is missing here. We should not be getting tag values
> > that are >= ATA_MAX_QUEUE. Instead of reverting this, we need to figure
> > out why this is happening, and fix it. That is correct way forward here.
> > What setup is this being reproduced on?
> >
>
> Hardware:
> PMC 8001 SAS HBA (PCI ID 117c:0042, PCI sub-ID 117c:0043) using pm80xx
> driver
> 4 SATA disks directly connected (no expander)
>
> Test procedure:
> Send disk read/write commands to SATA disks using the SCSI generic (sg)
> driver.
>
> Analysis:
>
> shost->can_queue is 508
>
> With my patch applied to revert the problematic commits, I added the
> following code to ata_scsi_qc_new():
>
> 	int tag = cmd->request->tag;
> 	static int max_tag;
> 	if (tag > max_tag) {
> 		max_tag = tag;
> 		printk(KERN_DEBUG "max tag %d\n", tag);
> 	}
>
> Testing one SATA disk at a time with scsi-mq enabled, I get a max tag of
> 64. Testing 4 disks at a time with scsi-mq enabled gives a max tag of
> 194. With scsi-mq disabled, I get a max tag of 30 no matter how many
> disks I test.

Can you please try this debug patch:

---
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

diff --git a/drivers/scsi/libsas/sas_ata.c b/drivers/scsi/libsas/sas_ata.c
index 932d9cc..46f153f 100644
--- a/drivers/scsi/libsas/sas_ata.c
+++ b/drivers/scsi/libsas/sas_ata.c
@@ -572,7 +572,6 @@ int sas_ata_init(struct domain_device *found_dev)
 
 	ap->private_data = found_dev;
 	ap->cbl = ATA_CBL_SATA;
-	ap->scsi_host = shost;
 	rc = ata_sas_port_init(ap);
 	if (rc) {
 		ata_sas_port_destroy(ap);
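
Note on the failure mode discussed in the quoted thread: the sketch below is a
userspace model only, not kernel code and not part of the debug patch. It
shows why a tag taken from a host-wide map (64 and 194 in Tony's test) has no
slot in a per-port command array, so the lookup returns NULL and an unchecked
caller would dereference it. Every demo_* name is hypothetical, and
DEMO_ATA_MAX_QUEUE is assumed to be 32, matching ATA_MAX_QUEUE in libata of
this era.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: stands in for libata's ATA_MAX_QUEUE (32 slots per port). */
#define DEMO_ATA_MAX_QUEUE 32

/* Hypothetical, heavily reduced stand-ins for the queued command and port. */
struct demo_qc {
	unsigned int tag;
};

struct demo_port {
	struct demo_qc qcmd[DEMO_ATA_MAX_QUEUE];
};

/* A tag is usable only if it indexes a slot in the per-port array. */
static int demo_tag_valid(unsigned int tag)
{
	return tag < DEMO_ATA_MAX_QUEUE;
}

/* Models the lookup: NULL for any tag outside the per-port range. */
static struct demo_qc *demo_qc_from_tag(struct demo_port *ap, unsigned int tag)
{
	assert(ap != NULL);
	if (!demo_tag_valid(tag))
		return NULL;
	return &ap->qcmd[tag];
}

/*
 * Models command setup. Returns 0 on success, -1 if the tag has no slot.
 * Writing through the pointer without the NULL check is the pattern that
 * faults when a host-wide tag such as 64 or 194 arrives.
 */
static int demo_qc_new_init(struct demo_port *ap, unsigned int tag)
{
	struct demo_qc *qc = demo_qc_from_tag(ap, tag);

	if (qc == NULL)
		return -1;
	qc->tag = tag;
	return 0;
}
```

With these definitions, demo_qc_new_init() succeeds for tag 30 (the maximum
seen with scsi-mq disabled) and fails cleanly for 64 and 194 rather than
faulting.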