diff mbox

[4/4] sd: Handle ZBC drives correctly

Message ID 1405931241-92015-5-git-send-email-hare@suse.de
State Not Applicable
Delegated to: David Miller
Headers show

Commit Message

Hannes Reinecke July 21, 2014, 8:27 a.m. UTC
ZBC drives are close to disk devices, so update sd.c to handle
ZBC drives correctly.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 drivers/scsi/sd.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

Comments

Christoph Hellwig July 21, 2014, 11:29 a.m. UTC | #1
On Mon, Jul 21, 2014 at 10:27:21AM +0200, Hannes Reinecke wrote:
> ZBC drives are close to disk devices, so update sd.c to handle
> ZBC drives correctly.

While the other patches all look fine with me I strongly disagree with
this one.  Devices reporting the ZBC device type must have sequential
required zones, so they are _not_ anywhere close to disk devices,
and writes to them will cause errors that we aren't evenable able to
handle yet.

If it makes some sort of experiment easier for you we can add a way to
force ZBC devices to use sd.c conditionally.  Do you have a test device
that this works with?

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Reinecke July 21, 2014, 11:51 a.m. UTC | #2
On 07/21/2014 01:29 PM, Christoph Hellwig wrote:
> On Mon, Jul 21, 2014 at 10:27:21AM +0200, Hannes Reinecke wrote:
>> ZBC drives are close to disk devices, so update sd.c to handle
>> ZBC drives correctly.
>
> While the other patches all look fine with me I strongly disagree with
> this one.  Devices reporting the ZBC device type must have sequential
> required zones, so they are _not_ anywhere close to disk devices,
> and writes to them will cause errors that we aren't evenable able to
> handle yet.
>
> If it makes some sort of experiment easier for you we can add a way to
> force ZBC devices to use sd.c conditionally.  Do you have a test device
> that this works with?
>
Yes, I have.

I was actually considering implementing a module option for sd
(eg attach_zbc) to allow zac/zbc devices to be attached to the sd 
driver.
Which will be off per default, so your concern should be addressed.

However, zbc devices _are_ similar to normal 'sd' devices; it's only 
that some commands might fail unexpectedly.
But all the usual commands from sbc are supported, so I found it a 
bit odd having to implement my own device driver (which would be a 
clone of 'sd' anyway).

Cheers,

Hannes
Christoph Hellwig July 21, 2014, 3:01 p.m. UTC | #3
On Mon, Jul 21, 2014 at 01:51:26PM +0200, Hannes Reinecke wrote:
> Yes, I have.

How well does the SCSI layer recover when you write outside the write
pointer?  ZBC defeinds new sense codes that we don't handle, so I'd
expect we'd get long loops of unrecovered errors.

> I was actually considering implementing a module option for sd
> (eg attach_zbc) to allow zac/zbc devices to be attached to the sd driver.
> Which will be off per default, so your concern should be addressed.

That sounds fine to.

> However, zbc devices _are_ similar to normal 'sd' devices; it's only that
> some commands might fail unexpectedly.
> But all the usual commands from sbc are supported, so I found it a bit odd
> having to implement my own device driver (which would be a clone of 'sd'
> anyway).

They are similar in some ways and very different in others.  I
defintively agree we should share as much code as possible, but I'm not
sure exposing them as plain sd devices is a good idea.  I'm defintively
fine with an optional attachment to sd to start playing around with
different ways to make use of them.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Utz July 21, 2014, 4:18 p.m. UTC | #4
If I read Hanne's patch correctly, he is checking for the ZBC signature when the drives are probed.

Thus, the only circumstance where adding zbc handling would lead to unexpected bad behavior in sd would be if a ZBC drive reported itself as a standard ATA drive or an ATA drive reported itself as a ZBC drive.
John Utz July 21, 2014, 4:30 p.m. UTC | #5
BAH! I get this ZBC/ZAC thing wrong still. mentally replace ATA with SCSI in my comments below....
Hannes Reinecke July 21, 2014, 5:49 p.m. UTC | #6
On 07/21/2014 06:18 PM, John Utz wrote:
> If I read Hanne's patch correctly, he is checking for the ZBC signature
 > when the drives are probed.
>
Yep.

> Thus, the only circumstance where adding zbc handling would lead to
 > unexpected bad behavior in sd would be if a ZBC drive reported itself
 > as a standard ATA drive or an ATA drive reported itself as a ZBC drive.

It's actually worse than that:
Currently libata will happily attach to a ZAC device, and passing them 
on to the SCSI stack as a normal disk device.
Which most definitely will lead to unexpected results :-)

Cheers,

Hannes
Christoph Hellwig July 22, 2014, 4:19 p.m. UTC | #7
Hi John,

I can't really comment much on the ATA side as I'm not overly uptodate
on the ZAC spec.

But a device with the 0x14 device type must have sequential required
zones, which will generate new sense codes when writes outside the
write pointer happens.  I'm very concerned about

 a) how the SCSI midlayer handles those sense codes

and

 b) how we can communicate them to the user of the device, e.g. the
    filesystem or user application

Just attaching the device by default without a good solution for these
seems dangerous to me.

--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John Utz July 22, 2014, 6:25 p.m. UTC | #8
(hand citing in outlook web access, sorry! i also took the liberty of reformatting your response a teeny bit.)
> ________________________________________
> From: Christoph Hellwig [hch@infradead.org]
> Sent: Tuesday, July 22, 2014 9:19 AM
> To: John Utz
> Cc: Hannes Reinecke; linux-ide@vger.kernel.org; linux-scsi@vger.kernel.org; James Bottomely
> Subject: Re: [PATCH 4/4] sd: Handle ZBC drives correctly
>
> Hi John,

Hi Christoph!

> I can't really comment much on the ATA side as I'm not overly uptodate
> on the ZAC spec.

It's OK, I have to chew on it because it's my day job, it's not reasonable to expect other folks to pay too much attention to it yet. :-)

I will say that it's helpful to consider it to be largely the same as the exiting ATA and SCSI DISK specs with a few extra new commands to make it adaptable to newer storage media designs.

>But a device with the 0x14 device type must have sequential required zones, 

Yes. 

> which will generate new sense codes when writes outside the write pointer happens.

Yes. and reads that will start or finish past the write pointer will also do something similar too.

> I'm very concerned about
>
>  a) how the SCSI midlayer handles those sense codes
>
> and
>
>  b) how we can communicate them to the user of the device, e.g. the filesystem or user application
>
> Just attaching the device by default without a good solution for these seems dangerous to me.

Your concerns are eminently practical and reasonable and you are not alone in possessing them.

But we gotta start somewhere. :-)

Actually, a couple of 'somewheres' and those 2 somewheres  map directly to your a and b concerns.

i am working on addressing 'b'.  i am using device-mapper to create a simulated ZAC target and this simulator will be used by hardworking file system vict.....um hardy pioneers.... to adapt their favorite file systems to do the right thing with respect to ZAC and ZBC.

This will not be a small job, and it will probably be easier for some file systems than others. 

Ultimately, i dont think that there will be much user space impact unless your userspace app is a database that manages its own disk access.

So let's talk about your concern 'a'. 

We are all blessed to be working at the cutting edge of linux.

IMHO, things like this are *supposed* to go into the tree here at the leading edge, because folks know not to be sticking bleeding edge kernels on to boxes that they run their lives and businesses on.

There have been many discussions/personal ruminations/guesses about how ZAC/ZBC is going to work and these things have progressed to the point where folks now have to check their hypotheticals into the tree so that we can all see what will happen.

So I would argue that this is actually a good thing because it's inspiring smart folks such as yourself to point out the snakepits that have jumped out at *you* when you saw Hannes's proposed checkin. :-)

Perhaps those of us who will be in Chicago in August who are interested in the topic should go find a hallway to hog up for a while and talk about this? :-)

just my us$0.02!

tnx!

johnu

Really,  alot of progress has been made since 
--
To unsubscribe from this list: send the line "unsubscribe linux-ide" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 377a520..29c981b 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -161,7 +161,7 @@  cache_type_store(struct device *dev, struct device_attribute *attr,
 	static const char temp[] = "temporary ";
 	int len;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		/* no cache control on RBC devices; theoretically they
 		 * can do it, but there's probably so many exceptions
 		 * it's not worth the risk */
@@ -259,7 +259,7 @@  allow_restart_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	sdp->allow_restart = simple_strtoul(buf, NULL, 10);
@@ -389,7 +389,7 @@  provisioning_mode_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	if (!strncmp(buf, lbp_mode[SD_LBP_UNMAP], 20))
@@ -456,7 +456,7 @@  max_write_same_blocks_store(struct device *dev, struct device_attribute *attr,
 	if (!capable(CAP_SYS_ADMIN))
 		return -EACCES;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return -EINVAL;
 
 	err = kstrtoul(buf, 10, &max);
@@ -2537,7 +2537,7 @@  static void sd_read_app_tag_own(struct scsi_disk *sdkp, unsigned char *buffer)
 	struct scsi_mode_data data;
 	struct scsi_sense_hdr sshdr;
 
-	if (sdp->type != TYPE_DISK)
+	if (sdp->type != TYPE_DISK && sdp->type != TYPE_ZBC)
 		return;
 
 	if (sdkp->protection_type == 0)