[v2,1/3] mtd: introduce the mtd_pairing_scheme concept

Message ID 1466430618-9713-2-git-send-email-boris.brezillon@free-electrons.com
State Superseded

Commit Message

Boris Brezillon June 20, 2016, 1:50 p.m. UTC
MLC and TLC NAND devices use NAND cells exposing more than one bit,
but instead of attaching all the bits in a given cell to a single NAND
page, each bit is usually attached to a different page. This concept is
called 'page pairing', and has a significant impact on flash storage
usage.
The main problem exhibited by these devices is that interrupting a page
program operation may corrupt not only the page we are programming
but also the page it is paired with, hence the need to expose the
pairing scheme information to MTD users.

The pairing API allows one to query the pairing information attached to a
given page (here called wunit), or the other way around (the wunit
pointed to by pairing information).
It also provides several helpers to convert between absolute
offsets and wunits, and to query the number of pairing groups.

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
---
 drivers/mtd/mtdcore.c   |  94 ++++++++++++++++++++++++++++++++++++++++++
 drivers/mtd/mtdpart.c   |   1 +
 include/linux/mtd/mtd.h | 106 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 201 insertions(+)

Comments

Brian Norris Aug. 4, 2016, 4:37 a.m. UTC | #1
Hi Boris,

On Mon, Jun 20, 2016 at 03:50:16PM +0200, Boris Brezillon wrote:
> MLC and TLC NAND devices are using NAND cells exposing more than one bit,
> but instead of attaching all the bits in a given cell to a single NAND
> page, each bit is usually attached to a different page. This concept is
> called 'page pairing', and has significant impacts on the flash storage
> usage.
> The main problem showed by these devices is that interrupting a page
> program operation may not only corrupt the page we are programming
> but also the page it is paired with, hence the need to expose to MTD
> users the pairing scheme information.
> 
> The pairing APIs allows one to query pairing information attached to a
> given page (here called wunit), or the other way around (the wunit
> pointed by pairing information).
> It also provides several helpers to help the conversion between absolute
> offsets and wunits, and query the number of pairing groups.
> 
> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>

Overall, the comments and documentation are a lot better on this one.
Thanks for doing that! I only have a few more small comments, and with
those addressed, I think it's ready to land. I'll try to review the NAND
implementation bits too (they look OK for now), but I'm not as worried
about that, if we agree on the high-level API.

BTW, I don't know if we're likely to hit any conflicts on the
mtdcore and mtd.h bits. Perhaps it will make sense for us to apply this
first patch as a mini-branch to both our trees? Maybe if you just fixup
any last comments, you can send me a trivial pull request / tag /
whatever (doesn't need to be formal), with just this patch.

> ---
>  drivers/mtd/mtdcore.c   |  94 ++++++++++++++++++++++++++++++++++++++++++
>  drivers/mtd/mtdpart.c   |   1 +
>  include/linux/mtd/mtd.h | 106 ++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 201 insertions(+)
> 
> diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
> index e3936b847c6b..decceb9fdf32 100644
> --- a/drivers/mtd/mtdcore.c
> +++ b/drivers/mtd/mtdcore.c
> @@ -376,6 +376,100 @@ static int mtd_reboot_notifier(struct notifier_block *n, unsigned long state,
>  }
>  
>  /**
> + * mtd_wunit_to_pairing_info - get pairing information of a wunit
> + * @mtd: pointer to new MTD device info structure
> + * @wunit: write unit we are interrested in

s/interrested/interested/

> + * @info: pairing information struct

Maybe something to indicate this is the return value? e.g., "returned
pairing information"?

> + *
> + * Retrieve pairing information associated to the wunit.
> + * This is mainly useful when dealing with MLC/TLC NANDs where pages can be
> + * paired together, and where programming a page may influence the page it is
> + * paired with.
> + * The notion of page is replaced by the term wunit (write-unit) to stay
> + * consistent with the ->writesize field.
> + *
> + * The @wunit argument can be extracted from an absolute offset using
> + * mtd_offset_to_wunit(). @info is filled with the pairing information attached
> + * to @wunit.
> + *
> + * From the pairing info the MTD user can find all the wunits paired with
> + * @wunit using the following loop:
> + *
> + * for (i = 0; i < mtd_pairing_groups(mtd); i++) {
> + *	info.pair = i;
> + *	mtd_pairing_info_to_wunit(mtd, &info);
> + *	...
> + * }
> + */
> +void mtd_wunit_to_pairing_info(struct mtd_info *mtd, int wunit,
> +			       struct mtd_pairing_info *info)
> +{

Do we want to do any range-checking here? i.e., make this return int? Or
is that too paranoid? We've done similarly on most of the rest of the
MTD API.

Notably, I think we're probably safe keeping the ->pairing->get_info()
callback as returning void, since the driver can expect this core helper
to do the range checking for us.

> +	if (!mtd->pairing || !mtd->pairing->get_info) {
> +		info->group = 0;
> +		info->pair = wunit;
> +	} else {
> +		mtd->pairing->get_info(mtd, wunit, info);
> +	}
> +}
> +EXPORT_SYMBOL_GPL(mtd_wunit_to_pairing_info);
> +
> +/**
> + * mtd_wunit_to_pairing_info - get wunit from pairing information
> + * @mtd: pointer to new MTD device info structure
> + * @info: pairing information struct
> + *
> + * Returns a positive number representing the wunit associated to the info
> + * struct, or a negative error code.
> + *
> + * This is the reverse of mtd_wunit_to_pairing_info(), and can help one to
> + * iterate over all wunits of a given pair (see mtd_wunit_to_pairing_info()
> + * doc).
> + *
> + * It can also be used to only program the first page of each pair (i.e.
> + * page attached to group 0), which allows one to use an MLC NAND in
> + * software-emulated SLC mode:
> + *
> + * info.group = 0;
> + * for (info.pair = 0; info < mtd_wunit_per_eb(mtd); info.pair++) {

(I know it's just example code, but...) the second clause should have
'info.pair < ...', not 'info < ...'.

> + *	wunit = mtd_pairing_info_to_wunit(mtd, &info);
> + *	mtd_write(mtd, mtd_wunit_to_offset(mtd, blkoffs, wunit),
> + *		  mtd->writesize, &retlen, buf + (i * mtd->writesize));
> + * }
> + */
> +int mtd_pairing_info_to_wunit(struct mtd_info *mtd,
> +			      const struct mtd_pairing_info *info)
> +{

Any range checking on info->group or info->pair? What about
NULL-checking 'info'?

> +	if (!mtd->pairing || !mtd->pairing->get_info) {
> +		if (info->group)
> +			return -EINVAL;
> +
> +		return info->pair;
> +	}
> +
> +	return mtd->pairing->get_wunit(mtd, info);
> +}
> +EXPORT_SYMBOL_GPL(mtd_pairing_info_to_wunit);
> +
> +/**
> + * mtd_pairing_groups - get the number of pairing groups
> + * @mtd: pointer to new MTD device info structure
> + *
> + * Returns the number of pairing groups.
> + *
> + * This number is usually equal to the number of bits exposed by a single
> + * cell, and can be used in conjunction with mtd_pairing_info_to_wunit()
> + * to iterate over all pages of a given pair.
> + */
> +int mtd_pairing_groups(struct mtd_info *mtd)
> +{
> +	if (!mtd->pairing || !mtd->pairing->ngroups)
> +		return 1;
> +
> +	return mtd->pairing->ngroups;
> +}
> +EXPORT_SYMBOL_GPL(mtd_pairing_groups);
> +
> +/**
>   *	add_mtd_device - register an MTD device
>   *	@mtd: pointer to new MTD device info structure
>   *
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index 1f13e32556f8..e32a0ac2298f 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -397,6 +397,7 @@ static struct mtd_part *allocate_partition(struct mtd_info *master,
>  	slave->mtd.oobsize = master->oobsize;
>  	slave->mtd.oobavail = master->oobavail;
>  	slave->mtd.subpage_sft = master->subpage_sft;
> +	slave->mtd.pairing = master->pairing;
>  
>  	slave->mtd.name = name;
>  	slave->mtd.owner = master->owner;
> diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
> index 29a170612203..00bcacb16176 100644
> --- a/include/linux/mtd/mtd.h
> +++ b/include/linux/mtd/mtd.h
> @@ -127,6 +127,81 @@ struct mtd_ooblayout_ops {
>  		    struct mtd_oob_region *oobfree);
>  };
>  
> +/**
> + * struct mtd_pairing_info - page pairing information
> + *
> + * @pair: pair id
> + * @group: group id
> + *
> + * The pair word is used here, even though TLC NANDs might group pages by 3

Nit: "The pair word is used" is somewhat confusing on first read, IMO. I
think it's partly the ordering of the words, and partly the use of
"word", which sometimes has a different technical meaning... Maybe one of
the following?

  The word "pair" is used here ...
  The term "pair" is used here ...

(Sorry, very nitpicky.)

> + * (3 bits in a single cell). A pair should regroup all pages that are sharing
> + * the same cell. Pairs are then indexed in ascending order.
> + *
> + * @group is defining the position of a page in a given pair. It can also be
> + * seen as the bit position in the cell: page attached to bit 0 belongs to
> + * group 0, page attached to bit 1 belongs to group 1, etc.
> + *
> + * Example:
> + * The H27UCG8T2BTR-BC datasheet describes the following pairing scheme:
> + *
> + *		group-0		group-1
> + *
> + *  pair-0	page-0		page-4
> + *  pair-1	page-1		page-5
> + *  pair-2	page-2		page-8
> + *  ...
> + *  pair-127	page-251	page-255
> + *
> + *
> + * Note that the "group" and "pair" terms were extracted from Samsung and
> + * Hynix datasheets, and might be referenced under other names in other
> + * datasheets (Micron is describing this concept as "shared pages").

Very, very helpful (to me, even though I'm moderately familiar with the
concepts, but hopefully more so for others who want to read and
understand this). Thanks for writing this up.

> + */
> +struct mtd_pairing_info {
> +	int pair;
> +	int group;
> +};
> +
> +/**
> + * struct mtd_pairing_scheme - page pairing scheme description
> + *
> + * @ngroups: number of groups. Should be related to the number of bits
> + *	     per cell.
> + * @get_info: converts a write-unit (page number within an erase block) into
> + *	      mtd_pairing information (pair + group). This function should
> + *	      fill the info parameter based on the wunit index.
> + * @get_wunit: converts pairing information into a write-unit (page) number.
> + *	       This function should return the wunit index pointed by the
> + *	       pairing information described in the info argument. It should
> + *	       return -EINVAL, if there's no wunit corresponding to the
> + *	       passed pairing information.
> + *
> + * See mtd_pairing_info documentation for a detailed explanation of the
> + * pair and group concepts.
> + *
> + * The mtd_pairing_scheme structure provides a generic solution to represent
> + * NAND page pairing scheme. Instead of exposing two big tables to do the
> + * write-unit <-> (pair + group) conversions, we ask the MTD drivers to
> + * implement the ->get_info() and ->get_wunit() functions.
> + *
> + * MTD users will then be able to query these information by using the
> + * mtd_pairing_info_to_wunit() and mtd_wunit_to_pairing_info() helpers.
> + *
> + * @ngroups is here to help MTD users iterating over all the pages in a
> + * given pair. This value can be retrieved by MTD users using the
> + * mtd_pairing_groups() helper.
> + *
> + * Examples are given in the mtd_pairing_info_to_wunit() and
> + * mtd_wunit_to_pairing_info() documentation.
> + */
> +struct mtd_pairing_scheme {
> +	int ngroups;
> +	void (*get_info)(struct mtd_info *mtd, int wunit,
> +			 struct mtd_pairing_info *info);
> +	int (*get_wunit)(struct mtd_info *mtd,
> +			 const struct mtd_pairing_info *info);

Wait, I noted above that get_info() doesn't return errors (and that's
OK, if we do bounds checking in mtdcore), but why does get_wunit(),
then? From the looks of it, you don't actually do any bounds checking in
the implementations in patch 2, right? And couldn't we do any checking
in the mtdcore.c helper anyway?

Unless I'm misunderstanding something, I think we should have both
return errors, or neither.

> +};
> +
>  struct module;	/* only needed for owner field in mtd_info */
>  
>  struct mtd_info {
> @@ -188,6 +263,9 @@ struct mtd_info {
>  	/* OOB layout description */
>  	const struct mtd_ooblayout_ops *ooblayout;
>  
> +	/* NAND pairing scheme, only provided for MLC/TLC NANDs */
> +	const struct mtd_pairing_scheme *pairing;
> +
>  	/* the ecc step size. */
>  	unsigned int ecc_step_size;
>  
> @@ -296,6 +374,12 @@ static inline void mtd_set_ooblayout(struct mtd_info *mtd,
>  	mtd->ooblayout = ooblayout;
>  }
>  
> +static inline void mtd_set_pairing_scheme(struct mtd_info *mtd,
> +				const struct mtd_pairing_scheme *pairing)
> +{
> +	mtd->pairing = pairing;
> +}
> +
>  static inline void mtd_set_of_node(struct mtd_info *mtd,
>  				   struct device_node *np)
>  {
> @@ -312,6 +396,11 @@ static inline int mtd_oobavail(struct mtd_info *mtd, struct mtd_oob_ops *ops)
>  	return ops->mode == MTD_OPS_AUTO_OOB ? mtd->oobavail : mtd->oobsize;
>  }
>  
> +void mtd_wunit_to_pairing_info(struct mtd_info *mtd, int wunit,
> +			       struct mtd_pairing_info *info);
> +int mtd_pairing_info_to_wunit(struct mtd_info *mtd,
> +			      const struct mtd_pairing_info *info);
> +int mtd_pairing_groups(struct mtd_info *mtd);
>  int mtd_erase(struct mtd_info *mtd, struct erase_info *instr);
>  int mtd_point(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
>  	      void **virt, resource_size_t *phys);
> @@ -397,6 +486,23 @@ static inline uint32_t mtd_mod_by_ws(uint64_t sz, struct mtd_info *mtd)
>  	return do_div(sz, mtd->writesize);
>  }
>  
> +static inline int mtd_wunit_per_eb(struct mtd_info *mtd)
> +{
> +	return mtd->erasesize / mtd->writesize;
> +}
> +
> +static inline int mtd_offset_to_wunit(struct mtd_info *mtd, loff_t offs)
> +{
> +	return mtd_div_by_ws(mtd_mod_by_eb(offs, mtd), mtd);
> +}
> +
> +static inline loff_t mtd_wunit_to_offset(struct mtd_info *mtd, loff_t base,
> +					 int wunit)
> +{
> +	return base + (wunit * mtd->writesize);
> +}
> +
> +
>  static inline int mtd_has_oob(const struct mtd_info *mtd)
>  {
>  	return mtd->_read_oob && mtd->_write_oob;

With the above addressed:

Reviewed-by: Brian Norris <computersforpeace@gmail.com>
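
For reference, the offset/wunit helpers added to mtd.h in this patch can be exercised outside the kernel. The following is only a userspace sketch: `struct mtd_model` is a hypothetical stand-in for the two `struct mtd_info` fields involved, `int64_t` stands in for `loff_t`, and plain `/` and `%` replace the kernel's `mtd_div_by_ws()` and `mtd_mod_by_eb()`.

```c
#include <stdint.h>

/* Hypothetical stand-in for the struct mtd_info fields used below. */
struct mtd_model {
	uint32_t erasesize;	/* bytes per erase block */
	uint32_t writesize;	/* bytes per write unit (page) */
};

/* Number of write units in one erase block. */
static int wunit_per_eb(const struct mtd_model *mtd)
{
	return mtd->erasesize / mtd->writesize;
}

/* Index of the write unit, within its erase block, that contains offs. */
static int offset_to_wunit(const struct mtd_model *mtd, int64_t offs)
{
	return (int)((offs % mtd->erasesize) / mtd->writesize);
}

/* Absolute offset of write unit wunit in the block starting at base. */
static int64_t wunit_to_offset(const struct mtd_model *mtd, int64_t base,
			       int wunit)
{
	return base + (int64_t)wunit * mtd->writesize;
}
```

With a 1 MiB erase block and 4 KiB pages, an offset 100 bytes into the sixth page of any block maps to wunit 5, and converting wunit 5 back yields the start of that page.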

Boris Brezillon Aug. 8, 2016, 10:42 p.m. UTC | #2
Hi Brian,

On Thu, 4 Aug 2016 12:37:51 +0800
Brian Norris <computersforpeace@gmail.com> wrote:

> Hi Boris,
> 
> On Mon, Jun 20, 2016 at 03:50:16PM +0200, Boris Brezillon wrote:
> > MLC and TLC NAND devices are using NAND cells exposing more than one bit,
> > but instead of attaching all the bits in a given cell to a single NAND
> > page, each bit is usually attached to a different page. This concept is
> > called 'page pairing', and has significant impacts on the flash storage
> > usage.
> > The main problem showed by these devices is that interrupting a page
> > program operation may not only corrupt the page we are programming
> > but also the page it is paired with, hence the need to expose to MTD
> > users the pairing scheme information.
> > 
> > The pairing APIs allows one to query pairing information attached to a
> > given page (here called wunit), or the other way around (the wunit
> > pointed by pairing information).
> > It also provides several helpers to help the conversion between absolute
> > offsets and wunits, and query the number of pairing groups.
> > 
> > Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>  
> 
> Overall, the comments and documentation are a lot better on this one.
> Thanks for doing that! I only have a few more small comments, and with
> those, I think it's ready to land IMO. I'll try to review the NAND
> implementation bits too (look OK for now), but I'm not as worried about
> that, if we agree on the high-level API.
> 
> BTW, I don't know if we're likely to hit any conflicts on the
> mtdcore and mtd.h bits. Perhaps it will make sense for us to apply this
> first patch as a mini-branch to both our trees? Maybe if you just fixup
> any last comments, you can send me a trivial pull request / tag /
> whatever (doesn't need to be formal), with just this patch.

Sure.

> 
> > ---
> >  drivers/mtd/mtdcore.c   |  94 ++++++++++++++++++++++++++++++++++++++++++
> >  drivers/mtd/mtdpart.c   |   1 +
> >  include/linux/mtd/mtd.h | 106 ++++++++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 201 insertions(+)
> > 
> > diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
> > index e3936b847c6b..decceb9fdf32 100644
> > --- a/drivers/mtd/mtdcore.c
> > +++ b/drivers/mtd/mtdcore.c
> > @@ -376,6 +376,100 @@ static int mtd_reboot_notifier(struct notifier_block *n, unsigned long state,
> >  }
> >  
> >  /**
> > + * mtd_wunit_to_pairing_info - get pairing information of a wunit
> > + * @mtd: pointer to new MTD device info structure
> > + * @wunit: write unit we are interrested in  
> 
> s/interrested/interested/
> 
> > + * @info: pairing information struct  
> 
> Maybe something to indicate this is the return value? e.g., "returned
> pairing information"?

I'll change the description.

> 
> > + *
> > + * Retrieve pairing information associated to the wunit.
> > + * This is mainly useful when dealing with MLC/TLC NANDs where pages can be
> > + * paired together, and where programming a page may influence the page it is
> > + * paired with.
> > + * The notion of page is replaced by the term wunit (write-unit) to stay
> > + * consistent with the ->writesize field.
> > + *
> > + * The @wunit argument can be extracted from an absolute offset using
> > + * mtd_offset_to_wunit(). @info is filled with the pairing information attached
> > + * to @wunit.
> > + *
> > + * From the pairing info the MTD user can find all the wunits paired with
> > + * @wunit using the following loop:
> > + *
> > + * for (i = 0; i < mtd_pairing_groups(mtd); i++) {
> > + *	info.pair = i;
> > + *	mtd_pairing_info_to_wunit(mtd, &info);
> > + *	...
> > + * }
> > + */
> > +void mtd_wunit_to_pairing_info(struct mtd_info *mtd, int wunit,
> > +			       struct mtd_pairing_info *info)
> > +{  
> 
> Do we want to do any range-checking here? i.e., make this return int? Or
> is that too paranoid? We've done similarly on most of the rest of the
> MTD API.

I'm fine changing the prototype to return an int (with -ERANGE if the
wunit parameter exceeds the number of write-units per eraseblock).
As you say later in your review, we'd better be consistent on the
->get_info()/->get_wunit() semantics.
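
A range-checked helper along the lines proposed here might look roughly like this. It is a userspace sketch and not the code of any later revision: `struct mtd_model` and `struct pairing_info` are hypothetical stand-ins for the kernel structures, and only the fallback path (no pairing scheme attached) is modelled.

```c
#include <errno.h>

/* Hypothetical stand-in for the struct mtd_info fields used below. */
struct mtd_model {
	unsigned int erasesize;	/* bytes per erase block */
	unsigned int writesize;	/* bytes per write unit (page) */
};

/* Mirrors the pair/group fields of struct mtd_pairing_info. */
struct pairing_info {
	int pair;
	int group;
};

static int wunits_per_eb(const struct mtd_model *mtd)
{
	return mtd->erasesize / mtd->writesize;
}

/*
 * Returns 0 and fills info on success, or -ERANGE when wunit does not
 * fall inside an erase block. Without a pairing scheme, every wunit is
 * its own pair in group 0, as in the patch's fallback branch.
 */
static int wunit_to_pairing_info(const struct mtd_model *mtd, int wunit,
				 struct pairing_info *info)
{
	if (wunit < 0 || wunit >= wunits_per_eb(mtd))
		return -ERANGE;

	info->group = 0;
	info->pair = wunit;
	return 0;
}
```

Doing the check in the core helper means a driver's ->get_info() only ever sees a wunit that is already in range.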

> 
> Notably, I think we're probably safe keeping the ->pairing->get_info()
> callback as returning void, since the driver can expect this core helper
> to do the range checking for us.
> 
> > +	if (!mtd->pairing || !mtd->pairing->get_info) {
> > +		info->group = 0;
> > +		info->pair = wunit;
> > +	} else {
> > +		mtd->pairing->get_info(mtd, wunit, info);
> > +	}
> > +}
> > +EXPORT_SYMBOL_GPL(mtd_wunit_to_pairing_info);
> > +
> > +/**
> > + * mtd_wunit_to_pairing_info - get wunit from pairing information
> > + * @mtd: pointer to new MTD device info structure
> > + * @info: pairing information struct
> > + *
> > + * Returns a positive number representing the wunit associated to the info
> > + * struct, or a negative error code.
> > + *
> > + * This is the reverse of mtd_wunit_to_pairing_info(), and can help one to
> > + * iterate over all wunits of a given pair (see mtd_wunit_to_pairing_info()
> > + * doc).
> > + *
> > + * It can also be used to only program the first page of each pair (i.e.
> > + * page attached to group 0), which allows one to use an MLC NAND in
> > + * software-emulated SLC mode:
> > + *
> > + * info.group = 0;
> > + * for (info.pair = 0; info < mtd_wunit_per_eb(mtd); info.pair++) {  
> 
> (I know it's just example code, but...) the second clause should have
> 'info.pair < ...', not 'info < ...'.

I'll fix the example.
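
One way the corrected SLC-mode loop could be written is sketched below in userspace C. The pairing scheme is a made-up one (wunits 2n and 2n+1 share a cell), not the scheme of any real chip, and `struct mtd_model`, `toy_info_to_wunit()` and `slc_mode_offsets()` are hypothetical names. The sketch only computes the offsets that would be handed to mtd_write(). It stops at the first negative return instead of relying on the loop bound, and it counts pages with `n` because the quoted example indexes buf with an `i` that the snippet never declares.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the struct mtd_info fields used below. */
struct mtd_model {
	uint32_t erasesize;	/* bytes per erase block */
	uint32_t writesize;	/* bytes per write unit (page) */
};

struct pairing_info {
	int pair;
	int group;
};

/* Made-up 2-bits-per-cell scheme: wunits 2n and 2n+1 form pair n. */
static int toy_info_to_wunit(const struct mtd_model *mtd,
			     const struct pairing_info *info)
{
	int nwunits = (int)(mtd->erasesize / mtd->writesize);

	if (info->group < 0 || info->group > 1 ||
	    info->pair < 0 || info->pair >= nwunits / 2)
		return -EINVAL;

	return info->pair * 2 + info->group;
}

/*
 * Collect the offsets of all group-0 wunits of the erase block starting
 * at blkoffs. Returns the number of offsets stored in offs[].
 */
static int slc_mode_offsets(const struct mtd_model *mtd, int64_t blkoffs,
			    int64_t *offs, int max)
{
	struct pairing_info info = { .pair = 0, .group = 0 };
	int n = 0;

	for (info.pair = 0; n < max; info.pair++) {
		int wunit = toy_info_to_wunit(mtd, &info);

		if (wunit < 0)
			break;	/* ran past the last pair of the block */

		offs[n++] = blkoffs + (int64_t)wunit * mtd->writesize;
	}

	return n;
}
```

For an 8-page block this yields four offsets, those of wunits 0, 2, 4 and 6.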

> 
> > + *	wunit = mtd_pairing_info_to_wunit(mtd, &info);
> > + *	mtd_write(mtd, mtd_wunit_to_offset(mtd, blkoffs, wunit),
> > + *		  mtd->writesize, &retlen, buf + (i * mtd->writesize));
> > + * }
> > + */
> > +int mtd_pairing_info_to_wunit(struct mtd_info *mtd,
> > +			      const struct mtd_pairing_info *info)
> > +{  
> 
> Any range checking on info->group or info->pair? What about
> NULL-checking 'info'?

I'll add those checks.
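
The checks asked for here could be sketched as follows, again in userspace C with hypothetical stand-in types. Two points are assumptions rather than anything stated in the thread: -EINVAL is used for every failure (matching the existing group check), and the number of pairs is taken to be the number of wunits per erase block divided by the number of groups.

```c
#include <errno.h>
#include <stddef.h>

struct pairing_info {
	int pair;
	int group;
};

/* Hypothetical stand-in for struct mtd_info plus its pairing scheme. */
struct mtd_model {
	unsigned int erasesize;	/* bytes per erase block */
	unsigned int writesize;	/* bytes per write unit (page) */
	int ngroups;		/* only meaningful when get_wunit is set */
	int (*get_wunit)(const struct mtd_model *mtd,
			 const struct pairing_info *info);
};

static int pairing_info_to_wunit(const struct mtd_model *mtd,
				 const struct pairing_info *info)
{
	/* Without a scheme there is a single group and wunit == pair. */
	int ngroups = (mtd->get_wunit && mtd->ngroups > 0) ? mtd->ngroups : 1;
	int npairs = (int)(mtd->erasesize / mtd->writesize) / ngroups;

	if (!info)
		return -EINVAL;

	if (info->group < 0 || info->group >= ngroups ||
	    info->pair < 0 || info->pair >= npairs)
		return -EINVAL;

	if (!mtd->get_wunit)
		return info->pair;

	return mtd->get_wunit(mtd, info);
}
```

With the checks done up front, the existing `if (info->group) return -EINVAL;` test in the fallback branch becomes a special case of the group range check.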

> 
> > +	if (!mtd->pairing || !mtd->pairing->get_info) {
> > +		if (info->group)
> > +			return -EINVAL;
> > +
> > +		return info->pair;
> > +	}
> > +
> > +	return mtd->pairing->get_wunit(mtd, info);
> > +}
> > +EXPORT_SYMBOL_GPL(mtd_pairing_info_to_wunit);
> > +
> > +/**
> > + * mtd_pairing_groups - get the number of pairing groups
> > + * @mtd: pointer to new MTD device info structure
> > + *
> > + * Returns the number of pairing groups.
> > + *
> > + * This number is usually equal to the number of bits exposed by a single
> > + * cell, and can be used in conjunction with mtd_pairing_info_to_wunit()
> > + * to iterate over all pages of a given pair.
> > + */
> > +int mtd_pairing_groups(struct mtd_info *mtd)
> > +{
> > +	if (!mtd->pairing || !mtd->pairing->ngroups)
> > +		return 1;
> > +
> > +	return mtd->pairing->ngroups;
> > +}
> > +EXPORT_SYMBOL_GPL(mtd_pairing_groups);
> > +
> > +/**
> >   *	add_mtd_device - register an MTD device
> >   *	@mtd: pointer to new MTD device info structure
> >   *
> > diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> > index 1f13e32556f8..e32a0ac2298f 100644
> > --- a/drivers/mtd/mtdpart.c
> > +++ b/drivers/mtd/mtdpart.c
> > @@ -397,6 +397,7 @@ static struct mtd_part *allocate_partition(struct mtd_info *master,
> >  	slave->mtd.oobsize = master->oobsize;
> >  	slave->mtd.oobavail = master->oobavail;
> >  	slave->mtd.subpage_sft = master->subpage_sft;
> > +	slave->mtd.pairing = master->pairing;
> >  
> >  	slave->mtd.name = name;
> >  	slave->mtd.owner = master->owner;
> > diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
> > index 29a170612203..00bcacb16176 100644
> > --- a/include/linux/mtd/mtd.h
> > +++ b/include/linux/mtd/mtd.h
> > @@ -127,6 +127,81 @@ struct mtd_ooblayout_ops {
> >  		    struct mtd_oob_region *oobfree);
> >  };
> >  
> > +/**
> > + * struct mtd_pairing_info - page pairing information
> > + *
> > + * @pair: pair id
> > + * @group: group id
> > + *
> > + * The pair word is used here, even though TLC NANDs might group pages by 3  
> 
> Nit: "The pair word is used" is somewhat confusing on first read, IMO. I
> think maybe it's partly the ordering of the words, as well as the use
> "word" which has different technical meaning sometime... Maybe one of
> the following?
> 
>   The word "pair" is used here ...
>   The term "pair" is used here ...
> 
> (Sorry, very nitpicky.)

No problem :), I'll pick one of those.

> 
> > + * (3 bits in a single cell). A pair should regroup all pages that are sharing
> > + * the same cell. Pairs are then indexed in ascending order.
> > + *
> > + * @group is defining the position of a page in a given pair. It can also be
> > + * seen as the bit position in the cell: page attached to bit 0 belongs to
> > + * group 0, page attached to bit 1 belongs to group 1, etc.
> > + *
> > + * Example:
> > + * The H27UCG8T2BTR-BC datasheet describes the following pairing scheme:
> > + *
> > + *		group-0		group-1
> > + *
> > + *  pair-0	page-0		page-4
> > + *  pair-1	page-1		page-5
> > + *  pair-2	page-2		page-8
> > + *  ...
> > + *  pair-127	page-251	page-255
> > + *
> > + *
> > + * Note that the "group" and "pair" terms were extracted from Samsung and
> > + * Hynix datasheets, and might be referenced under other names in other
> > + * datasheets (Micron is describing this concept as "shared pages").  
> 
> Very, very helpful (to me, even though I'm moderately familiar with the
> concepts, but hopefully moreso for others who want to read and
> understand this). Thanks for writing this up.

Actually, the more I think about it, the more I doubt those terms are
appropriate (even if they are widely used in technical documents).

How about using the following names instead:

struct mtd_cell_sharing_scheme {
	...
};

struct mtd_cell_sharing_info {
	/* the bit position in the cell */
	int bitpos;
	/*
	 * What was previously known as 'pair': an id representing a
	 * group of cells forming a 'pair of pages'.
	 * I can't find a good description/word for this concept. Do
	 * you have better ideas?
	 */
	int group;
};

What do you think?

> 
> > + */
> > +struct mtd_pairing_info {
> > +	int pair;
> > +	int group;
> > +};
> > +
> > +/**
> > + * struct mtd_pairing_scheme - page pairing scheme description
> > + *
> > + * @ngroups: number of groups. Should be related to the number of bits
> > + *	     per cell.
> > + * @get_info: converts a write-unit (page number within an erase block) into
> > + *	      mtd_pairing information (pair + group). This function should
> > + *	      fill the info parameter based on the wunit index.
> > + * @get_wunit: converts pairing information into a write-unit (page) number.
> > + *	       This function should return the wunit index pointed by the
> > + *	       pairing information described in the info argument. It should
> > + *	       return -EINVAL, if there's no wunit corresponding to the
> > + *	       passed pairing information.
> > + *
> > + * See mtd_pairing_info documentation for a detailed explanation of the
> > + * pair and group concepts.
> > + *
> > + * The mtd_pairing_scheme structure provides a generic solution to represent
> > + * NAND page pairing scheme. Instead of exposing two big tables to do the
> > + * write-unit <-> (pair + group) conversions, we ask the MTD drivers to
> > + * implement the ->get_info() and ->get_wunit() functions.
> > + *
> > + * MTD users will then be able to query these information by using the
> > + * mtd_pairing_info_to_wunit() and mtd_wunit_to_pairing_info() helpers.
> > + *
> > + * @ngroups is here to help MTD users iterating over all the pages in a
> > + * given pair. This value can be retrieved by MTD users using the
> > + * mtd_pairing_groups() helper.
> > + *
> > + * Examples are given in the mtd_pairing_info_to_wunit() and
> > + * mtd_wunit_to_pairing_info() documentation.
> > + */
> > +struct mtd_pairing_scheme {
> > +	int ngroups;
> > +	void (*get_info)(struct mtd_info *mtd, int wunit,
> > +			 struct mtd_pairing_info *info);
> > +	int (*get_wunit)(struct mtd_info *mtd,
> > +			 const struct mtd_pairing_info *info);  
> 
> Wait, I noted above that get_info() doesn't return errors (and that's
> OK, if we do bounds checking in mtdcore), but why does get_wunit(),
> then? From the looks of it, you don't actually do any bounds checking in
> the implementations in patch 2, right? And couldn't we do any checking
> in the mtdcore.c helper anyway?
> 
> Unless I'm misunderstanding something, I think we should have both
> return errors, or neither.

I agree. ->get_info() could fill mtd_pairing_info to reflect an error,
but that's confusing. Let's switch to functions returning int and patch
the implementation to do bounds checking.
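
As an illustration of what "both callbacks return int" could look like, here is a userspace sketch. The structure layout follows struct mtd_pairing_scheme from the patch except for the get_info() return type. The scheme itself is a toy (wunits 2n and 2n+1 share a cell) and is not taken from any datasheet, and all `toy_*` and `*_model` names are hypothetical.

```c
#include <errno.h>

struct mtd_model;

struct pairing_info {
	int pair;
	int group;
};

/* Same shape as struct mtd_pairing_scheme, with get_info() returning int. */
struct pairing_scheme {
	int ngroups;
	int (*get_info)(const struct mtd_model *mtd, int wunit,
			struct pairing_info *info);
	int (*get_wunit)(const struct mtd_model *mtd,
			 const struct pairing_info *info);
};

/* Hypothetical stand-in for the struct mtd_info fields used below. */
struct mtd_model {
	unsigned int erasesize;
	unsigned int writesize;
	const struct pairing_scheme *pairing;
};

/* Toy scheme: wunit -> (pair, group), with a bounds check on wunit. */
static int toy_get_info(const struct mtd_model *mtd, int wunit,
			struct pairing_info *info)
{
	int nwunits = (int)(mtd->erasesize / mtd->writesize);

	if (wunit < 0 || wunit >= nwunits)
		return -EINVAL;

	info->pair = wunit / 2;
	info->group = wunit % 2;
	return 0;
}

/* Toy scheme: (pair, group) -> wunit, with bounds checks on both fields. */
static int toy_get_wunit(const struct mtd_model *mtd,
			 const struct pairing_info *info)
{
	int npairs = (int)(mtd->erasesize / mtd->writesize) / 2;

	if (info->group < 0 || info->group > 1 ||
	    info->pair < 0 || info->pair >= npairs)
		return -EINVAL;

	return info->pair * 2 + info->group;
}

static const struct pairing_scheme toy_scheme = {
	.ngroups = 2,
	.get_info = toy_get_info,
	.get_wunit = toy_get_wunit,
};
```

Both directions then report an out-of-range argument the same way, so a caller does not need to inspect the contents of the info struct to detect a failed lookup.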

> 
> > +};
> > +
> >  struct module;	/* only needed for owner field in mtd_info */
> >  
> >  struct mtd_info {
> > @@ -188,6 +263,9 @@ struct mtd_info {
> >  	/* OOB layout description */
> >  	const struct mtd_ooblayout_ops *ooblayout;
> >  
> > +	/* NAND pairing scheme, only provided for MLC/TLC NANDs */
> > +	const struct mtd_pairing_scheme *pairing;
> > +
> >  	/* the ecc step size. */
> >  	unsigned int ecc_step_size;
> >  
> > @@ -296,6 +374,12 @@ static inline void mtd_set_ooblayout(struct mtd_info *mtd,
> >  	mtd->ooblayout = ooblayout;
> >  }
> >  
> > +static inline void mtd_set_pairing_scheme(struct mtd_info *mtd,
> > +				const struct mtd_pairing_scheme *pairing)
> > +{
> > +	mtd->pairing = pairing;
> > +}
> > +
> >  static inline void mtd_set_of_node(struct mtd_info *mtd,
> >  				   struct device_node *np)
> >  {
> > @@ -312,6 +396,11 @@ static inline int mtd_oobavail(struct mtd_info *mtd, struct mtd_oob_ops *ops)
> >  	return ops->mode == MTD_OPS_AUTO_OOB ? mtd->oobavail : mtd->oobsize;
> >  }
> >  
> > +void mtd_wunit_to_pairing_info(struct mtd_info *mtd, int wunit,
> > +			       struct mtd_pairing_info *info);
> > +int mtd_pairing_info_to_wunit(struct mtd_info *mtd,
> > +			      const struct mtd_pairing_info *info);
> > +int mtd_pairing_groups(struct mtd_info *mtd);
> >  int mtd_erase(struct mtd_info *mtd, struct erase_info *instr);
> >  int mtd_point(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
> >  	      void **virt, resource_size_t *phys);
> > @@ -397,6 +486,23 @@ static inline uint32_t mtd_mod_by_ws(uint64_t sz, struct mtd_info *mtd)
> >  	return do_div(sz, mtd->writesize);
> >  }
> >  
> > +static inline int mtd_wunit_per_eb(struct mtd_info *mtd)
> > +{
> > +	return mtd->erasesize / mtd->writesize;
> > +}
> > +
> > +static inline int mtd_offset_to_wunit(struct mtd_info *mtd, loff_t offs)
> > +{
> > +	return mtd_div_by_ws(mtd_mod_by_eb(offs, mtd), mtd);
> > +}
> > +
> > +static inline loff_t mtd_wunit_to_offset(struct mtd_info *mtd, loff_t base,
> > +					 int wunit)
> > +{
> > +	return base + (wunit * mtd->writesize);
> > +}
> > +
> > +
> >  static inline int mtd_has_oob(const struct mtd_info *mtd)
> >  {
> >  	return mtd->_read_oob && mtd->_write_oob;  
> 
> With the above addressed:
> 
> Reviewed-by: Brian Norris <computersforpeace@gmail.com>

Thanks for the review!

Regards,

Boris

Brian Norris Sept. 1, 2016, 6:15 p.m. UTC | #3
Hi,

I've had this on my plate to respond to for a while now, and I haven't
brought myself to actually care that much about the choice. So I'll
respond now to keep from leaving you hanging, but I'm not sure I'm that
helpful :(

On Tue, Aug 09, 2016 at 12:42:18AM +0200, Boris Brezillon wrote:
> On Thu, 4 Aug 2016 12:37:51 +0800
> Brian Norris <computersforpeace@gmail.com> wrote:
> > On Mon, Jun 20, 2016 at 03:50:16PM +0200, Boris Brezillon wrote:
> 
> > 
> > > + * (3 bits in a single cell). A pair should regroup all pages that are sharing
> > > + * the same cell. Pairs are then indexed in ascending order.
> > > + *
> > > + * @group is defining the position of a page in a given pair. It can also be
> > > + * seen as the bit position in the cell: page attached to bit 0 belongs to
> > > + * group 0, page attached to bit 1 belongs to group 1, etc.
> > > + *
> > > + * Example:
> > > + * The H27UCG8T2BTR-BC datasheet describes the following pairing scheme:
> > > + *
> > > + *		group-0		group-1
> > > + *
> > > + *  pair-0	page-0		page-4
> > > + *  pair-1	page-1		page-5
> > > + *  pair-2	page-2		page-8
> > > + *  ...
> > > + *  pair-127	page-251	page-255
> > > + *
> > > + *
> > > + * Note that the "group" and "pair" terms were extracted from Samsung and
> > > + * Hynix datasheets, and might be referenced under other names in other
> > > + * datasheets (Micron is describing this concept as "shared pages").  
> > 
> > Very, very helpful (to me, even though I'm moderately familiar with the
> > concepts, but hopefully moreso for others who want to read and
> > understand this). Thanks for writing this up.
> 
> Actually, the more I think about it, the more I doubt those terms are
> appropriate (even if they are widely used in technical documents).
> 
> How about using the following names instead:
> 
> struct mtd_cell_sharing_scheme {
> 	...
> };
> 
> struct mtd_cell_sharing_info {
> 	/* the bit position in the cell */
> 	int bitpos;
> 	/*
> 	 * What was previously known as 'pair': an id representing a

Wait, so you're replacing the literature's "pair" term with "group", but
the literature already used "group" to mean something else? That seems
to be an unwise choice. (Or I'm misreading you.)

> 	 * group of cells forming a 'pair of pages'.
> 	 * I can't find a good description/word for this concept. Do
> 	 * you have better ideas?
> 	 */
> 	int group;
> };
> 
> What do you think?

I think there's something to be said for matching the literature out
there, and I personally thought that simply providing a little bit of
clarifying explanation in the comments was sufficient. But if you feel
like choosing a more generic name is better, then that's probably OK
too. So other than the above comment (don't overload terms too freely!),
I'd use your judgment.

FWIW, it still takes me a while to parse what the "pair" and "group" (or
"bitpos" and "group" -- although "bitpos" is actually quite clear, so I
guess I like that) actually mean, so I tend to refer back to these
comments every time I'm reading it.

Brian
Boris Brezillon Sept. 4, 2016, 7:06 p.m. UTC | #4
On Thu, 1 Sep 2016 11:15:24 -0700
Brian Norris <computersforpeace@gmail.com> wrote:

> Hi,
> 
> I've had this on my plate to respond to for a while now, and I haven't
> brought myself to actually care that much about the choice. So I'll
> respond now to keep from leaving you hanging, but I'm not sure I'm that
> helpful :(

No problem. Actually, I've been busy with other problems too.

> 
> On Tue, Aug 09, 2016 at 12:42:18AM +0200, Boris Brezillon wrote:
> > On Thu, 4 Aug 2016 12:37:51 +0800
> > Brian Norris <computersforpeace@gmail.com> wrote:  
> > > On Mon, Jun 20, 2016 at 03:50:16PM +0200, Boris Brezillon wrote:  
> >   
> > >   
> > > > + * (3 bits in a single cell). A pair should regroup all pages that are sharing
> > > > + * the same cell. Pairs are then indexed in ascending order.
> > > > + *
> > > > + * @group is defining the position of a page in a given pair. It can also be
> > > > + * seen as the bit position in the cell: page attached to bit 0 belongs to
> > > > + * group 0, page attached to bit 1 belongs to group 1, etc.
> > > > + *
> > > > + * Example:
> > > > + * The H27UCG8T2BTR-BC datasheet describes the following pairing scheme:
> > > > + *
> > > > + *		group-0		group-1
> > > > + *
> > > > + *  pair-0	page-0		page-4
> > > > + *  pair-1	page-1		page-5
> > > > + *  pair-2	page-2		page-8
> > > > + *  ...
> > > > + *  pair-127	page-251	page-255
> > > > + *
> > > > + *
> > > > + * Note that the "group" and "pair" terms were extracted from Samsung and
> > > > + * Hynix datasheets, and might be referenced under other names in other
> > > > + * datasheets (Micron is describing this concept as "shared pages").    
> > > 
> > > Very, very helpful (to me, even though I'm moderately familiar with the
> > > concepts, but hopefully moreso for others who want to read and
> > > understand this). Thanks for writing this up.  
> > 
> > Actually, the more I think about it, the more I doubt those terms are
> > appropriate (even if they are widely used in technical documents).
> > 
> > How about using the following names instead:
> > 
> > struct mtd_cell_sharing_scheme {
> > 	...
> > };
> > 
> > struct mtd_cell_sharing_info {
> > 	/* the bit position in the cell */
> > 	int bitpos;
> > 	/*
> > 	 * What was previously known as 'pair': an id representing a  
> 
> Wait, so you're replacing the literature's "pair" term with "group", but
> the literature already used "group" to mean something else? That seems
> to be an unwise choice. (Or I'm misreading you.)
> 
> > 	 * group of cells forming a 'pair of pages'.
> > 	 * I can't find a good description/word for this concept. Do
> > 	 * you have better ideas?
> > 	 */
> > 	int group;
> > };
> > 
> > What do you think?  
> 
> I think there's something to be said for matching the literature out
> there, and I personally thought that simply providing a little bit of
> clarifying explanation in the comments was sufficient. But if you feel
> like choosing a more generic name is better, then that's probably OK
> too. So other than the above comment (don't overload terms too freely!),
> I'd use your judgment.
> 
> FWIW, it still takes me a while to parse what the "pair" and "group" (or
> "bitpos" and "group" -- although "bitpos" is actually quite clear, so I
> guess I like that) actually mean, so I tend to refer back to these
> comments every time I'm reading it.

Let's stick to my first proposal. I'll address your comment and send a
new version. If you're happy with it, I'll create a branch that we can
share and ask you to pull it.

Thanks,

Boris
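
For readers who, like Brian, need to re-read the comments to remember what
"pair" and "group" mean, here is a minimal userspace sketch of the two
conversions and the two loops described in the kernel-doc. It assumes a
hypothetical 2-bit scheme in which wunits 2n and 2n+1 share the same cells;
this is NOT the H27UCG8T2BTR-BC layout, and every toy_* name and TOY_*
constant is made up for illustration rather than taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for struct mtd_pairing_info; not the kernel definition. */
struct pairing_info {
	int pair;	/* index of the set of pages sharing the same cells */
	int group;	/* bit position of the page within those cells */
};

#define TOY_NGROUPS		2	/* assumed: 2 bits per cell (MLC) */
#define TOY_WUNITS_PER_EB	8	/* assumed: 8 pages per erase block */
#define TOY_NPAIRS		(TOY_WUNITS_PER_EB / TOY_NGROUPS)

/* wunit -> (pair, group), the role played by ->get_info() */
static void toy_get_info(int wunit, struct pairing_info *info)
{
	assert(info != NULL);
	info->pair = wunit / TOY_NGROUPS;
	info->group = wunit % TOY_NGROUPS;
}

/* (pair, group) -> wunit, the role played by ->get_wunit() */
static int toy_get_wunit(const struct pairing_info *info)
{
	if (info->pair < 0 || info->pair >= TOY_NPAIRS ||
	    info->group < 0 || info->group >= TOY_NGROUPS)
		return -EINVAL;

	return info->pair * TOY_NGROUPS + info->group;
}

/*
 * Collect every wunit sharing cells with @wunit (itself included):
 * keep the pair fixed and walk the groups. Returns the entry count.
 */
static int toy_paired_wunits(int wunit, int out[TOY_NGROUPS])
{
	struct pairing_info info;
	int i;

	toy_get_info(wunit, &info);
	for (i = 0; i < TOY_NGROUPS; i++) {
		info.group = i;
		out[i] = toy_get_wunit(&info);
	}

	return TOY_NGROUPS;
}

/*
 * Emulated-SLC walk: keep the group fixed at 0 and walk the pairs,
 * so only one page per set of shared cells is ever used.
 */
static int toy_slc_wunits(int out[TOY_NPAIRS])
{
	struct pairing_info info = { .pair = 0, .group = 0 };

	for (info.pair = 0; info.pair < TOY_NPAIRS; info.pair++)
		out[info.pair] = toy_get_wunit(&info);

	return TOY_NPAIRS;
}
```

With these assumptions, toy_paired_wunits(5, out) fills out[] with wunits 4
and 5, and toy_slc_wunits() selects wunits 0, 2, 4 and 6. A real driver would
replace the two conversion helpers with whatever mapping its datasheet
describes.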

Patch

diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index e3936b847c6b..decceb9fdf32 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -376,6 +376,100 @@  static int mtd_reboot_notifier(struct notifier_block *n, unsigned long state,
 }
 
 /**
+ * mtd_wunit_to_pairing_info - get pairing information of a wunit
+ * @mtd: pointer to new MTD device info structure
+ * @wunit: write unit we are interested in
+ * @info: pairing information struct
+ *
+ * Retrieve pairing information associated to the wunit.
+ * This is mainly useful when dealing with MLC/TLC NANDs where pages can be
+ * paired together, and where programming a page may influence the page it is
+ * paired with.
+ * The notion of page is replaced by the term wunit (write-unit) to stay
+ * consistent with the ->writesize field.
+ *
+ * The @wunit argument can be extracted from an absolute offset using
+ * mtd_offset_to_wunit(). @info is filled with the pairing information attached
+ * to @wunit.
+ *
+ * From the pairing info the MTD user can find all the wunits paired with
+ * @wunit using the following loop:
+ *
+ * for (i = 0; i < mtd_pairing_groups(mtd); i++) {
+ *	info.group = i;
+ *	mtd_pairing_info_to_wunit(mtd, &info);
+ *	...
+ * }
+ */
+void mtd_wunit_to_pairing_info(struct mtd_info *mtd, int wunit,
+			       struct mtd_pairing_info *info)
+{
+	if (!mtd->pairing || !mtd->pairing->get_info) {
+		info->group = 0;
+		info->pair = wunit;
+	} else {
+		mtd->pairing->get_info(mtd, wunit, info);
+	}
+}
+EXPORT_SYMBOL_GPL(mtd_wunit_to_pairing_info);
+
+/**
+ * mtd_pairing_info_to_wunit - get wunit from pairing information
+ * @mtd: pointer to new MTD device info structure
+ * @info: pairing information struct
+ *
+ * Returns a positive number representing the wunit associated to the info
+ * struct, or a negative error code.
+ *
+ * This is the reverse of mtd_wunit_to_pairing_info(), and can help one to
+ * iterate over all wunits of a given pair (see mtd_wunit_to_pairing_info()
+ * doc).
+ *
+ * It can also be used to only program the first page of each pair (i.e.
+ * page attached to group 0), which allows one to use an MLC NAND in
+ * software-emulated SLC mode:
+ *
+ * info.group = 0;
+ * for (info.pair = 0; info.pair < mtd_wunit_per_eb(mtd); info.pair++) {
+ *	wunit = mtd_pairing_info_to_wunit(mtd, &info);
+ *	mtd_write(mtd, mtd_wunit_to_offset(mtd, blkoffs, wunit),
+ *		  mtd->writesize, &retlen, buf + (info.pair * mtd->writesize));
+ * }
+ */
+int mtd_pairing_info_to_wunit(struct mtd_info *mtd,
+			      const struct mtd_pairing_info *info)
+{
+	if (!mtd->pairing || !mtd->pairing->get_wunit) {
+		if (info->group)
+			return -EINVAL;
+
+		return info->pair;
+	}
+
+	return mtd->pairing->get_wunit(mtd, info);
+}
+EXPORT_SYMBOL_GPL(mtd_pairing_info_to_wunit);
+
+/**
+ * mtd_pairing_groups - get the number of pairing groups
+ * @mtd: pointer to new MTD device info structure
+ *
+ * Returns the number of pairing groups.
+ *
+ * This number is usually equal to the number of bits exposed by a single
+ * cell, and can be used in conjunction with mtd_pairing_info_to_wunit()
+ * to iterate over all pages of a given pair.
+ */
+int mtd_pairing_groups(struct mtd_info *mtd)
+{
+	if (!mtd->pairing || !mtd->pairing->ngroups)
+		return 1;
+
+	return mtd->pairing->ngroups;
+}
+EXPORT_SYMBOL_GPL(mtd_pairing_groups);
+
+/**
  *	add_mtd_device - register an MTD device
  *	@mtd: pointer to new MTD device info structure
  *
diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index 1f13e32556f8..e32a0ac2298f 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -397,6 +397,7 @@  static struct mtd_part *allocate_partition(struct mtd_info *master,
 	slave->mtd.oobsize = master->oobsize;
 	slave->mtd.oobavail = master->oobavail;
 	slave->mtd.subpage_sft = master->subpage_sft;
+	slave->mtd.pairing = master->pairing;
 
 	slave->mtd.name = name;
 	slave->mtd.owner = master->owner;
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index 29a170612203..00bcacb16176 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -127,6 +127,81 @@  struct mtd_ooblayout_ops {
 		    struct mtd_oob_region *oobfree);
 };
 
+/**
+ * struct mtd_pairing_info - page pairing information
+ *
+ * @pair: pair id
+ * @group: group id
+ *
+ * The pair word is used here, even though TLC NANDs might group pages by 3
+ * (3 bits in a single cell). A pair should regroup all pages that are sharing
+ * the same cell. Pairs are then indexed in ascending order.
+ *
+ * @group is defining the position of a page in a given pair. It can also be
+ * seen as the bit position in the cell: page attached to bit 0 belongs to
+ * group 0, page attached to bit 1 belongs to group 1, etc.
+ *
+ * Example:
+ * The H27UCG8T2BTR-BC datasheet describes the following pairing scheme:
+ *
+ *		group-0		group-1
+ *
+ *  pair-0	page-0		page-4
+ *  pair-1	page-1		page-5
+ *  pair-2	page-2		page-8
+ *  ...
+ *  pair-127	page-251	page-255
+ *
+ *
+ * Note that the "group" and "pair" terms were extracted from Samsung and
+ * Hynix datasheets, and might be referenced under other names in other
+ * datasheets (Micron is describing this concept as "shared pages").
+ */
+struct mtd_pairing_info {
+	int pair;
+	int group;
+};
+
+/**
+ * struct mtd_pairing_scheme - page pairing scheme description
+ *
+ * @ngroups: number of groups. Should be related to the number of bits
+ *	     per cell.
+ * @get_info: converts a write-unit (page number within an erase block) into
+ *	      mtd_pairing information (pair + group). This function should
+ *	      fill the info parameter based on the wunit index.
+ * @get_wunit: converts pairing information into a write-unit (page) number.
+ *	       This function should return the wunit index pointed by the
+ *	       pairing information described in the info argument. It should
+ *	       return -EINVAL, if there's no wunit corresponding to the
+ *	       passed pairing information.
+ *
+ * See mtd_pairing_info documentation for a detailed explanation of the
+ * pair and group concepts.
+ *
+ * The mtd_pairing_scheme structure provides a generic solution to represent
+ * NAND page pairing scheme. Instead of exposing two big tables to do the
+ * write-unit <-> (pair + group) conversions, we ask the MTD drivers to
+ * implement the ->get_info() and ->get_wunit() functions.
+ *
+ * MTD users will then be able to query this information by using the
+ * mtd_pairing_info_to_wunit() and mtd_wunit_to_pairing_info() helpers.
+ *
+ * @ngroups is here to help MTD users iterate over all the pages in a
+ * given pair. This value can be retrieved by MTD users using the
+ * mtd_pairing_groups() helper.
+ *
+ * Examples are given in the mtd_pairing_info_to_wunit() and
+ * mtd_wunit_to_pairing_info() documentation.
+ */
+struct mtd_pairing_scheme {
+	int ngroups;
+	void (*get_info)(struct mtd_info *mtd, int wunit,
+			 struct mtd_pairing_info *info);
+	int (*get_wunit)(struct mtd_info *mtd,
+			 const struct mtd_pairing_info *info);
+};
+
 struct module;	/* only needed for owner field in mtd_info */
 
 struct mtd_info {
@@ -188,6 +263,9 @@  struct mtd_info {
 	/* OOB layout description */
 	const struct mtd_ooblayout_ops *ooblayout;
 
+	/* NAND pairing scheme, only provided for MLC/TLC NANDs */
+	const struct mtd_pairing_scheme *pairing;
+
 	/* the ecc step size. */
 	unsigned int ecc_step_size;
 
@@ -296,6 +374,12 @@  static inline void mtd_set_ooblayout(struct mtd_info *mtd,
 	mtd->ooblayout = ooblayout;
 }
 
+static inline void mtd_set_pairing_scheme(struct mtd_info *mtd,
+				const struct mtd_pairing_scheme *pairing)
+{
+	mtd->pairing = pairing;
+}
+
 static inline void mtd_set_of_node(struct mtd_info *mtd,
 				   struct device_node *np)
 {
@@ -312,6 +396,11 @@  static inline int mtd_oobavail(struct mtd_info *mtd, struct mtd_oob_ops *ops)
 	return ops->mode == MTD_OPS_AUTO_OOB ? mtd->oobavail : mtd->oobsize;
 }
 
+void mtd_wunit_to_pairing_info(struct mtd_info *mtd, int wunit,
+			       struct mtd_pairing_info *info);
+int mtd_pairing_info_to_wunit(struct mtd_info *mtd,
+			      const struct mtd_pairing_info *info);
+int mtd_pairing_groups(struct mtd_info *mtd);
 int mtd_erase(struct mtd_info *mtd, struct erase_info *instr);
 int mtd_point(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
 	      void **virt, resource_size_t *phys);
@@ -397,6 +486,23 @@  static inline uint32_t mtd_mod_by_ws(uint64_t sz, struct mtd_info *mtd)
 	return do_div(sz, mtd->writesize);
 }
 
+static inline int mtd_wunit_per_eb(struct mtd_info *mtd)
+{
+	return mtd->erasesize / mtd->writesize;
+}
+
+static inline int mtd_offset_to_wunit(struct mtd_info *mtd, loff_t offs)
+{
+	return mtd_div_by_ws(mtd_mod_by_eb(offs, mtd), mtd);
+}
+
+static inline loff_t mtd_wunit_to_offset(struct mtd_info *mtd, loff_t base,
+					 int wunit)
+{
+	return base + (wunit * mtd->writesize);
+}
+
+
 static inline int mtd_has_oob(const struct mtd_info *mtd)
 {
 	return mtd->_read_oob && mtd->_write_oob;