diff mbox

[U-Boot,RFC,V2,1/3] drivers: block: add block device cache

Message ID 1458524727-4643-2-git-send-email-eric@nelint.com
State RFC
Delegated to: Tom Rini
Headers show

Commit Message

Eric Nelson March 21, 2016, 1:45 a.m. UTC
Add a block device cache to speed up repeated reads of block devices by 
various filesystems.

This small amount of cache can dramatically speed up filesystem
operations by skipping repeated reads of common areas of a block
device (typically directory structures).

This has shown to have some benefit on FAT filesystem operations of
loading a kernel and RAM disk, but more dramatic benefits on ext4
filesystems when the kernel and/or RAM disk are spread across 
multiple extent header structures as described in commit fc0fc50.

The cache is implemented through a minimal list (block_cache) maintained
in most-recently-used order and count of the current number of entries
(cache_count). It uses a maximum block count setting to prevent copies
of large block reads and an upper bound on the number of cached areas.

The maximum number of entries in the cache defaults to 4 and the maximum
number of blocks per cache entry has a default of 1, which matches the
current implementation of at least FAT and ext4 that only read a single
block at a time.

The 'blkcache' command (enabled through CONFIG_CMD_BLOCK_CACHE) allows
changing these values and can be used to tune for a particular filesystem
layout.

Signed-off-by: Eric Nelson <eric@nelint.com>
---
 drivers/block/Makefile      |   1 +
 drivers/block/cache_block.c | 240 ++++++++++++++++++++++++++++++++++++++++++++
 include/part.h              |  69 +++++++++++++
 3 files changed, 310 insertions(+)
 create mode 100644 drivers/block/cache_block.c

Comments

Eric Nelson March 21, 2016, 5:59 p.m. UTC | #1
On 03/20/2016 06:45 PM, Eric Nelson wrote:
> Add a block device cache to speed up repeated reads of block devices by 
> various filesystems.
> 
...
> 
> Signed-off-by: Eric Nelson <eric@nelint.com>
> ---
>  drivers/block/Makefile      |   1 +
>  drivers/block/cache_block.c | 240 ++++++++++++++++++++++++++++++++++++++++++++
>  include/part.h              |  69 +++++++++++++
>  3 files changed, 310 insertions(+)
>  create mode 100644 drivers/block/cache_block.c
> 
...

> diff --git a/include/part.h b/include/part.h
> index dc8e72e..1ac73dcc 100644
> --- a/include/part.h
> +++ b/include/part.h
> @@ -376,4 +376,73 @@ int gpt_verify_partitions(struct blk_desc *dev_desc,
>  			  gpt_header *gpt_head, gpt_entry **gpt_pte);
>  #endif
>  

I think this stuff now belongs in blk.h instead of part.h:

> +#ifdef CONFIG_BLOCK_CACHE
> +/**
> + * cache_block_read() - attempt to read a set of blocks from cache
> + *
> + * @param iftype - IF_TYPE_x for type of device
Stephen Warren March 23, 2016, 5:22 p.m. UTC | #2
On 03/20/2016 07:45 PM, Eric Nelson wrote:
> Add a block device cache to speed up repeated reads of block devices by
> various filesystems.
>
> This small amount of cache can dramatically speed up filesystem
> operations by skipping repeated reads of common areas of a block
> device (typically directory structures).
>
> This has shown to have some benefit on FAT filesystem operations of
> loading a kernel and RAM disk, but more dramatic benefits on ext4
> filesystems when the kernel and/or RAM disk are spread across
> multiple extent header structures as described in commit fc0fc50.
>
> The cache is implemented through a minimal list (block_cache) maintained
> in most-recently-used order and count of the current number of entries
> (cache_count). It uses a maximum block count setting to prevent copies
> of large block reads and an upper bound on the number of cached areas.
>
> The maximum number of entries in the cache defaults to 4 and the maximum
> number of blocks per cache entry has a default of 1, which matches the
> current implementation of at least FAT and ext4 that only read a single
> block at a time.

If the maximum size of the cache is fixed and small (which judging by 
the description it is), why not use an array rather than a linked list. 
That would be far simpler to manage.

> The 'blkcache' command (enabled through CONFIG_CMD_BLOCK_CACHE) allows
> changing these values and can be used to tune for a particular filesystem
> layout.

Even with this dynamic adjustment, using an array still feels simpler, 
although I have't looked at the code yet.

> diff --git a/drivers/block/cache_block.c b/drivers/block/cache_block.c

> +/*
> + * search for a set of blocks in the cache
> + *
> + * remove and return the node if found so it can be
> + * added back at the start of the list and maintain MRU order)
> + */

Splitting the remove/add feels a bit dangerous, since forgetting to 
re-do the add (or e.g. some error causing that to be skipped) could 
cause cache_count to become inconsistent.

The function name "find" is a bit misleading. cache_find_and_remove() 
would make its semantics more obvious. Or, simply put the list_del() 
somewhere else; it doesn't feel like part of a "find" operation to me. 
Or, put the entire move-to-front operation inside this function so it 
isn't split up - if so, cache_find_and_move_to_head().

> +static struct block_cache_node *cache_find
> +	(int iftype, int devnum,

Nit: the ( and first n arguments shouldn't be wrapped.

> +int cache_block_read(int iftype, int devnum,
> +		     lbaint_t start, lbaint_t blkcnt,
> +		     unsigned long blksz, void *buffer)

> +		memcpy(buffer, src, blksz*blkcnt);

Nit: Space around operators. checkpatch should catch this.

> +		if (trace)
> +			printf("hit: start " LBAF ", count " LBAFU "\n",
> +			       start, blkcnt);

I guess I'll find that trace can be adjusted at run-time somewhere later 
in this patch, but it's more typical to use debug() without the if 
statement for this. It would be awesome if debug() could be adjusted at 
run-time...

> +void cache_block_fill(int iftype, int devnum,
...
> +	if (node->cache == 0) {
> +		node->cache = malloc(bytes);
> +		if (!node->cache) {

Nit: may as well be consistent with those NULL checks.

> +void cache_block_invalidate(int iftype, int devnum)
...
> +void cache_block_invalidate_all(void)

There's no invalidate_blocks(); I imagine that writing-to/erasing (some 
blocks of) a device must call cache_block_invalidate() which will wipe 
out even data that wasn't written.

> +static void dump_block_cache(void)
...
> +	list_for_each_safe(entry, n, &block_cache) {

Nit: This doesn't need to be _safe since the list is not being modified.

> +static int do_blkcache(cmd_tbl_t *cmdtp, int flag,
> +		       int argc, char * const argv[])
> +{
> +	if ((argc == 1) || !strcmp("show", argv[1])) {
> +		printf("block cache:\n"
> +		       "%u\thits\n"
> +		       "%u\tmisses\n"
> +		       "%u\tentries in cache\n"
> +		       "trace %s\n"
> +		       "max blocks/entry %lu\n"
> +		       "max entries %lu\n",
> +		       block_cache_hits, block_cache_misses, cache_count,
> +		       trace ? "on" : "off",
> +		       max_blocks_per_entry, max_cache_entries);

Nit: Let's print the value and "label" in a consistent order. I would 
suggest everything being formatted as:

"    description: %u"

so that it's indented, has a delimiter between description and value, 
and so that irrespective of the length of the converted value, the 
indentation of the other parts of the string don't change (\t doesn't 
guarantee this after a certain number length).

I would rather have expected "show" to dump the entries too, but I 
suppose it's fine that they're separate.

> +	} else if ((argc == 2) && ('c' == argv[1][0])) {

Nit: the value of 'c' is always 'c', so it's pointless to test whether 
it's equal to something. The value being compared is arv[1][0], so the 
parameters to == should be swapped. I'm aware that == is mathematically 
commutative, but typical English reading of the operator is "is equal 
to" which has non-commutative implications re: what is being tested. I'm 
also aware that this construct is used to trigger compiler warnings if 
someone types = instead of ==. However, (a) you didn't and (b) compilers 
warn about this these days, so there's no need to write unusual code to 
get that feature anymore.

I worry that being imprecise in command-line parsing (i.e. treating both 
"crud" and "clear" as the same thing) will lead to problems expanding 
the command-line format in the future; we won't be able to ever add 
another option starting with "c" and maintain the same abbreviation 
semantics. I'd suggest requiring the full option name always.

> +	} else if ((argc == 3) && isdigit(argv[1][0]) && isdigit(argv[2][0])) {

Let's make this "blkcache set" or "blkcache configure" so that other 
sub-commands can take numeric parameters in the future, and have 
consistent command-line syntax.

> +U_BOOT_CMD(
> +	blkcache,	3,	0,	do_blkcache,
> +	"block cache control/statistics",
> +	"[show] - show statistics\n"
> +	"blkcache c[lear] - clear statistics\n"
> +	"blkcache d[ump] - dump cache entries\n"
> +	"blkcache i[nvalidate] - invalidate cache\n"
> +	"blkcache #maxblocks #maxentries\n"
> +	"\tset maximum blocks per cache entry, maximum entries\n"
> +	"blkcache t[race] [off] - enable (disable) tracing"
> +);

BTW, isn't there some support in U-Boot for sub-commands already, so you 
can write a separate function per sub-command and skip writing all the 
strcmp logic to select between them? I'm pretty sure I saw that somewhere.

> diff --git a/include/part.h b/include/part.h

> +static inline void cache_block_invalidate_all(void){}

Is that useful outside of the debug commands?
Eric Nelson March 23, 2016, 5:43 p.m. UTC | #3
Hi Stephen,

Thanks again for the detailed review.

On 03/23/2016 10:22 AM, Stephen Warren wrote:
> On 03/20/2016 07:45 PM, Eric Nelson wrote:
>> Add a block device cache to speed up repeated reads of block devices by
>> various filesystems.
>>
>> This small amount of cache can dramatically speed up filesystem
>> operations by skipping repeated reads of common areas of a block
>> device (typically directory structures).
>>
>> This has shown to have some benefit on FAT filesystem operations of
>> loading a kernel and RAM disk, but more dramatic benefits on ext4
>> filesystems when the kernel and/or RAM disk are spread across
>> multiple extent header structures as described in commit fc0fc50.
>>
>> The cache is implemented through a minimal list (block_cache) maintained
>> in most-recently-used order and count of the current number of entries
>> (cache_count). It uses a maximum block count setting to prevent copies
>> of large block reads and an upper bound on the number of cached areas.
>>
>> The maximum number of entries in the cache defaults to 4 and the maximum
>> number of blocks per cache entry has a default of 1, which matches the
>> current implementation of at least FAT and ext4 that only read a single
>> block at a time.
> 
> If the maximum size of the cache is fixed and small (which judging by
> the description it is), why not use an array rather than a linked list.
> That would be far simpler to manage.
> 

You seem to agree with Marek, and I must be dain-bramaged because I
just don't see it.

>> The 'blkcache' command (enabled through CONFIG_CMD_BLOCK_CACHE) allows
>> changing these values and can be used to tune for a particular filesystem
>> layout.
> 
> Even with this dynamic adjustment, using an array still feels simpler,
> although I have't looked at the code yet.
> 
>> diff --git a/drivers/block/cache_block.c b/drivers/block/cache_block.c
> 
>> +/*
>> + * search for a set of blocks in the cache
>> + *
>> + * remove and return the node if found so it can be
>> + * added back at the start of the list and maintain MRU order)
>> + */
> 
> Splitting the remove/add feels a bit dangerous, since forgetting to
> re-do the add (or e.g. some error causing that to be skipped) could
> cause cache_count to become inconsistent.
> 
> The function name "find" is a bit misleading. cache_find_and_remove()
> would make its semantics more obvious. Or, simply put the list_del()
> somewhere else; it doesn't feel like part of a "find" operation to me.
> Or, put the entire move-to-front operation inside this function so it
> isn't split up - if so, cache_find_and_move_to_head().
>

You're right. I'll look at that for a V3 (hopefully non-RFC)
patch set.

>> +static struct block_cache_node *cache_find
>> +    (int iftype, int devnum,
> 
> Nit: the ( and first n arguments shouldn't be wrapped.
> 
>> +int cache_block_read(int iftype, int devnum,
>> +             lbaint_t start, lbaint_t blkcnt,
>> +             unsigned long blksz, void *buffer)
> 
>> +        memcpy(buffer, src, blksz*blkcnt);
> 
> Nit: Space around operators. checkpatch should catch this.
> 

Thanks.

>> +        if (trace)
>> +            printf("hit: start " LBAF ", count " LBAFU "\n",
>> +                   start, blkcnt);
> 
> I guess I'll find that trace can be adjusted at run-time somewhere later
> in this patch, but it's more typical to use debug() without the if
> statement for this. It would be awesome if debug() could be adjusted at
> run-time...
> 

I wanted to keep this as a run-time thing because it's useful in
tuning things and I'm not yet certain if/when users might need
to update the sizes.

I could wrap these with some #ifdef stuff but I'm not certain
which is the right thing to do.

>> +void cache_block_fill(int iftype, int devnum,
> ...
>> +    if (node->cache == 0) {
>> +        node->cache = malloc(bytes);
>> +        if (!node->cache) {
> 
> Nit: may as well be consistent with those NULL checks.
> 

Yep.

>> +void cache_block_invalidate(int iftype, int devnum)
> ...
>> +void cache_block_invalidate_all(void)
> 
> There's no invalidate_blocks(); I imagine that writing-to/erasing (some
> blocks of) a device must call cache_block_invalidate() which will wipe
> out even data that wasn't written.
> 

Right. I figured that this wasn't worth extra code.

The 99.99% use case for U-Boot is read-only, so optimizing this
just seems like bloat.

>> +static void dump_block_cache(void)
> ...
>> +    list_for_each_safe(entry, n, &block_cache) {
> 
> Nit: This doesn't need to be _safe since the list is not being modified.
> 

Good catch.

>> +static int do_blkcache(cmd_tbl_t *cmdtp, int flag,
>> +               int argc, char * const argv[])
>> +{
>> +    if ((argc == 1) || !strcmp("show", argv[1])) {
>> +        printf("block cache:\n"
>> +               "%u\thits\n"
>> +               "%u\tmisses\n"
>> +               "%u\tentries in cache\n"
>> +               "trace %s\n"
>> +               "max blocks/entry %lu\n"
>> +               "max entries %lu\n",
>> +               block_cache_hits, block_cache_misses, cache_count,
>> +               trace ? "on" : "off",
>> +               max_blocks_per_entry, max_cache_entries);
> 
> Nit: Let's print the value and "label" in a consistent order. I would
> suggest everything being formatted as:
> 
> "    description: %u"
> 
> so that it's indented, has a delimiter between description and value,
> and so that irrespective of the length of the converted value, the
> indentation of the other parts of the string don't change (\t doesn't
> guarantee this after a certain number length).
> 

Thanks.

> I would rather have expected "show" to dump the entries too, but I
> suppose it's fine that they're separate.
> 

While testing, I found myself mostly using 'show' to see results.

I added dump as a debug tool and almost removed it, but figured this
was still an RFC, so I left it in.

>> +    } else if ((argc == 2) && ('c' == argv[1][0])) {
> 
> Nit: the value of 'c' is always 'c', so it's pointless to test whether
> it's equal to something. The value being compared is arv[1][0], so the
> parameters to == should be swapped. I'm aware that == is mathematically
> commutative, but typical English reading of the operator is "is equal
> to" which has non-commutative implications re: what is being tested. I'm
> also aware that this construct is used to trigger compiler warnings if
> someone types = instead of ==. However, (a) you didn't and (b) compilers
> warn about this these days, so there's no need to write unusual code to
> get that feature anymore.
> 

:) muscle-memory again.

It's usually Marek that catches my yoda-expressions, which are usually
tests against zero.

> I worry that being imprecise in command-line parsing (i.e. treating both
> "crud" and "clear" as the same thing) will lead to problems expanding
> the command-line format in the future; we won't be able to ever add
> another option starting with "c" and maintain the same abbreviation
> semantics. I'd suggest requiring the full option name always.
> 
>> +    } else if ((argc == 3) && isdigit(argv[1][0]) &&
>> isdigit(argv[2][0])) {
> 
> Let's make this "blkcache set" or "blkcache configure" so that other
> sub-commands can take numeric parameters in the future, and have
> consistent command-line syntax.
> 

Sounds good to me.

>> +U_BOOT_CMD(
>> +    blkcache,    3,    0,    do_blkcache,
>> +    "block cache control/statistics",
>> +    "[show] - show statistics\n"
>> +    "blkcache c[lear] - clear statistics\n"
>> +    "blkcache d[ump] - dump cache entries\n"
>> +    "blkcache i[nvalidate] - invalidate cache\n"
>> +    "blkcache #maxblocks #maxentries\n"
>> +    "\tset maximum blocks per cache entry, maximum entries\n"
>> +    "blkcache t[race] [off] - enable (disable) tracing"
>> +);
> 
> BTW, isn't there some support in U-Boot for sub-commands already, so you
> can write a separate function per sub-command and skip writing all the
> strcmp logic to select between them? I'm pretty sure I saw that somewhere.
> 

There is, but I haven't (yet) used it. I'll take a look at this in V3.

>> diff --git a/include/part.h b/include/part.h
> 
>> +static inline void cache_block_invalidate_all(void){}
> 
> Is that useful outside of the debug commands?
> 

Not at the moment, and it probably won't be needed when I figure
out how this glues to the block layer and/or drivers in a cleaner
way.

Regards,


Eric
diff mbox

Patch

diff --git a/drivers/block/Makefile b/drivers/block/Makefile
index b5c7ae1..056a48b 100644
--- a/drivers/block/Makefile
+++ b/drivers/block/Makefile
@@ -24,3 +24,4 @@  obj-$(CONFIG_IDE_SIL680) += sil680.o
 obj-$(CONFIG_SANDBOX) += sandbox.o
 obj-$(CONFIG_SCSI_SYM53C8XX) += sym53c8xx.o
 obj-$(CONFIG_SYSTEMACE) += systemace.o
+obj-$(CONFIG_BLOCK_CACHE) += cache_block.o
diff --git a/drivers/block/cache_block.c b/drivers/block/cache_block.c
new file mode 100644
index 0000000..e4ebeda
--- /dev/null
+++ b/drivers/block/cache_block.c
@@ -0,0 +1,240 @@ 
+/*
+ * Copyright (C) Nelson Integration, LLC 2016
+ * Author: Eric Nelson<eric@nelint.com>
+ *
+ * SPDX-License-Identifier:	GPL-2.0+
+ *
+ */
+#include <config.h>
+#include <common.h>
+#include <malloc.h>
+#include <part.h>
+#include <linux/ctype.h>
+#include <linux/list.h>
+
+struct block_cache_node {
+	struct list_head lh;
+	int iftype;
+	int devnum;
+	lbaint_t start;
+	lbaint_t blkcnt;
+	unsigned long blksz;
+	char *cache;
+};
+
+static LIST_HEAD(block_cache);
+static unsigned cache_count;
+
+static unsigned long max_blocks_per_entry = 1;
+static unsigned long max_cache_entries = 4;
+
+static unsigned block_cache_misses;
+static unsigned block_cache_hits;
+
+static int trace;
+
+/*
+ * search for a set of blocks in the cache
+ *
+ * remove and return the node if found so it can be
+ * added back at the start of the list and maintain MRU order)
+ */
+static struct block_cache_node *cache_find
+	(int iftype, int devnum,
+	 lbaint_t start, lbaint_t blkcnt,
+	 unsigned long blksz)
+{
+	struct block_cache_node *node;
+
+	list_for_each_entry(node, &block_cache, lh)
+		if ((node->iftype == iftype) &&
+		    (node->devnum == devnum) &&
+		    (node->blksz == blksz) &&
+		    (node->start <= start) &&
+		    (node->start + node->blkcnt >= start + blkcnt)) {
+			list_del(&node->lh);
+			return node;
+		}
+	return 0;
+}
+
+int cache_block_read(int iftype, int devnum,
+		     lbaint_t start, lbaint_t blkcnt,
+		     unsigned long blksz, void *buffer)
+{
+	struct block_cache_node *node = cache_find(iftype, devnum, start,
+						   blkcnt, blksz);
+	if (node) {
+		const char *src = node->cache + (start - node->start) * blksz;
+		list_add(&node->lh, &block_cache);
+		memcpy(buffer, src, blksz*blkcnt);
+		if (trace)
+			printf("hit: start " LBAF ", count " LBAFU "\n",
+			       start, blkcnt);
+		++block_cache_hits;
+		return 1;
+	}
+
+	if (trace)
+		printf("miss: start " LBAF ", count " LBAFU "\n",
+		       start, blkcnt);
+	++block_cache_misses;
+	return 0;
+}
+
+void cache_block_fill(int iftype, int devnum,
+		      lbaint_t start, lbaint_t blkcnt,
+		      unsigned long blksz, void const *buffer)
+{
+	lbaint_t bytes;
+	struct block_cache_node *node;
+
+	/* don't cache big stuff */
+	if (blkcnt > max_blocks_per_entry)
+		return;
+
+	if (max_cache_entries == 0)
+		return;
+
+	bytes = blksz * blkcnt;
+	if (max_cache_entries <= cache_count) {
+		/* pop LRU */
+		node = (struct block_cache_node *)block_cache.prev;
+		list_del(&node->lh);
+		cache_count--;
+		if (node->blkcnt * node->blksz < bytes) {
+			free(node->cache);
+			node->cache = 0;
+		}
+	} else {
+		node = malloc(sizeof(*node));
+		if (!node)
+			return;
+		node->cache = 0;
+	}
+
+	if (node->cache == 0) {
+		node->cache = malloc(bytes);
+		if (!node->cache) {
+			free(node);
+			return;
+		}
+	}
+
+	if (trace)
+		printf("fill: start " LBAF ", count " LBAFU "\n",
+		       start, blkcnt);
+
+	node->iftype = iftype;
+	node->devnum = devnum;
+	node->start = start;
+	node->blkcnt = blkcnt;
+	node->blksz = blksz;
+	memcpy(node->cache, buffer, bytes);
+	list_add(&node->lh, &block_cache);
+	cache_count++;
+}
+
+void cache_block_invalidate(int iftype, int devnum)
+{
+	struct list_head *entry, *n;
+	struct block_cache_node *node;
+
+	list_for_each_safe(entry, n, &block_cache) {
+		node = (struct block_cache_node *)entry;
+		if ((node->iftype == iftype) &&
+		    (node->devnum == devnum)) {
+			list_del(entry);
+			free(node->cache);
+			free(node);
+			--cache_count;
+		}
+	}
+}
+
+void cache_block_invalidate_all(void)
+{
+	struct list_head *entry, *n;
+	struct block_cache_node *node;
+
+	list_for_each_safe(entry, n, &block_cache) {
+		node = (struct block_cache_node *)entry;
+		list_del(entry);
+		free(node->cache);
+		free(node);
+	}
+
+	cache_count = 0;
+}
+
+#ifdef CONFIG_CMD_BLOCK_CACHE
+
+static void dump_block_cache(void)
+{
+	struct list_head *entry, *n;
+	struct block_cache_node *node;
+	int i = 0;
+
+	list_for_each_safe(entry, n, &block_cache) {
+		node = (struct block_cache_node *)entry;
+		printf("----- cache entry[%d]\n", i++);
+		printf("iftype %d\n", node->iftype);
+		printf("devnum %d\n", node->devnum);
+		printf("blksize " LBAFU "\n", node->blksz);
+		printf("start " LBAF "\n", node->start);
+		printf("count " LBAFU "\n", node->blkcnt);
+	}
+}
+
+static int do_blkcache(cmd_tbl_t *cmdtp, int flag,
+		       int argc, char * const argv[])
+{
+	if ((argc == 1) || !strcmp("show", argv[1])) {
+		printf("block cache:\n"
+		       "%u\thits\n"
+		       "%u\tmisses\n"
+		       "%u\tentries in cache\n"
+		       "trace %s\n"
+		       "max blocks/entry %lu\n"
+		       "max entries %lu\n",
+		       block_cache_hits, block_cache_misses, cache_count,
+		       trace ? "on" : "off",
+		       max_blocks_per_entry, max_cache_entries);
+	} else if ((argc == 2) && ('c' == argv[1][0])) {
+		block_cache_hits = 0;
+		block_cache_misses = 0;
+	} else if ((argc == 2) && ('d' == argv[1][0])) {
+		dump_block_cache();
+	} else if ((argc == 2) && ('i' == argv[1][0])) {
+		cache_block_invalidate_all();
+	} else if ((argc >= 2) && ('t' == argv[1][0])) {
+		if ((argc == 3) && !strcmp("off", argv[2]))
+			trace = 0;
+		else
+			trace = 1;
+	} else if ((argc == 3) && isdigit(argv[1][0]) && isdigit(argv[2][0])) {
+		max_blocks_per_entry = simple_strtoul(argv[1], 0, 0);
+		max_cache_entries = simple_strtoul(argv[2], 0, 0);
+		cache_block_invalidate_all();
+		printf("changed to max of %lu entries of %lu blocks each\n",
+		       max_cache_entries, max_blocks_per_entry);
+	} else {
+		return CMD_RET_USAGE;
+	}
+
+	return 0;
+}
+
+U_BOOT_CMD(
+	blkcache,	3,	0,	do_blkcache,
+	"block cache control/statistics",
+	"[show] - show statistics\n"
+	"blkcache c[lear] - clear statistics\n"
+	"blkcache d[ump] - dump cache entries\n"
+	"blkcache i[nvalidate] - invalidate cache\n"
+	"blkcache #maxblocks #maxentries\n"
+	"\tset maximum blocks per cache entry, maximum entries\n"
+	"blkcache t[race] [off] - enable (disable) tracing"
+);
+
+#endif
diff --git a/include/part.h b/include/part.h
index dc8e72e..1ac73dcc 100644
--- a/include/part.h
+++ b/include/part.h
@@ -376,4 +376,73 @@  int gpt_verify_partitions(struct blk_desc *dev_desc,
 			  gpt_header *gpt_head, gpt_entry **gpt_pte);
 #endif
 
+#ifdef CONFIG_BLOCK_CACHE
+/**
+ * cache_block_read() - attempt to read a set of blocks from cache
+ *
+ * @param iftype - IF_TYPE_x for type of device
+ * @param dev - device index of particular type
+ * @param start - starting block number
+ * @param blkcnt - number of blocks to read
+ * @param blksz - size in bytes of each block
+ * @param buf - buffer to contain cached data
+ *
+ * @return - '1' if block returned from cache, '0' otherwise.
+ */
+int cache_block_read
+	(int iftype, int dev,
+	 lbaint_t start, lbaint_t blkcnt,
+	 unsigned long blksz, void *buffer);
+
+/**
+ * cache_block_fill() - make data read from a block device available
+ * to the block cache
+ *
+ * @param iftype - IF_TYPE_x for type of device
+ * @param dev - device index of particular type
+ * @param start - starting block number
+ * @param blkcnt - number of blocks available
+ * @param blksz - size in bytes of each block
+ * @param buf - buffer containing data to cache
+ *
+ */
+void cache_block_fill
+	(int iftype, int dev,
+	 lbaint_t start, lbaint_t blkcnt,
+	 unsigned long blksz, void const *buffer);
+
+/**
+ * cache_block_invalidate() - discard the cache for a set of blocks
+ * because of a write or device (re)initialization.
+ *
+ * @param iftype - IF_TYPE_x for type of device
+ * @param dev - device index of particular type
+ */
+void cache_block_invalidate
+	(int iftype, int dev);
+
+void cache_block_invalidate_all(void);
+
+#else
+
+static inline int cache_block_read
+	(int iftype, int dev,
+	 lbaint_t start, lbaint_t blkcnt,
+	 unsigned long blksz, void *buffer)
+{
+	return 0;
+}
+
+static inline void cache_block_fill
+	(int iftype, int dev,
+	 lbaint_t start, lbaint_t blkcnt,
+	 unsigned long blksz, void const *buffer) {}
+
+static inline void cache_block_invalidate
+	(int iftype, int dev) {}
+
+static inline void cache_block_invalidate_all(void){}
+
+#endif
+
 #endif /* _PART_H */