[10/18] mm: teach truncate_inode_pages_range() to handle non page aligned ranges

Message ID 1359715424-32318-11-git-send-email-lczerner@redhat.com
State Superseded, archived

Commit Message

Lukas Czerner Feb. 1, 2013, 10:43 a.m. UTC
This commit changes truncate_inode_pages_range() so it can handle non
page aligned regions of the truncate. Currently we can hit BUG_ON when
the end of the range is not page aligned, but we can handle unaligned
start of the range.

Being able to handle non page aligned regions of the page can help file
system punch_hole implementations and save some work, because once we're
holding the page we might as well deal with it right away.

In previous commits we've changed ->invalidatepage() prototype to accept
'length' argument to be able to specify range to invalidate. Now we can
use that new ability in truncate_inode_pages_range().

This was based on the code provided by Hugh Dickins with some small
changes to make use of do_invalidatepage_range().

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Hugh Dickins <hughd@google.com>
---
 mm/truncate.c |   73 ++++++++++++++++++++++++++++++++++++---------------------
 1 files changed, 46 insertions(+), 27 deletions(-)

Comments

Andrew Morton Feb. 1, 2013, 11:15 p.m. UTC | #1
On Fri,  1 Feb 2013 11:43:36 +0100
Lukas Czerner <lczerner@redhat.com> wrote:

> This commit changes truncate_inode_pages_range() so it can handle non
> page aligned regions of the truncate. Currently we can hit BUG_ON when
> the end of the range is not page aligned, but we can handle unaligned
> start of the range.
> 
> Being able to handle non page aligned regions of the page can help file
> system punch_hole implementations and save some work, because once we're
> holding the page we might as well deal with it right away.
> 
> In previous commits we've changed ->invalidatepage() prototype to accept
> 'length' argument to be able to specify range to invalidate. Now we can
> use that new ability in truncate_inode_pages_range().

The change seems sensible.

> This was based on the code provided by Hugh Dickins

Despite this ;)

> changes to make use of do_invalidatepage_range().
>
> ...
>
>  void truncate_inode_pages_range(struct address_space *mapping,
>  				loff_t lstart, loff_t lend)
>  {
> -	const pgoff_t start = (lstart + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
> -	const unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
> +	pgoff_t start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
> +	pgoff_t end = (lend + 1) >> PAGE_CACHE_SHIFT;
> +	unsigned int partial_start = lstart & (PAGE_CACHE_SIZE - 1);
> +	unsigned int partial_end = (lend + 1) & (PAGE_CACHE_SIZE - 1);
>  	struct pagevec pvec;
>  	pgoff_t index;
> -	pgoff_t end;
>  	int i;

This is starting to get pretty hairy.  Some of these "end" variables
are inclusive and some are exclusive.

Can we improve things?  We can drop all this tiresome
initialisation-at-declaration-site stuff and do:

	pgoff_t start;			/* inclusive */
	pgoff_t end;			/* exclusive */
	unsigned int partial_start;	/* inclusive */
	unsigned int partial_end;	/* exclusive */
	struct pagevec pvec;
	pgoff_t index;
	int i;

	start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
	end = (lend + 1) >> PAGE_CACHE_SHIFT;
	partial_start = lstart & (PAGE_CACHE_SIZE - 1);
	partial_end = (lend + 1) & (PAGE_CACHE_SIZE - 1);

And lo, I see that the "inclusive" thing only applies to incoming arg
`lend'.  I seem to recall that being my handiwork and somehow I seem to
not have documented the reason: it was so that we can pass
lend=0xffffffff into truncate_inode_pages_range() to indicate "end of
file".

Your code handles this in a rather nasty fashion.  It permits the above
overflow to occur then later fixes it up with an explicit test for -1. 
And it then sets `end' (which is a pgoff_t!) to -1.

I guess this works, but let's make it clearer, with something like:

	if (lend == -1) {
		/*
		 * Nice explanation goes here
		 */
		end = -1;
	} else {
		end = (lend + 1) >> PAGE_CACHE_SHIFT;
	}
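
As a minimal userspace sketch of the special case discussed above (not kernel code): the typedefs, the `sketch_` names and the 4 KiB page size are illustrative assumptions. With a signed offset type, `lend + 1` evaluates to 0 when `lend` is -1, so the unguarded expression yields page index 0 instead of "no upper bound".

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12		/* assumption: 4 KiB pages */

typedef long long sketch_loff_t;	/* stand-in for the kernel's loff_t */
typedef unsigned long sketch_pgoff_t;	/* stand-in for the kernel's pgoff_t */

/* Exclusive page index at which truncation stops, for an inclusive lend. */
static sketch_pgoff_t sketch_end_index(sketch_loff_t lend)
{
	assert(lend >= -1);
	if (lend == -1)
		return (sketch_pgoff_t)-1;	/* "end of file": largest index */
	return (sketch_pgoff_t)((lend + 1) >> SKETCH_PAGE_SHIFT);
}

/* The same expression without the guard; for lend == -1 this gives 0. */
static sketch_pgoff_t sketch_end_index_unguarded(sketch_loff_t lend)
{
	return (sketch_pgoff_t)((lend + 1) >> SKETCH_PAGE_SHIFT);
}
```

Handling -1 before the shift, as in the guarded helper, keeps the sentinel and the arithmetic in one place rather than computing a wrong value first and overwriting it later.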


>  	cleancache_invalidate_inode(mapping);
>  	if (mapping->nrpages == 0)
>  		return;
>  
> -	BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));
> -	end = (lend >> PAGE_CACHE_SHIFT);
> +	if (lend == -1)
> +		end = -1;	/* unsigned, so actually very big */
>  
>  	pagevec_init(&pvec, 0);
>  	index = start;
> -	while (index <= end && pagevec_lookup(&pvec, mapping, index,
> -			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) {
> +	while (index < end && pagevec_lookup(&pvec, mapping, index,
> +			min(end - index, (pgoff_t)PAGEVEC_SIZE))) {

Here, my brain burst.  You've effectively added 1 to (end - index).  Is
that correct?

>  		mem_cgroup_uncharge_start();
>  		for (i = 0; i < pagevec_count(&pvec); i++) {
>  			struct page *page = pvec.pages[i];
>  
>  			/* We rely upon deletion not changing page->index */
>  			index = page->index;
> -			if (index > end)
> +			if (index >= end)

hm.  This change implies that the patch changed `end' from inclusive to
exclusive.  But the patch didn't do that.

>  				break;
>  
>  			if (!trylock_page(page))
> @@ -250,27 +247,51 @@ void truncate_inode_pages_range(struct address_space *mapping,
>  		index++;
>  	}
>  
> -	if (partial) {
> +	if (partial_start) {
>  		struct page *page = find_lock_page(mapping, start - 1);
>  		if (page) {
> +			unsigned int top = PAGE_CACHE_SIZE;
> +			if (start > end) {

How can this be true?

> +				top = partial_end;
> +				partial_end = 0;
> +			}
> +			wait_on_page_writeback(page);
> +			zero_user_segment(page, partial_start, top);
> +			cleancache_invalidate_page(mapping, page);
> +			if (page_has_private(page))
> +				do_invalidatepage(page, partial_start,
> +						  top - partial_start);
> +			unlock_page(page);
> +			page_cache_release(page);
> +		}
> +	}
> +	if (partial_end) {
> +		struct page *page = find_lock_page(mapping, end);
> +		if (page) {
>  			wait_on_page_writeback(page);
> -			truncate_partial_page(page, partial);
> +			zero_user_segment(page, 0, partial_end);
> +			cleancache_invalidate_page(mapping, page);
> +			if (page_has_private(page))
> +				do_invalidatepage(page, 0,
> +						  partial_end);
>  			unlock_page(page);
>  			page_cache_release(page);
>  		}
>  	}
> +	if (start >= end)
> +		return;

Again, how can start be greater than end??

I suspect a lot of the confusion and churn in here is due to `end'
being kinda-exclusive.  If `lend' was 4094 then `end' is zero.  But if
`lend' was 4095 then `end' is 1.  So even though `end' refers to the same
page, it has a different value!

Would the code be simpler and clearer if we were to make `end' "pgoff_t
of the last-affected page", and document it as such?


--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Lukas Czerner Feb. 4, 2013, 2:51 p.m. UTC | #2
On Fri, 1 Feb 2013, Andrew Morton wrote:

> Date: Fri, 1 Feb 2013 15:15:02 -0800
> From: Andrew Morton <akpm@linux-foundation.org>
> To: Lukas Czerner <lczerner@redhat.com>
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
>     linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
>     xfs@oss.sgi.com, Hugh Dickins <hughd@google.com>
> Subject: Re: [PATCH 10/18] mm: teach truncate_inode_pages_range() to handle
>     non page aligned ranges
> 
> On Fri,  1 Feb 2013 11:43:36 +0100
> Lukas Czerner <lczerner@redhat.com> wrote:
> 
> > This commit changes truncate_inode_pages_range() so it can handle non
> > page aligned regions of the truncate. Currently we can hit BUG_ON when
> > the end of the range is not page aligned, but we can handle unaligned
> > start of the range.
> > 
> > Being able to handle non page aligned regions of the page can help file
> > system punch_hole implementations and save some work, because once we're
> > holding the page we might as well deal with it right away.
> > 
> > In previous commits we've changed ->invalidatepage() prototype to accept
> > 'length' argument to be able to specify range to invalidate. Now we can
> > use that new ability in truncate_inode_pages_range().
> 
> The change seems sensible.
> 
> > This was based on the code provided by Hugh Dickins
> 
> Despite this ;)
> 
> > changes to make use of do_invalidatepage_range().
> >
> > ...
> >
> >  void truncate_inode_pages_range(struct address_space *mapping,
> >  				loff_t lstart, loff_t lend)
> >  {
> > -	const pgoff_t start = (lstart + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
> > -	const unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
> > +	pgoff_t start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
> > +	pgoff_t end = (lend + 1) >> PAGE_CACHE_SHIFT;
> > +	unsigned int partial_start = lstart & (PAGE_CACHE_SIZE - 1);
> > +	unsigned int partial_end = (lend + 1) & (PAGE_CACHE_SIZE - 1);
> >  	struct pagevec pvec;
> >  	pgoff_t index;
> > -	pgoff_t end;
> >  	int i;
> 
> This is starting to get pretty hairy.  Some of these "end" variables
> are inclusive and some are exclusive.

Yes, I agree that it's a little bit confusing.

> 
> Can we improve things?  We can drop all this tiresome
> initialisation-at-declaration-site stuff and do:

Yes, I agree that this will make things cleaner.

> 
> 	pgoff_t start;			/* inclusive */
> 	pgoff_t end;			/* exclusive */
> 	unsigned int partial_start;	/* inclusive */
> 	unsigned int partial_end;	/* exclusive */
> 	struct pagevec pvec;
> 	pgoff_t index;
> 	int i;
> 
> 	start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
> 	end = (lend + 1) >> PAGE_CACHE_SHIFT;
> 	partial_start = lstart & (PAGE_CACHE_SIZE - 1);
> 	partial_end = (lend + 1) & (PAGE_CACHE_SIZE - 1);
> 
> And lo, I see that the "inclusive" thing only applies to incoming arg
> `lend'.  I seem to recall that being my handiwork and somehow I seem to
> not have documented the reason: it was so that we can pass
> lend=0xffffffff into truncate_inode_pages_range() to indicate "end of
> file".
> 
> Your code handles this in a rather nasty fashion.  It permits the above
> overflow to occur then later fixes it up with an explicit test for -1. 
> And it then sets `end' (which is a pgoff_t!) to -1.
> 
> I guess this works, but let's make it clearer, with something like:
> 
> 	if (lend == -1) {
> 		/*
> 		 * Nice explanation goes here
> 		 */
> 		end = -1;
> 	} else {
> 		end = (lend + 1) >> PAGE_CACHE_SHIFT;
> 	}

Good point, this is better.

> 
> 
> >  	cleancache_invalidate_inode(mapping);
> >  	if (mapping->nrpages == 0)
> >  		return;
> >  
> > -	BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));
> > -	end = (lend >> PAGE_CACHE_SHIFT);
> > +	if (lend == -1)
> > +		end = -1;	/* unsigned, so actually very big */
> >  
> >  	pagevec_init(&pvec, 0);
> >  	index = start;
> > -	while (index <= end && pagevec_lookup(&pvec, mapping, index,
> > -			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) {
> > +	while (index < end && pagevec_lookup(&pvec, mapping, index,
> > +			min(end - index, (pgoff_t)PAGEVEC_SIZE))) {
> 
> Here, my brain burst.  You've effectively added 1 to (end - index).  Is
> that correct?

Not sure what you mean by that. I have to admit that I've changed
the 'end' variable from the previous inclusive to exclusive for two
reasons. First of all, it makes more sense to me, and second of all, it
solves the pain where we're dealing with the partial truncation within
the first page.
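
A minimal userspace sketch (not kernel code) of why the lookup count stays the same once 'end' becomes exclusive. The function names and `SKETCH_PAGEVEC_SIZE` are illustrative assumptions; the value 14 is assumed for the pagevec capacity.

```c
#include <assert.h>

#define SKETCH_PAGEVEC_SIZE 14UL	/* assumption: pagevec capacity */

static unsigned long sketch_min(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

/* Old form: end_incl is the index of the last page in the range. */
static unsigned long nr_lookup_inclusive(unsigned long index,
					 unsigned long end_incl)
{
	assert(index <= end_incl);
	return sketch_min(end_incl - index, SKETCH_PAGEVEC_SIZE - 1) + 1;
}

/* New form: end_excl is one past the last page in the range. */
static unsigned long nr_lookup_exclusive(unsigned long index,
					 unsigned long end_excl)
{
	assert(index < end_excl);
	return sketch_min(end_excl - index, SKETCH_PAGEVEC_SIZE);
}
```

For any `index <= end_incl`, calling the second helper with `end_incl + 1` returns the same count as the first, so the "+ 1" has moved into the meaning of 'end' rather than being added twice.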

> 
> >  		mem_cgroup_uncharge_start();
> >  		for (i = 0; i < pagevec_count(&pvec); i++) {
> >  			struct page *page = pvec.pages[i];
> >  
> >  			/* We rely upon deletion not changing page->index */
> >  			index = page->index;
> > -			if (index > end)
> > +			if (index >= end)
> 
> hm.  This change implies that the patch changed `end' from inclusive to
> exclusive.  But the patch didn't do that.

Yes, the patch is doing exactly that, but I should have documented it,
I guess. Sorry about that...

> 
> >  				break;
> >  
> >  			if (!trylock_page(page))
> > @@ -250,27 +247,51 @@ void truncate_inode_pages_range(struct address_space *mapping,
> >  		index++;
> >  	}
> >  
> > -	if (partial) {
> > +	if (partial_start) {
> >  		struct page *page = find_lock_page(mapping, start - 1);
> >  		if (page) {
> > +			unsigned int top = PAGE_CACHE_SIZE;
> > +			if (start > end) {
> 
> How can this be true?

It can be true when we're dealing with a partial truncation within a
single page, because 'start' and 'end' cover only the full pages.
Partial pages are covered by 'partial_start' and 'partial_end', and
it is obvious which pages those are: the one before 'start' and/or the
one at 'end'.
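
A minimal userspace sketch (not kernel code) of the single-page case, reusing the arithmetic from the patch. The struct, the function name and the 4 KiB page size are illustrative assumptions. It returns the byte range that would be zeroed in the page at index 'start - 1' when lstart is not page aligned.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1LL << SKETCH_PAGE_SHIFT)

struct sketch_zero_range {
	unsigned int from;	/* first byte to zero within the page */
	unsigned int to;	/* one past the last byte to zero */
};

static struct sketch_zero_range sketch_first_page_zero(long long lstart,
						       long long lend)
{
	unsigned long start, end;
	unsigned int partial_start, partial_end;
	struct sketch_zero_range r;

	assert(lstart >= 0 && lend >= lstart);
	/* Only meaningful when the hole starts in the middle of a page. */
	assert((lstart & (SKETCH_PAGE_SIZE - 1)) != 0);

	start = (unsigned long)((lstart + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT);
	end = (unsigned long)((lend + 1) >> SKETCH_PAGE_SHIFT);
	partial_start = (unsigned int)(lstart & (SKETCH_PAGE_SIZE - 1));
	partial_end = (unsigned int)((lend + 1) & (SKETCH_PAGE_SIZE - 1));

	r.from = partial_start;
	r.to = (unsigned int)SKETCH_PAGE_SIZE;
	if (start > end)
		r.to = partial_end;	/* hole begins and ends inside one page */
	return r;
}
```

With lstart = 100 and lend = 199, 'start' is 1 and 'end' is 0, so 'start > end' holds and only bytes 100 to 199 of page 0 are zeroed. With lend = 8191 instead, 'end' is 2 and the first page is zeroed from byte 100 to the end of the page.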


> 
> > +				top = partial_end;
> > +				partial_end = 0;
> > +			}
> > +			wait_on_page_writeback(page);
> > +			zero_user_segment(page, partial_start, top);
> > +			cleancache_invalidate_page(mapping, page);
> > +			if (page_has_private(page))
> > +				do_invalidatepage(page, partial_start,
> > +						  top - partial_start);
> > +			unlock_page(page);
> > +			page_cache_release(page);
> > +		}
> > +	}
> > +	if (partial_end) {
> > +		struct page *page = find_lock_page(mapping, end);
> > +		if (page) {
> >  			wait_on_page_writeback(page);
> > -			truncate_partial_page(page, partial);
> > +			zero_user_segment(page, 0, partial_end);
> > +			cleancache_invalidate_page(mapping, page);
> > +			if (page_has_private(page))
> > +				do_invalidatepage(page, 0,
> > +						  partial_end);
> >  			unlock_page(page);
> >  			page_cache_release(page);
> >  		}
> >  	}
> > +	if (start >= end)
> > +		return;
> 
> Again, how can start be greater than end??
> 
> I suspect a lot of the confusion and churn in here is due to `end'
> being kinda-exclusive.  If `lend' was 4094 then `end' is zero.  But if
> `lend' was 4095 then `end' is 1.  So even though `end' refers to the same
> page, it has a different value!

As I mentioned above, 'start' and 'end' cover only full pages.
Partial pages are outside the range and those are covered by the
'partial_start' and 'partial_end' variables. Also as you mentioned
'lend' is inclusive.

That said, in your example 'end' does not refer to the same page,
because if 'lend' is 4094 we have a partial truncate (and start-end
does not cover that) and if 'lend' is 4095 we have a full page
truncate (assuming that 'start' is zero) so we cover the whole range
with 'end' being exclusive.
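
A minimal userspace sketch (not kernel code) of the two values of 'lend' in this example. The struct, the function name and the 4 KiB page size are illustrative assumptions.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define SKETCH_PAGE_SIZE  (1LL << SKETCH_PAGE_SHIFT)

struct sketch_tail {
	unsigned long end;		/* exclusive end of the full-page range */
	unsigned int partial_end;	/* bytes to zero in page 'end', or 0 */
};

static struct sketch_tail sketch_tail_of(long long lend)
{
	struct sketch_tail t;

	assert(lend >= 0);
	t.end = (unsigned long)((lend + 1) >> SKETCH_PAGE_SHIFT);
	t.partial_end = (unsigned int)((lend + 1) & (SKETCH_PAGE_SIZE - 1));
	return t;
}
```

For lend = 4094 this gives end = 0 and partial_end = 4095, so page 0 is only zeroed in part. For lend = 4095 it gives end = 1 and partial_end = 0, so page 0 falls inside the full-page range [start, end).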

> 
> Would the code be simpler and clearer if we were to make `end' "pgoff_t
> of the last-affected page", and document it as such?
> 

I am not sure about this. It makes more sense to me to have 'start'
and 'end' cover the range of fully truncated pages, with 'end'
being of course exclusive.

I hope I explained myself well enough :). Are you ok with this kind
of approach? If so, I'll resend the patch set without the
initialisation-at-declaration.

Thanks!
-Lukas
Andrew Morton Feb. 4, 2013, 8:51 p.m. UTC | #3
On Mon, 4 Feb 2013 15:51:19 +0100 (CET)
Lukáš Czerner <lczerner@redhat.com> wrote:

> I hope I explained myself well enough :). Are you ok with this kind
> of approach? If so, I'll resend the patch set without the
> initialisation-at-declaration.

uh, maybe.  Next time I'll apply the patch and look at the end result! 
Try to make it as understandable and (hence) maintainable as possible,
OK?

Lukas Czerner Feb. 5, 2013, 7:14 a.m. UTC | #4
On Mon, 4 Feb 2013, Andrew Morton wrote:

> Date: Mon, 4 Feb 2013 12:51:36 -0800
> From: Andrew Morton <akpm@linux-foundation.org>
> To: Lukáš Czerner <lczerner@redhat.com>
> Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
>     linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org,
>     xfs@oss.sgi.com, Hugh Dickins <hughd@google.com>
> Subject: Re: [PATCH 10/18] mm: teach truncate_inode_pages_range() to handle
>     non page aligned ranges
> 
> On Mon, 4 Feb 2013 15:51:19 +0100 (CET)
> Lukáš Czerner <lczerner@redhat.com> wrote:
> 
> > I hope I explained myself well enough :). Are you ok with this kind
> > of approach? If so, I'll resend the patch set without the
> > initialisation-at-declaration.
> 
> uh, maybe.  Next time I'll apply the patch and look at the end result! 
> Try to make it as understandable and (hence) maintainable as possible,
> OK?

Agreed.

Thanks!
-Lukas

Patch

diff --git a/mm/truncate.c b/mm/truncate.c
index fdba083..57a5ea3 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -52,14 +52,6 @@  void do_invalidatepage(struct page *page, unsigned int offset,
 		(*invalidatepage)(page, offset, length);
 }
 
-static inline void truncate_partial_page(struct page *page, unsigned partial)
-{
-	zero_user_segment(page, partial, PAGE_CACHE_SIZE);
-	cleancache_invalidate_page(page->mapping, page);
-	if (page_has_private(page))
-		do_invalidatepage(page, partial, PAGE_CACHE_SIZE - partial);
-}
-
 /*
  * This cancels just the dirty bit on the kernel page itself, it
  * does NOT actually remove dirty bits on any mmap's that may be
@@ -191,8 +183,8 @@  int invalidate_inode_page(struct page *page)
  * @lend: offset to which to truncate
  *
  * Truncate the page cache, removing the pages that are between
- * specified offsets (and zeroing out partial page
- * (if lstart is not page aligned)).
+ * specified offsets (and zeroing out partial pages
+ * if lstart or lend + 1 is not page aligned).
  *
  * Truncate takes two passes - the first pass is nonblocking.  It will not
  * block on page locks and it will not block on writeback.  The second pass
@@ -203,35 +195,40 @@  int invalidate_inode_page(struct page *page)
  * We pass down the cache-hot hint to the page freeing code.  Even if the
  * mapping is large, it is probably the case that the final pages are the most
  * recently touched, and freeing happens in ascending file offset order.
+ *
+ * Note that since ->invalidatepage() accepts range to invalidate
+ * truncate_inode_pages_range is able to handle cases where lend + 1 is not
+ * page aligned properly.
  */
 void truncate_inode_pages_range(struct address_space *mapping,
 				loff_t lstart, loff_t lend)
 {
-	const pgoff_t start = (lstart + PAGE_CACHE_SIZE-1) >> PAGE_CACHE_SHIFT;
-	const unsigned partial = lstart & (PAGE_CACHE_SIZE - 1);
+	pgoff_t start = (lstart + PAGE_CACHE_SIZE - 1) >> PAGE_CACHE_SHIFT;
+	pgoff_t end = (lend + 1) >> PAGE_CACHE_SHIFT;
+	unsigned int partial_start = lstart & (PAGE_CACHE_SIZE - 1);
+	unsigned int partial_end = (lend + 1) & (PAGE_CACHE_SIZE - 1);
 	struct pagevec pvec;
 	pgoff_t index;
-	pgoff_t end;
 	int i;
 
 	cleancache_invalidate_inode(mapping);
 	if (mapping->nrpages == 0)
 		return;
 
-	BUG_ON((lend & (PAGE_CACHE_SIZE - 1)) != (PAGE_CACHE_SIZE - 1));
-	end = (lend >> PAGE_CACHE_SHIFT);
+	if (lend == -1)
+		end = -1;	/* unsigned, so actually very big */
 
 	pagevec_init(&pvec, 0);
 	index = start;
-	while (index <= end && pagevec_lookup(&pvec, mapping, index,
-			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) {
+	while (index < end && pagevec_lookup(&pvec, mapping, index,
+			min(end - index, (pgoff_t)PAGEVEC_SIZE))) {
 		mem_cgroup_uncharge_start();
 		for (i = 0; i < pagevec_count(&pvec); i++) {
 			struct page *page = pvec.pages[i];
 
 			/* We rely upon deletion not changing page->index */
 			index = page->index;
-			if (index > end)
+			if (index >= end)
 				break;
 
 			if (!trylock_page(page))
@@ -250,27 +247,51 @@  void truncate_inode_pages_range(struct address_space *mapping,
 		index++;
 	}
 
-	if (partial) {
+	if (partial_start) {
 		struct page *page = find_lock_page(mapping, start - 1);
 		if (page) {
+			unsigned int top = PAGE_CACHE_SIZE;
+			if (start > end) {
+				top = partial_end;
+				partial_end = 0;
+			}
+			wait_on_page_writeback(page);
+			zero_user_segment(page, partial_start, top);
+			cleancache_invalidate_page(mapping, page);
+			if (page_has_private(page))
+				do_invalidatepage(page, partial_start,
+						  top - partial_start);
+			unlock_page(page);
+			page_cache_release(page);
+		}
+	}
+	if (partial_end) {
+		struct page *page = find_lock_page(mapping, end);
+		if (page) {
 			wait_on_page_writeback(page);
-			truncate_partial_page(page, partial);
+			zero_user_segment(page, 0, partial_end);
+			cleancache_invalidate_page(mapping, page);
+			if (page_has_private(page))
+				do_invalidatepage(page, 0,
+						  partial_end);
 			unlock_page(page);
 			page_cache_release(page);
 		}
 	}
+	if (start >= end)
+		return;
 
 	index = start;
 	for ( ; ; ) {
 		cond_resched();
 		if (!pagevec_lookup(&pvec, mapping, index,
-			min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1)) {
+			min(end - index, (pgoff_t)PAGEVEC_SIZE))) {
 			if (index == start)
 				break;
 			index = start;
 			continue;
 		}
-		if (index == start && pvec.pages[0]->index > end) {
+		if (index == start && pvec.pages[0]->index >= end) {
 			pagevec_release(&pvec);
 			break;
 		}
@@ -280,7 +301,7 @@  void truncate_inode_pages_range(struct address_space *mapping,
 
 			/* We rely upon deletion not changing page->index */
 			index = page->index;
-			if (index > end)
+			if (index >= end)
 				break;
 
 			lock_page(page);
@@ -601,10 +622,8 @@  void truncate_pagecache_range(struct inode *inode, loff_t lstart, loff_t lend)
 	 * This rounding is currently just for example: unmap_mapping_range
 	 * expands its hole outwards, whereas we want it to contract the hole
 	 * inwards.  However, existing callers of truncate_pagecache_range are
-	 * doing their own page rounding first; and truncate_inode_pages_range
-	 * currently BUGs if lend is not pagealigned-1 (it handles partial
-	 * page at start of hole, but not partial page at end of hole).  Note
-	 * unmap_mapping_range allows holelen 0 for all, and we allow lend -1.
+	 * doing their own page rounding first.  Note that unmap_mapping_range
+	 * allows holelen 0 for all, and we allow lend -1 for end of file.
 	 */
 
 	/*