# manual: Document mprotect and introduce section on memory protection

Message ID 20171114142614.BFCBD4010386D@oldenburg.str.redhat.com New show manual: Document mprotect and introduce section on memory protection show

## Commit Message

Florian Weimer Nov. 14, 2017, 2:26 p.m.
2017-11-14  Florian Weimer  <fweimer@redhat.com>

manual: Document mprotect
* manual/memory.texi (Memory Protection): New section.
* manual/llio.texi (Memory-mapped I/O): Reference this section.
Remove duplicate documentation of PROT_* flags.


## Comments

Rical Jasan Nov. 15, 2017, 6:44 a.m. | #1
On 11/14/2017 06:26 AM, Florian Weimer wrote:
> 2017-11-14  Florian Weimer  <fweimer@redhat.com>
>
> 	manual: Document mprotect
> 	* manual/memory.texi (Memory Protection): New section.
> 	* manual/llio.texi (Memory-mapped I/O): Reference this section.
> 	Remove duplicate documentation of PROT_* flags.

"this section" sounds like it refers to Memory-mapped I/O.  What about:

"Reference the new section.  Move documentation of PROT_* flags there."

>
> diff --git a/manual/llio.texi b/manual/llio.texi
> index 10ad546723..e635c0eaa8 100644
> --- a/manual/llio.texi
> +++ b/manual/llio.texi
> @@ -1411,19 +1411,11 @@ is created, which is not removed by closing the file.
>  address is automatically removed.  The address you give may still be
>  changed, unless you use the @code{MAP_FIXED} flag.
>
> -@vindex PROT_READ
> -@vindex PROT_WRITE
> -@vindex PROT_EXEC
>  @var{protect} contains flags that control what kind of access is
>  permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
> -@code{PROT_EXEC}, which permit reading, writing, and execution,
> -respectively.  Inappropriate access will cause a segfault (@pxref{Program
> -Error Signals}).
> -
> -Note that most hardware designs cannot support write permission without
> -read permission, and many do not distinguish read and execute permission.
> -Thus, you may receive wider permissions than you ask for, and mappings of
> -write-only files may be denied even if you do not use @code{PROT_READ}.
> +@code{PROT_EXEC}.  The special flag @code{PROT_NONE} reserve a region of

"reserves"

> +address space for future use.  The @code{mprotect} function can be used
> +to change the protection flags.  @xref{Memory Protection}.
>
>  @var{flags} contains flags that control the nature of the map.
>  One of @code{MAP_SHARED} or @code{MAP_PRIVATE} must be specified.
> diff --git a/manual/memory.texi b/manual/memory.texi
> index 51a5f4e83c..9a9340cace 100644
> --- a/manual/memory.texi
> +++ b/manual/memory.texi
> @@ -17,6 +17,7 @@ and allocation of real memory.
>  * Memory Concepts::             An introduction to concepts and terminology.
>  * Memory Allocation::           Allocating storage for your program data
>  * Resizing the Data Segment::   @code{brk}, @code{sbrk}
> +* Memory Protection::      	Controlling access to memory regions.

Looks like a tab is throwing off the alignment.

>  * Locking Pages::               Preventing page faults
>  @end menu
>
> @@ -3047,7 +3048,120 @@ of the data segment is.
>
>  @end deftypefun
>
> +@node Memory Protection
> +@section Memory Protection
> +@cindex memory protection
> +@cindex page protection
> +@cindex protection flags
>
> +When a page is mapped using @code{mmap}, page protection flags can be
> +specified using the protection flags argument.  @xref{Memory-mapped
> +I/O}.
> +
> +The following flags are availabe.

"available:"

> +
> +@table @code
> +@item PROT_WRITE
> +@vindex PROT_WRITE

Make this an @vtable and drop the @vindex entries (@items in an @vtable
are automatically @vindex-ed).  The @items should then have @standards
beneath them with the relevant standard and header, which will also add
them to the Summary of Library Facilities.

> +The memory can be written to.
> +
> +@item PROT_READ
> +@vindex PROT_READ
> +The memory can be read.  On some architectures, this flag implies that

A personal preference, but I would drop "that" as superfluous.

> +the memory can be executed as well (as if @code{PROT_EXEC} had been
> +specified at the same time).

This seems a fair warning, but how would one know if their architecture
was affected?  Is there a good general method, other than simply trying
to execute something?

> +
> +@item PROT_EXEC
> +@vindex PROT_EXEC
> +The memory can be used store instructions which can then be executed.

"used to store"

> +On most architectures, this flag implies that the memory can be read (as

Similarly for "that" here, as above.

> +if @code{PROT_READ} had been specified).

The use of "most" with the READ/EXEC caveat here contrasts with the use
of "some" above, in a good way, if that was intentional.  I think the
relationship here is more intuitive, anyway, which the language seems to
suggest.  If there is a good way to tell, though, it would be nice to be
complete and provide some direction here as well, or maybe for both
after the @table.

> +

Would an @cindex anonymous mappings' make sense here?  (We need one
above MAP_ANON, but that's beyond the scope of this patch.)

> +@item PROT_NONE
> +@vindex PROT_NONE
> +This flag must be specified on its own.
> +
> +The memory is reserved, but cannot be read, written, or executed.  If
> +this flag is specified in a call to @code{mmap}, a virtual memory area
> +will be set aside for future use in the process, and @code{mmap} calls
> +without the @code{MAP_FIXED} flag will not use it for subsequent
> +allocations.  For anonymous mappings, the kernel will not reserve any
> +physical memory for the allocation, though.
> +@end table
> +
> +The operating system may keep of these flags separately even if the

"keep track"

> +underlying hardware treats them the same for the purposes of access
> +checking (as it happens with @code{PROT_READ} and @code{PROT_EXEC} on

I would just say, "as happens".

> +some platforms).
> +
> +Inappropriate access will cause a segfault (@pxref{Program Error
> +Signals}).
> +
> +After allocation, protection flags can be changed using the
> +@code{mprotect} function.
> +
> +@deftypefun int mprotect (void *@var{address}, size_t @var{length}, int @var{protection})
> +@standards{POSIX, sys/mman.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +
> +A successful call to the @code{mprotect} function changes the protection
> +flags of at least @var{length} bytes of memory, starting at
> +@var{address}.
> +
> +@var{address} must be aligned to the page size for the mapping.  The
> +system page size can be obtained using @samp{sysconf (_SC_PAGE_SIZE)}.

"_SC_PAGESIZE"

I believe we avoid function calls in sentences, so this should either be
reworded or made into an @[small]example.  Rewording might make it
easier to more tightly bind the reference with an @pxref.  Perhaps:

"The system page size can be obtained by calling @code{sysconf} with the
@code{_SC_PAGESIZE} parameter (@pxref{Sysconf Definition})."

> +@xref{Sysconf Definition}.  The system page size is the granularity in
> +which the page protection of anonymous memory mappings and most file
> +mappings can be changed.  Memory which is mapped from special files or
> +devices may have larger page granularity than the system page size and
> +may require larger alignment.
> +
> +@var{length} is the number of bytes whose protection flags must be
> +changed.  It is automatically rounded up to the next multiple of the
> +system page size.
> +
> +@var{protection} is a combination of the @code{PROT_}* described above.

"the @code{PROT_*} flags described above."  (Two changes: the addition
of "flags", and the manual puts the wildcard within the brace.)

Regarding the @table below, the manual has ingrained in me the
traditional introductory phrase:

"The following @code{errno} error conditions are defined for this function:"

> +
> +@table @code
> +@item ENOMEM
> +The system was not able to allocate resources to fulfill the request.
> +This can happen if there is not enough physical memory in the system for
> +the allocation of backing storage (if the memory was initially mapped
> +with @code{PROT_NONE}).  The error can also occur if the new protection

Will calling mprotect always cause backing storage to be allocated if it
wasn't already?  If so, that might be mentioned above.

Also, the PROT_NONE description says virtual memory is reserved, but the
use of "though" in the sentence about anonymous mappings makes it sound
like this error would only be encountered if PROT_NONE was used with
MAP_ANON, so it might be good to clarify whether the use of PROT_NONE
without MAP_ANON will, won't, or may allocate backing storage.

> +flags would cause the memory region to be split from its neighbors, and
> +the process limit for the number of such distinct memory regions would
> +be exceeded.

Is there a way to know that limit, and the process' current count?  May
be TMI, but these kinds of warnings always make me wonder.

> +
> +@item EINVAL
> +@var{address} is not properly aligned to a page boundary for the
> +mapping, or @var{length} (after rounding up to the system page size) is
> +not a multiple of the applicable page size for the mapping, or the
> +combination of flags in @var{protection} is not valid.
> +
> +@item EACCES
> +The file for a file-based mapping was not opened with flags which match
> +@var{protection}.

Isn't mprotect supposed to set the flags to protection?  Is there
somewhere we can reference what is special about file-based mappings
such that they can prevent the use of mprotect?  How is this different
from EPERM, below?  The content below the @deftypefun makes it sound
like this might only be returned by some systems.

> +
> +@item EPERM
> +The system security policy does not allow a mapping with the specified
> +flags.  For example, mappings which are both @code{PROT_EXEC} and
> +@code{PROT_WRITE} at the same time might not be allowed.
> +@end table
> +@end deftypefun
> +
> +If the @code{mprotect} function is used to make a region of memory
> +inaccessible by specifying the @code{PROT_NONE} protection flag and
> +access is later restored, the memory retains its previous contents.
> +
> +On some systems, it may not be possible to specify additional flags
> +which were not present when the mapping was first created.  For example,
> +an attempt make a region of memory executable could fail if the initial

"attempt to make"

> +protection flags where @samp{PROT_READ | PROT_WRITE}.

"were"

> +
> +In general, the @code{mprotect} function can be used to change any
> +process memory, no matter how it was allocated.  However, portable use
> +of the function requires that it is only used with memory regions
> +returned by @code{mmap} or @code{mmap64}.
>
>  @node Locking Pages
>  @section Locking Pages
>

I think this is quality documentation.  Thank you.

Rical

Florian Weimer Nov. 15, 2017, 1:40 p.m. | #2
On 11/15/2017 07:44 AM, Rical Jasan wrote:
> On 11/14/2017 06:26 AM, Florian Weimer wrote:
>> 2017-11-14  Florian Weimer  <fweimer@redhat.com>
>>
>> 	manual: Document mprotect
>> 	* manual/memory.texi (Memory Protection): New section.
>> 	* manual/llio.texi (Memory-mapped I/O): Reference this section.
>> 	Remove duplicate documentation of PROT_* flags.
>
> "this section" sounds like it refers to Memory-mapped I/O.  What about:
>
> "Reference the new section.  Move documentation of PROT_* flags there."

I've replaced it with the section name.

>> -Note that most hardware designs cannot support write permission without
>> -read permission, and many do not distinguish read and execute permission.
>> -Thus, you may receive wider permissions than you ask for, and mappings of
>> -write-only files may be denied even if you do not use @code{PROT_READ}.
>> +@code{PROT_EXEC}.  The special flag @code{PROT_NONE} reserve a region of
>
> "reserves"

Thanks, fixed.

>> --- a/manual/memory.texi
>> +++ b/manual/memory.texi
>> @@ -17,6 +17,7 @@ and allocation of real memory.
>>   * Memory Concepts::             An introduction to concepts and terminology.
>>   * Memory Allocation::           Allocating storage for your program data
>>   * Resizing the Data Segment::   @code{brk}, @code{sbrk}
>> +* Memory Protection::      	Controlling access to memory regions.
>
> Looks like a tab is throwing off the alignment.

Interesting.  Not sure how this happened.  I switched to spaces, like
the surrounding entries.

>> +The following flags are availabe.
>
> "available:"

Fixed.

>> +@table @code
>> +@item PROT_WRITE
>> +@vindex PROT_WRITE
>
> Make this an @vtable and drop the @vindex entries (@items in an @vtable
> are automatically @vindex-ed).  The @items should then have @standards
> beneath them with the relevant standard and header, which will also add
> them to the Summary of Library Facilities.

Nice!

>> +The memory can be written to.
>> +
>> +@item PROT_READ
>> +@vindex PROT_READ
>> +The memory can be read.  On some architectures, this flag implies that
>
> A personal preference, but I would drop "that" as superfluous.

I find it easier to read because it indicates that the following nominal
phrase belongs to a subclause.

>> +the memory can be executed as well (as if @code{PROT_EXEC} had been
>> +specified at the same time).
>
> This seems a fair warning, but how would one know if their architecture
> was affected?  Is there a good general method, other than simply trying
> to execute something?

There is no general mechanism to figure out what the system will do if
execution is attempted.  Linux will show the selected protection flags
in /proc/self/maps even if the hardware does not enforce them.  There's
a funny story everyone assumed that s390x had non-executable
data/stack/heap because it shows up in /proc, and the test which
attempted to execute run-time generated code always crashed.
Eventually, it turned out that the test crashed no matter what (it used
sizeof (func) to compute the size of a function, which is always 1, a
GCC extension). 8-/

For device memory, all kinds of funny things can happen.  With memory
protection keys, the situation is even more murky because system may use
reserved keys to implement otherwise unsupported protection flag
combinations, so you get different baseline (non-key mediated) access
privileges if you specify a key explicitly using pkey_mprotect.

>> +
>> +@item PROT_EXEC
>> +@vindex PROT_EXEC
>> +The memory can be used store instructions which can then be executed.
>
> "used to store"
>
>> +On most architectures, this flag implies that the memory can be read (as
>
> Similarly for "that" here, as above.
>
>> +if @code{PROT_READ} had been specified).
>
> The use of "most" with the READ/EXEC caveat here contrasts with the use
> of "some" above, in a good way, if that was intentional.

It is.  8-)

> I think the
> relationship here is more intuitive, anyway, which the language seems to
> suggest.  If there is a good way to tell, though, it would be nice to be
> complete and provide some direction here as well, or maybe for both
> after the @table.

A fuller explanation needs to reference the (currently undocumented)
personality function.  We could say that on GNU, PROT_EXEC always
implies PROT_READ, as a matter of policy.

>> +
>
> Would an @cindex anonymous mappings' make sense here?  (We need one
> above MAP_ANON, but that's beyond the scope of this patch.)
>
>> +@item PROT_NONE
>> +@vindex PROT_NONE
>> +This flag must be specified on its own.
>> +
>> +The memory is reserved, but cannot be read, written, or executed.  If
>> +this flag is specified in a call to @code{mmap}, a virtual memory area
>> +will be set aside for future use in the process, and @code{mmap} calls
>> +without the @code{MAP_FIXED} flag will not use it for subsequent
>> +allocations.  For anonymous mappings, the kernel will not reserve any
>> +physical memory for the allocation, though.
>> +@end table

I'm changing the last sentence to:

“For anonymous mappings, the kernel will not reserve any
physical memory for the allocation at the time the mapping is created.”

>> +The operating system may keep of these flags separately even if the
>
> "keep track"

Fixed.

>> +underlying hardware treats them the same for the purposes of access
>> +checking (as it happens with @code{PROT_READ} and @code{PROT_EXEC} on
>
> I would just say, "as happens".

Okay.

>> +some platforms).
>> +
>> +Inappropriate access will cause a segfault (@pxref{Program Error
>> +Signals}).
>> +
>> +After allocation, protection flags can be changed using the
>> +@code{mprotect} function.
>> +
>> +@deftypefun int mprotect (void *@var{address}, size_t @var{length}, int @var{protection})
>> +@standards{POSIX, sys/mman.h}
>> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>> +
>> +A successful call to the @code{mprotect} function changes the protection
>> +flags of at least @var{length} bytes of memory, starting at
>> +@var{address}.
>> +
>> +@var{address} must be aligned to the page size for the mapping.  The
>> +system page size can be obtained using @samp{sysconf (_SC_PAGE_SIZE)}.
>
> "_SC_PAGESIZE"

Oops.

> I believe we avoid function calls in sentences, so this should either be
> reworded or made into an @[small]example.  Rewording might make it
> easier to more tightly bind the reference with an @pxref.  Perhaps:
>
> "The system page size can be obtained by calling @code{sysconf} with the
> @code{_SC_PAGESIZE} parameter (@pxref{Sysconf Definition})."

I've adopted this suggestions.

>> +@xref{Sysconf Definition}.  The system page size is the granularity in
>> +which the page protection of anonymous memory mappings and most file
>> +mappings can be changed.  Memory which is mapped from special files or
>> +devices may have larger page granularity than the system page size and
>> +may require larger alignment.
>> +
>> +@var{length} is the number of bytes whose protection flags must be
>> +changed.  It is automatically rounded up to the next multiple of the
>> +system page size.
>> +
>> +@var{protection} is a combination of the @code{PROT_}* described above.
>
> "the @code{PROT_*} flags described above."  (Two changes: the addition
> of "flags", and the manual puts the wildcard within the brace.)

Good.

> Regarding the @table below, the manual has ingrained in me the
> traditional introductory phrase:
>
> "The following @code{errno} error conditions are defined for this function:"

Right, I missed that.

>> +@table @code
>> +@item ENOMEM
>> +The system was not able to allocate resources to fulfill the request.
>> +This can happen if there is not enough physical memory in the system for
>> +the allocation of backing storage (if the memory was initially mapped
>> +with @code{PROT_NONE}).  The error can also occur if the new protection
>
> Will calling mprotect always cause backing storage to be allocated if it
> wasn't already?  If so, that might be mentioned above.

That depends on the overrcommit configuration of the system.  The Linux
default is not to reserve backing store, but ridiculously large requests
will still fail (even before hitting architectural virtual memory limits).

> Also, the PROT_NONE description says virtual memory is reserved, but the
> use of "though" in the sentence about anonymous mappings makes it sound
> like this error would only be encountered if PROT_NONE was used with
> MAP_ANON, so it might be good to clarify whether the use of PROT_NONE
> without MAP_ANON will, won't, or may allocate backing storage.

I tweaked the PROT_NONE documentation, as explained above.

The full story is complex.  For regular file-based memory and a
MAP_SHARED mapping or a mapping which is never written to, there is
backing store.  But if you map a file from tmpfs, Linux will by default
not allocate physical memory (again depending on the overcommit
configuration).

>> +flags would cause the memory region to be split from its neighbors, and
>> +the process limit for the number of such distinct memory regions would
>> +be exceeded.
>
> Is there a way to know that limit, and the process' current count?  May
> be TMI, but these kinds of warnings always make me wonder.

On Linux, you can read it from /proc, but it's difficult to predict when
splitting is needed because it depends on arch-specific memory
management aspects.

>> +
>> +@item EINVAL
>> +@var{address} is not properly aligned to a page boundary for the
>> +mapping, or @var{length} (after rounding up to the system page size) is
>> +not a multiple of the applicable page size for the mapping, or the
>> +combination of flags in @var{protection} is not valid.
>> +
>> +@item EACCES
>> +The file for a file-based mapping was not opened with flags which match
>> +@var{protection}.
>
> Isn't mprotect supposed to set the flags to protection?

Right, the wording isn't correct.

“
The file for a file-based mapping was not opened with open flags which
are compatible with @var{protection}.
“”

> Is there
> somewhere we can reference what is special about file-based mappings
> such that they can prevent the use of mprotect?  How is this different
> from EPERM, below?  The content below the @deftypefun makes it sound
> like this might only be returned by some systems.

Maybe we should just say “non-anonymous mappings“ instead?

You can map device memory using files under /dev and /proc, even
starting with regular files, and strange things can happen there.  These
are technically file-based mappings because that's how you create them,
but they are very different from what files would behave like.

EACCES vs EPERM is a bit tricky.  On Linux, EACCES generally refers to
the permission bits/ACLs on the objects directly involved, and EPERM is
more about policy violations and exceeded limits.  Many library
functions can return both, and I'm not sure this is the right place to
elaborate on the difference.

>> +@item EPERM
>> +The system security policy does not allow a mapping with the specified
>> +flags.  For example, mappings which are both @code{PROT_EXEC} and
>> +@code{PROT_WRITE} at the same time might not be allowed.
>> +@end table
>> +@end deftypefun
>> +
>> +If the @code{mprotect} function is used to make a region of memory
>> +inaccessible by specifying the @code{PROT_NONE} protection flag and
>> +access is later restored, the memory retains its previous contents.
>> +
>> +On some systems, it may not be possible to specify additional flags
>> +which were not present when the mapping was first created.  For example,
>> +an attempt make a region of memory executable could fail if the initial
>
> "attempt to make"
>
>> +protection flags where @samp{PROT_READ | PROT_WRITE}.
>
> "were"

Fixed.

>> +In general, the @code{mprotect} function can be used to change any
>> +process memory, no matter how it was allocated.  However, portable use
>> +of the function requires that it is only used with memory regions
>> +returned by @code{mmap} or @code{mmap64}.
>>
>>   @node Locking Pages
>>   @section Locking Pages
>>
>
> I think this is quality documentation.  Thank you.

Thanks a lot for your review!

I'm attaching the updated patch.

Florian
2017-11-14  Florian Weimer  <fweimer@redhat.com>

manual: Document mprotect
* manual/memory.texi (Memory Protection): New section.
* manual/llio.texi (Memory-mapped I/O): Remove duplicate
documentation of PROT_* flags and reference the Memory Protection
section instead.

diff --git a/manual/llio.texi b/manual/llio.texi
index 10ad546723..cbcd2221f6 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -1411,19 +1411,11 @@ is created, which is not removed by closing the file.
address is automatically removed.  The address you give may still be
changed, unless you use the @code{MAP_FIXED} flag.

-@vindex PROT_READ
-@vindex PROT_WRITE
-@vindex PROT_EXEC
@var{protect} contains flags that control what kind of access is
permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
-@code{PROT_EXEC}, which permit reading, writing, and execution,
-respectively.  Inappropriate access will cause a segfault (@pxref{Program
-Error Signals}).
-
-Note that most hardware designs cannot support write permission without
-read permission, and many do not distinguish read and execute permission.
-Thus, you may receive wider permissions than you ask for, and mappings of
-write-only files may be denied even if you do not use @code{PROT_READ}.
+@code{PROT_EXEC}.  The special flag @code{PROT_NONE} reserves a region
+of address space for future use.  The @code{mprotect} function can be
+used to change the protection flags.  @xref{Memory Protection}.

@var{flags} contains flags that control the nature of the map.
One of @code{MAP_SHARED} or @code{MAP_PRIVATE} must be specified.
diff --git a/manual/memory.texi b/manual/memory.texi
index 51a5f4e83c..dd6ac37c10 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -17,6 +17,7 @@ and allocation of real memory.
* Memory Concepts::             An introduction to concepts and terminology.
* Memory Allocation::           Allocating storage for your program data
* Resizing the Data Segment::   @code{brk}, @code{sbrk}
+* Memory Protection::           Controlling access to memory regions.
* Locking Pages::               Preventing page faults
@end menu

@@ -3047,7 +3048,126 @@ of the data segment is.

@end deftypefun

+@node Memory Protection
+@section Memory Protection
+@cindex memory protection
+@cindex page protection
+@cindex protection flags

+When a page is mapped using @code{mmap}, page protection flags can be
+specified using the protection flags argument.  @xref{Memory-mapped
+I/O}.
+
+The following flags are available:
+
+@vtable @code
+@item PROT_WRITE
+@standards{POSIX, sys/mman.h}
+The memory can be written to.
+
+@item PROT_READ
+@standards{POSIX, sys/mman.h}
+The memory can be read.  On some architectures, this flag implies that
+the memory can be executed as well (as if @code{PROT_EXEC} had been
+specified at the same time).
+
+@item PROT_EXEC
+@standards{POSIX, sys/mman.h}
+The memory can be used to store instructions which can then be executed.
+On most architectures, this flag implies that the memory can be read (as
+if @code{PROT_READ} had been specified).
+
+@item PROT_NONE
+@standards{POSIX, sys/mman.h}
+This flag must be specified on its own.
+
+The memory is reserved, but cannot be read, written, or executed.  If
+this flag is specified in a call to @code{mmap}, a virtual memory area
+will be set aside for future use in the process, and @code{mmap} calls
+without the @code{MAP_FIXED} flag will not use it for subsequent
+allocations.  For anonymous mappings, the kernel will not reserve any
+physical memory for the allocation at the time the mapping is created.
+@end vtable
+
+The operating system may keep track of these flags separately even if
+the underlying hardware treats them the same for the purposes of access
+checking (as happens with @code{PROT_READ} and @code{PROT_EXEC} on some
+platforms).  On GNU systems, @code{PROT_EXEC} always implies
+@code{PROT_READ}, so that users can view the machine code which is
+executing on their system.
+
+Inappropriate access will cause a segfault (@pxref{Program Error
+Signals}).
+
+After allocation, protection flags can be changed using the
+@code{mprotect} function.
+
+@deftypefun int mprotect (void *@var{address}, size_t @var{length}, int @var{protection})
+@standards{POSIX, sys/mman.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+
+A successful call to the @code{mprotect} function changes the protection
+flags of at least @var{length} bytes of memory, starting at
+@var{address}.
+
+@var{address} must be aligned to the page size for the mapping.  The
+system page size can be obtained by calling @code{sysconf} with the
+@code{_SC_PAGESIZE} parameter (@pxref{Sysconf Definition}).  The system
+page size is the granularity in which the page protection of anonymous
+memory mappings and most file mappings can be changed.  Memory which is
+mapped from special files or devices may have larger page granularity
+than the system page size and may require larger alignment.
+
+@var{length} is the number of bytes whose protection flags must be
+changed.  It is automatically rounded up to the next multiple of the
+system page size.
+
+@var{protection} is a combination of the @code{PROT_*} flags described
+above.
+
+The following @code{errno} error conditions are defined for this
+function:
+
+@table @code
+@item ENOMEM
+The system was not able to allocate resources to fulfill the request.
+This can happen if there is not enough physical memory in the system for
+the allocation of backing storage (if the memory was initially mapped
+with @code{PROT_NONE}).  The error can also occur if the new protection
+flags would cause the memory region to be split from its neighbors, and
+the process limit for the number of such distinct memory regions would
+be exceeded.
+
+@item EINVAL
+@var{address} is not properly aligned to a page boundary for the
+mapping, or @var{length} (after rounding up to the system page size) is
+not a multiple of the applicable page size for the mapping, or the
+combination of flags in @var{protection} is not valid.
+
+@item EACCES
+The file for a file-based mapping was not opened with open flags which
+are compatible with @var{protection}.
+
+@item EPERM
+The system security policy does not allow a mapping with the specified
+flags.  For example, mappings which are both @code{PROT_EXEC} and
+@code{PROT_WRITE} at the same time might not be allowed.
+@end table
+@end deftypefun
+
+If the @code{mprotect} function is used to make a region of memory
+inaccessible by specifying the @code{PROT_NONE} protection flag and
+access is later restored, the memory retains its previous contents.
+
+On some systems, it may not be possible to specify additional flags
+which were not present when the mapping was first created.  For example,
+an attempt to make a region of memory executable could fail if the
+initial protection flags were @samp{PROT_READ | PROT_WRITE}.
+
+In general, the @code{mprotect} function can be used to change any
+process memory, no matter how it was allocated.  However, portable use
+of the function requires that it is only used with memory regions
+returned by @code{mmap} or @code{mmap64}.

@node Locking Pages
@section Locking Pages

Rical Jasan Nov. 16, 2017, 12:05 p.m. | #3
On 11/15/2017 05:40 AM, Florian Weimer wrote:
> On 11/15/2017 07:44 AM, Rical Jasan wrote:
>> On 11/14/2017 06:26 AM, Florian Weimer wrote:
>>> 2017-11-14  Florian Weimer  <fweimer@redhat.com>
>>>
>>>     manual: Document mprotect
>>>     * manual/memory.texi (Memory Protection): New section.
>>>     * manual/llio.texi (Memory-mapped I/O): Reference this section.
>>>     Remove duplicate documentation of PROT_* flags.
>>
>> "this section" sounds like it refers to Memory-mapped I/O.  What about:
>>
>> "Reference the new section.  Move documentation of PROT_* flags there."
>
> I've replaced it with the section name.
>
>>> -Note that most hardware designs cannot support write permission without
>>> -read permission, and many do not distinguish read and execute permission.
>>>
>>> -Thus, you may receive wider permissions than you ask for, and mappings of
>>>
>>> -write-only files may be denied even if you do not use @code{PROT_READ}.
>>> +@code{PROT_EXEC}.  The special flag @code{PROT_NONE} reserve a region of
>>>
>>
>> "reserves"
>
> Thanks, fixed.
>
>>> --- a/manual/memory.texi
>>> +++ b/manual/memory.texi
>>> @@ -17,6 +17,7 @@ and allocation of real memory.
>>>   * Memory Concepts::             An introduction to concepts and terminology.
>>>
>>>   * Memory Allocation::           Allocating storage for your program data
>>>
>>>   * Resizing the Data Segment::   @code{brk}, @code{sbrk}
>>> +* Memory Protection::          Controlling access to memory regions.
>>
>> Looks like a tab is throwing off the alignment.
>
> Interesting.  Not sure how this happened.  I switched to spaces, like
> the surrounding entries.
>
>>> +The following flags are availabe.
>>
>> "available:"
>
> Fixed.
>
>>> +@table @code
>>> +@item PROT_WRITE
>>> +@vindex PROT_WRITE
>>
>> Make this an @vtable and drop the @vindex entries (@items in an @vtable
>> are automatically @vindex-ed).  The @items should then have @standards
>> beneath them with the relevant standard and header, which will also add
>> them to the Summary of Library Facilities.
>
> Nice!
>
>>> +The memory can be written to.
>>> +
>>> +@item PROT_READ
>>> +@vindex PROT_READ
>>> +The memory can be read.  On some architectures, this flag implies that
>>
>> A personal preference, but I would drop "that" as superfluous.
>
> I find it easier to read because it indicates that the following nominal
> phrase belongs to a subclause.
>
>>> +the memory can be executed as well (as if @code{PROT_EXEC} had been
>>> +specified at the same time).
>>
>> This seems a fair warning, but how would one know if their architecture
>> was affected?  Is there a good general method, other than simply trying
>> to execute something?
>
> There is no general mechanism to figure out what the system will do if
> execution is attempted.  Linux will show the selected protection flags
> in /proc/self/maps even if the hardware does not enforce them.  There's
> a funny story everyone assumed that s390x had non-executable
> data/stack/heap because it shows up in /proc, and the test which
> attempted to execute run-time generated code always crashed. Eventually,
> it turned out that the test crashed no matter what (it used sizeof
> (func) to compute the size of a function, which is always 1, a
> GCC extension). 8-/
>
> For device memory, all kinds of funny things can happen.  With memory
> protection keys, the situation is even more murky because system may use
> reserved keys to implement otherwise unsupported protection flag
> combinations, so you get different baseline (non-key mediated) access
> privileges if you specify a key explicitly using pkey_mprotect.
>
>>> +
>>> +@item PROT_EXEC
>>> +@vindex PROT_EXEC
>>> +The memory can be used store instructions which can then be executed.
>>
>> "used to store"
>>
>>> +On most architectures, this flag implies that the memory can be read (as
>>>
>>
>> Similarly for "that" here, as above.
>>
>>> +if @code{PROT_READ} had been specified).
>>
>> The use of "most" with the READ/EXEC caveat here contrasts with the use
>> of "some" above, in a good way, if that was intentional.
>
> It is.  8-)
>
>> I think the
>> relationship here is more intuitive, anyway, which the language seems to
>> suggest.  If there is a good way to tell, though, it would be nice to be
>> complete and provide some direction here as well, or maybe for both
>> after the @table.
>
> A fuller explanation needs to reference the (currently undocumented)
> personality function.  We could say that on GNU, PROT_EXEC always
> implies PROT_READ, as a matter of policy.>
>>> +
>>
>> Would an @cindex anonymous mappings' make sense here?  (We need one
>> above MAP_ANON, but that's beyond the scope of this patch.)
>>
>>> +@item PROT_NONE
>>> +@vindex PROT_NONE
>>> +This flag must be specified on its own.
>>> +
>>> +The memory is reserved, but cannot be read, written, or executed.  If
>>> +this flag is specified in a call to @code{mmap}, a virtual memory area
>>> +will be set aside for future use in the process, and @code{mmap} calls
>>> +without the @code{MAP_FIXED} flag will not use it for subsequent
>>> +allocations.  For anonymous mappings, the kernel will not reserve any
>>> +physical memory for the allocation, though.
>>> +@end table
>
> I'm changing the last sentence to:
>
> “For anonymous mappings, the kernel will not reserve any
> physical memory for the allocation at the time the mapping is created.”
>
>>> +The operating system may keep of these flags separately even if the
>>
>> "keep track"
>
> Fixed.
>
>>> +underlying hardware treats them the same for the purposes of access
>>> +checking (as it happens with @code{PROT_READ} and @code{PROT_EXEC} on
>>
>> I would just say, "as happens".
>
> Okay.
>
>>> +some platforms).
>>> +
>>> +Inappropriate access will cause a segfault (@pxref{Program Error
>>> +Signals}).
>>> +
>>> +After allocation, protection flags can be changed using the
>>> +@code{mprotect} function.
>>> +
>>> +@deftypefun int mprotect (void *@var{address}, size_t @var{length}, int @var{protection})
>>>
>>> +@standards{POSIX, sys/mman.h}
>>> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
>>> +
>>> +A successful call to the @code{mprotect} function changes the protection
>>>
>>> +flags of at least @var{length} bytes of memory, starting at
>>> +@var{address}.
>>> +
>>> +@var{address} must be aligned to the page size for the mapping.  The
>>> +system page size can be obtained using @samp{sysconf (_SC_PAGE_SIZE)}.
>>
>> "_SC_PAGESIZE"
>
> Oops.
>
>> I believe we avoid function calls in sentences, so this should either be
>> reworded or made into an @[small]example.  Rewording might make it
>> easier to more tightly bind the reference with an @pxref.  Perhaps:
>>
>> "The system page size can be obtained by calling @code{sysconf} with the
>> @code{_SC_PAGESIZE} parameter (@pxref{Sysconf Definition})."
>
> I've adopted this suggestions.
>
>>> +@xref{Sysconf Definition}.  The system page size is the granularity in
>>> +which the page protection of anonymous memory mappings and most file
>>> +mappings can be changed.  Memory which is mapped from special files or
>>> +devices may have larger page granularity than the system page size and
>>> +may require larger alignment.
>>> +
>>> +@var{length} is the number of bytes whose protection flags must be
>>> +changed.  It is automatically rounded up to the next multiple of the
>>> +system page size.
>>> +
>>> +@var{protection} is a combination of the @code{PROT_}* described above.
>>
>> "the @code{PROT_*} flags described above."  (Two changes: the addition
>> of "flags", and the manual puts the wildcard within the brace.)
>
> Good.
>
>> Regarding the @table below, the manual has ingrained in me the
>> traditional introductory phrase:
>>
>> "The following @code{errno} error conditions are defined for this function:"
>>
>
> Right, I missed that.
>
>>> +@table @code
>>> +@item ENOMEM
>>> +The system was not able to allocate resources to fulfill the request.
>>> +This can happen if there is not enough physical memory in the system for
>>>
>>> +the allocation of backing storage (if the memory was initially mapped
>>> +with @code{PROT_NONE}).  The error can also occur if the new protection
>>
>> Will calling mprotect always cause backing storage to be allocated if it
>> wasn't already?  If so, that might be mentioned above.
>
> That depends on the overrcommit configuration of the system.  The Linux
> default is not to reserve backing store, but ridiculously large requests
> will still fail (even before hitting architectural virtual memory limits).
>
>> Also, the PROT_NONE description says virtual memory is reserved, but the
>> use of "though" in the sentence about anonymous mappings makes it sound
>> like this error would only be encountered if PROT_NONE was used with
>> MAP_ANON, so it might be good to clarify whether the use of PROT_NONE
>> without MAP_ANON will, won't, or may allocate backing storage.
>
> I tweaked the PROT_NONE documentation, as explained above.
>
> The full story is complex.  For regular file-based memory and a
> MAP_SHARED mapping or a mapping which is never written to, there is
> backing store.  But if you map a file from tmpfs, Linux will by default
> not allocate physical memory (again depending on the overcommit
> configuration).
>
>>> +flags would cause the memory region to be split from its neighbors, and
>>> +the process limit for the number of such distinct memory regions would
>>> +be exceeded.
>>
>> Is there a way to know that limit, and the process' current count?  May
>> be TMI, but these kinds of warnings always make me wonder.
>
> On Linux, you can read it from /proc, but it's difficult to predict when
> splitting is needed because it depends on arch-specific memory
> management aspects.
>
>>> +
>>> +@item EINVAL
>>> +@var{address} is not properly aligned to a page boundary for the
>>> +mapping, or @var{length} (after rounding up to the system page size) is
>>> +not a multiple of the applicable page size for the mapping, or the
>>> +combination of flags in @var{protection} is not valid.
>>> +
>>> +@item EACCES
>>> +The file for a file-based mapping was not opened with flags which match
>>> +@var{protection}.
>>
>> Isn't mprotect supposed to set the flags to protection?
>
> Right, the wording isn't correct.
>
> “
> The file for a file-based mapping was not opened with open flags which
> are compatible with @var{protection}.
> “”
>
>> Is there
>> somewhere we can reference what is special about file-based mappings
>> such that they can prevent the use of mprotect?  How is this different
>> from EPERM, below?  The content below the @deftypefun makes it sound
>> like this might only be returned by some systems.
>
> Maybe we should just say “non-anonymous mappings“ instead?
>
> You can map device memory using files under /dev and /proc, even
> starting with regular files, and strange things can happen there.  These
> are technically file-based mappings because that's how you create them,
> but they are very different from what files would behave like.

Given the is-or-isn't nature of "[non-]anonymous", that could work.  If
"file-based" is equivalent to "non-anonymous" and we stuck with the
latter, that'd be fine, but my concern is the (lack of) introduction of
these terms.  "Anonymous mapping" only appears once in the manual, isn't
introduced with an @dfn, and doesn't have an @cindex to find it by; and
this is the first occurrence of "file-based mapping".

I can submit a patch that introduces and indexes the anonymous mapping
concept where it's mentioned in llio.texi, but I don't know where we
would provide a similarly cursory mention of file-based mappings.
Ultimately, I'd rather see this patch go in than get stalled on my own
unfamiliarity with the vernacular, and if "file-based" is how you as the
author would naturally refer to it, that's probably best.

>
> EACCES vs EPERM is a bit tricky.  On Linux, EACCES generally refers to
> the permission bits/ACLs on the objects directly involved, and EPERM is
> more about policy violations and exceeded limits.  Many library
> functions can return both, and I'm not sure this is the right place to
> elaborate on the difference.
>
>>> +@item EPERM
>>> +The system security policy does not allow a mapping with the specified
>>> +flags.  For example, mappings which are both @code{PROT_EXEC} and
>>> +@code{PROT_WRITE} at the same time might not be allowed.
>>> +@end table
>>> +@end deftypefun
>>> +
>>> +If the @code{mprotect} function is used to make a region of memory
>>> +inaccessible by specifying the @code{PROT_NONE} protection flag and
>>> +access is later restored, the memory retains its previous contents.
>>> +
>>> +On some systems, it may not be possible to specify additional flags
>>> +which were not present when the mapping was first created.  For example,
>>>
>>> +an attempt make a region of memory executable could fail if the initial
>>
>> "attempt to make"
>>
>>> +protection flags where @samp{PROT_READ | PROT_WRITE}.
>>
>> "were"
>
> Fixed.
>
>>> +In general, the @code{mprotect} function can be used to change any
>>> +process memory, no matter how it was allocated.  However, portable use
>>> +of the function requires that it is only used with memory regions
>>> +returned by @code{mmap} or @code{mmap64}.
>>>     @node Locking Pages
>>>   @section Locking Pages
>>>
>>
>> I think this is quality documentation.  Thank you.
>
> Thanks a lot for your review!
>
> I'm attaching the updated patch.
>
> Florian
>
> mprotect.patch
>
>
> 2017-11-14  Florian Weimer  <fweimer@redhat.com>
>
> 	manual: Document mprotect
> 	* manual/memory.texi (Memory Protection): New section.
> 	* manual/llio.texi (Memory-mapped I/O): Remove duplicate
> 	documentation of PROT_* flags and reference the Memory Protection
> 	section instead.
>
> diff --git a/manual/llio.texi b/manual/llio.texi
> index 10ad546723..cbcd2221f6 100644
> --- a/manual/llio.texi
> +++ b/manual/llio.texi
> @@ -1411,19 +1411,11 @@ is created, which is not removed by closing the file.
>  address is automatically removed.  The address you give may still be
>  changed, unless you use the @code{MAP_FIXED} flag.
>
> -@vindex PROT_READ
> -@vindex PROT_WRITE
> -@vindex PROT_EXEC
>  @var{protect} contains flags that control what kind of access is
>  permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
> -@code{PROT_EXEC}, which permit reading, writing, and execution,
> -respectively.  Inappropriate access will cause a segfault (@pxref{Program
> -Error Signals}).
> -
> -Note that most hardware designs cannot support write permission without
> -read permission, and many do not distinguish read and execute permission.
> -Thus, you may receive wider permissions than you ask for, and mappings of
> -write-only files may be denied even if you do not use @code{PROT_READ}.

Coming back to this now, is there a reason not to leave this paragraph in?

> +@code{PROT_EXEC}.  The special flag @code{PROT_NONE} reserves a region
> +of address space for future use.  The @code{mprotect} function can be
> +used to change the protection flags.  @xref{Memory Protection}.
>
>  @var{flags} contains flags that control the nature of the map.
>  One of @code{MAP_SHARED} or @code{MAP_PRIVATE} must be specified.
> diff --git a/manual/memory.texi b/manual/memory.texi
> index 51a5f4e83c..dd6ac37c10 100644
> --- a/manual/memory.texi
> +++ b/manual/memory.texi
> @@ -17,6 +17,7 @@ and allocation of real memory.
>  * Memory Concepts::             An introduction to concepts and terminology.
>  * Memory Allocation::           Allocating storage for your program data
>  * Resizing the Data Segment::   @code{brk}, @code{sbrk}
> +* Memory Protection::           Controlling access to memory regions.
>  * Locking Pages::               Preventing page faults
>  @end menu
>
> @@ -3047,7 +3048,126 @@ of the data segment is.
>
>  @end deftypefun
>
> +@node Memory Protection
> +@section Memory Protection
> +@cindex memory protection
> +@cindex page protection
> +@cindex protection flags
>
> +When a page is mapped using @code{mmap}, page protection flags can be
> +specified using the protection flags argument.  @xref{Memory-mapped
> +I/O}.
> +
> +The following flags are available:
> +
> +@vtable @code
> +@item PROT_WRITE
> +@standards{POSIX, sys/mman.h}

I skipped over this before since we haven't canonicalized standards yet,
but can you provide a specific POSIX version (e.g., POSIX.1-2001)?

> +The memory can be written to.
> +
> +@item PROT_READ
> +@standards{POSIX, sys/mman.h}
> +The memory can be read.  On some architectures, this flag implies that
> +the memory can be executed as well (as if @code{PROT_EXEC} had been
> +specified at the same time).
> +
> +@item PROT_EXEC
> +@standards{POSIX, sys/mman.h}
> +The memory can be used to store instructions which can then be executed.
> +On most architectures, this flag implies that the memory can be read (as
> +if @code{PROT_READ} had been specified).
> +
> +@item PROT_NONE
> +@standards{POSIX, sys/mman.h}
> +This flag must be specified on its own.
> +
> +The memory is reserved, but cannot be read, written, or executed.  If
> +this flag is specified in a call to @code{mmap}, a virtual memory area
> +will be set aside for future use in the process, and @code{mmap} calls
> +without the @code{MAP_FIXED} flag will not use it for subsequent
> +allocations.  For anonymous mappings, the kernel will not reserve any
> +physical memory for the allocation at the time the mapping is created.

Just to make sure I'm reading this correctly, if you do use PROT_NONE
and don't use MAP_ANON, the kernel may allocate the backing storage.

> +@end vtable
> +
> +The operating system may keep track of these flags separately even if
> +the underlying hardware treats them the same for the purposes of access
> +checking (as happens with @code{PROT_READ} and @code{PROT_EXEC} on some
> +platforms).  On GNU systems, @code{PROT_EXEC} always implies
> +@code{PROT_READ}, so that users can view the machine code which is
> +executing on their system.

+1

> +
> +Inappropriate access will cause a segfault (@pxref{Program Error
> +Signals}).
> +
> +After allocation, protection flags can be changed using the
> +@code{mprotect} function.
> +
> +@deftypefun int mprotect (void *@var{address}, size_t @var{length}, int @var{protection})
> +@standards{POSIX, sys/mman.h}
> +@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
> +
> +A successful call to the @code{mprotect} function changes the protection
> +flags of at least @var{length} bytes of memory, starting at
> +@var{address}.
> +
> +@var{address} must be aligned to the page size for the mapping.  The
> +system page size can be obtained by calling @code{sysconf} with the
> +@code{_SC_PAGESIZE} parameter (@pxref{Sysconf Definition}).  The system
> +page size is the granularity in which the page protection of anonymous
> +memory mappings and most file mappings can be changed.  Memory which is
> +mapped from special files or devices may have larger page granularity
> +than the system page size and may require larger alignment.
> +
> +@var{length} is the number of bytes whose protection flags must be
> +changed.  It is automatically rounded up to the next multiple of the
> +system page size.
> +
> +@var{protection} is a combination of the @code{PROT_*} flags described
> +above.
> +

The return values of the function are also missing:

"@code{mprotect} returns @code{0} on success or @code{-1} on error."

> +The following @code{errno} error conditions are defined for this
> +function:
> +
> +@table @code
> +@item ENOMEM
> +The system was not able to allocate resources to fulfill the request.
> +This can happen if there is not enough physical memory in the system for
> +the allocation of backing storage (if the memory was initially mapped
> +with @code{PROT_NONE}).  The error can also occur if the new protection

Can we simply drop the parenthetical "if" here?

> +flags would cause the memory region to be split from its neighbors, and
> +the process limit for the number of such distinct memory regions would
> +be exceeded.
> +
> +@item EINVAL
> +@var{address} is not properly aligned to a page boundary for the
> +mapping, or @var{length} (after rounding up to the system page size) is
> +not a multiple of the applicable page size for the mapping, or the
> +combination of flags in @var{protection} is not valid.
> +
> +@item EACCES
> +The file for a file-based mapping was not opened with open flags which
> +are compatible with @var{protection}.
> +
> +@item EPERM
> +The system security policy does not allow a mapping with the specified
> +flags.  For example, mappings which are both @code{PROT_EXEC} and
> +@code{PROT_WRITE} at the same time might not be allowed.
> +@end table
> +@end deftypefun
> +
> +If the @code{mprotect} function is used to make a region of memory
> +inaccessible by specifying the @code{PROT_NONE} protection flag and
> +access is later restored, the memory retains its previous contents.
> +
> +On some systems, it may not be possible to specify additional flags
> +which were not present when the mapping was first created.  For example,
> +an attempt to make a region of memory executable could fail if the
> +initial protection flags were @samp{PROT_READ | PROT_WRITE}.
> +
> +In general, the @code{mprotect} function can be used to change any
> +process memory, no matter how it was allocated.  However, portable use
> +of the function requires that it is only used with memory regions
> +returned by @code{mmap} or @code{mmap64}.
>
>  @node Locking Pages
>  @section Locking Pages

This is looking really good.

Rical

Florian Weimer Nov. 18, 2017, 5:04 p.m. | #4
* Rical Jasan:

>> Right, the wording isn't correct.
>>
>> “
>> The file for a file-based mapping was not opened with open flags which
>> are compatible with @var{protection}.
>> “”
>>
>>> Is there
>>> somewhere we can reference what is special about file-based mappings
>>> such that they can prevent the use of mprotect?  How is this different
>>> from EPERM, below?  The content below the @deftypefun makes it sound
>>> like this might only be returned by some systems.
>>
>> Maybe we should just say “non-anonymous mappings“ instead?
>>
>> You can map device memory using files under /dev and /proc, even
>> starting with regular files, and strange things can happen there.  These
>> are technically file-based mappings because that's how you create them,
>> but they are very different from what files would behave like.
>
> Given the is-or-isn't nature of "[non-]anonymous", that could work.  If
> "file-based" is equivalent to "non-anonymous" and we stuck with the
> latter, that'd be fine, but my concern is the (lack of) introduction of
> these terms.  "Anonymous mapping" only appears once in the manual, isn't
> introduced with an @dfn, and doesn't have an @cindex to find it by; and
> this is the first occurrence of "file-based mapping".
>
> I can submit a patch that introduces and indexes the anonymous mapping
> concept where it's mentioned in llio.texi, but I don't know where we
> would provide a similarly cursory mention of file-based mappings.
> Ultimately, I'd rather see this patch go in than get stalled on my own
> unfamiliarity with the vernacular, and if "file-based" is how you as the
> author would naturally refer to it, that's probably best.

We could document the MAP_FILE flag used by some systems, where a call
to mmap must either specify MAP_FILE or MAP_ANONYMOUS (or its older
name, MAP_ANON).

>> -@vindex PROT_READ
>> -@vindex PROT_WRITE
>> -@vindex PROT_EXEC
>>  @var{protect} contains flags that control what kind of access is
>>  permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
>> -@code{PROT_EXEC}, which permit reading, writing, and execution,
>> -respectively.  Inappropriate access will cause a segfault (@pxref{Program
>> -Error Signals}).
>> -
>> -Note that most hardware designs cannot support write permission without
>> -read permission, and many do not distinguish read and execute permission.
>> -Thus, you may receive wider permissions than you ask for, and mappings of
>> -write-only files may be denied even if you do not use @code{PROT_READ}.
>
> Coming back to this now, is there a reason not to leave this paragraph in?

It's redundant with the new section on memory protection.

>> +When a page is mapped using @code{mmap}, page protection flags can be
>> +specified using the protection flags argument.  @xref{Memory-mapped
>> +I/O}.
>> +
>> +The following flags are available:
>> +
>> +@vtable @code
>> +@item PROT_WRITE
>> +@standards{POSIX, sys/mman.h}
>
> I skipped over this before since we haven't canonicalized standards yet,
> but can you provide a specific POSIX version (e.g., POSIX.1-2001)?

I can't check easily.  I don't have the paper standards, and I don't
know where the only versions were changed retroactively.

>> +The memory is reserved, but cannot be read, written, or executed.  If
>> +this flag is specified in a call to @code{mmap}, a virtual memory area
>> +will be set aside for future use in the process, and @code{mmap} calls
>> +without the @code{MAP_FIXED} flag will not use it for subsequent
>> +allocations.  For anonymous mappings, the kernel will not reserve any
>> +physical memory for the allocation at the time the mapping is created.
>
> Just to make sure I'm reading this correctly, if you do use PROT_NONE
> and don't use MAP_ANON, the kernel may allocate the backing storage.

Yes, the special is case for MAP_ANON(YMOUS) only.  I don't know what
happens with a file on a hugetlbfs file system, for example.

>> +@var{protection} is a combination of the @code{PROT_*} flags described
>> +above.
>> +
>
> The return values of the function are also missing:
>
> "@code{mprotect} returns @code{0} on success or @code{-1} on error."

Do we use @code{-1} or @math{-1}?  Existing practice is incosistent.

>> +The following @code{errno} error conditions are defined for this
>> +function:
>> +
>> +@table @code
>> +@item ENOMEM
>> +The system was not able to allocate resources to fulfill the request.
>> +This can happen if there is not enough physical memory in the system for
>> +the allocation of backing storage (if the memory was initially mapped
>> +with @code{PROT_NONE}).  The error can also occur if the new protection
>
> Can we simply drop the parenthetical "if" here?

Sure, I dropped it.

Do you think this can go in as-is (with this and the return value
documentation change)?

PS: It is easier for me to response from my work mail account if you
trim your response and leave out irrelevant parts.

Rical Jasan Nov. 18, 2017, 11:49 p.m. | #5
On 11/18/2017 09:04 AM, Florian Weimer wrote:
> * Rical Jasan:
>> Given the is-or-isn't nature of "[non-]anonymous", that could work.  If
>> "file-based" is equivalent to "non-anonymous" and we stuck with the
>> latter, that'd be fine, but my concern is the (lack of) introduction of
>> these terms.  "Anonymous mapping" only appears once in the manual, isn't
>> introduced with an @dfn, and doesn't have an @cindex to find it by; and
>> this is the first occurrence of "file-based mapping".
>>
>> I can submit a patch that introduces and indexes the anonymous mapping
>> concept where it's mentioned in llio.texi, but I don't know where we
>> would provide a similarly cursory mention of file-based mappings.
>> Ultimately, I'd rather see this patch go in than get stalled on my own
>> unfamiliarity with the vernacular, and if "file-based" is how you as the
>> author would naturally refer to it, that's probably best.
>
> We could document the MAP_FILE flag used by some systems, where a call
> to mmap must either specify MAP_FILE or MAP_ANONYMOUS (or its older
> name, MAP_ANON).

That sounds good.  I can take a stab at it with the anonymous mappings
indexing.

>>> -@vindex PROT_READ
>>> -@vindex PROT_WRITE
>>> -@vindex PROT_EXEC
>>>  @var{protect} contains flags that control what kind of access is
>>>  permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
>>> -@code{PROT_EXEC}, which permit reading, writing, and execution,
>>> -respectively.  Inappropriate access will cause a segfault (@pxref{Program
>>> -Error Signals}).
>>> -
>>> -Note that most hardware designs cannot support write permission without
>>> -read permission, and many do not distinguish read and execute permission.
>>> -Thus, you may receive wider permissions than you ask for, and mappings of
>>> -write-only files may be denied even if you do not use @code{PROT_READ}.
>>
>> Coming back to this now, is there a reason not to leave this paragraph in?
>
> It's redundant with the new section on memory protection.

OK.

>>> +When a page is mapped using @code{mmap}, page protection flags can be
>>> +specified using the protection flags argument.  @xref{Memory-mapped
>>> +I/O}.
>>> +
>>> +The following flags are available:
>>> +
>>> +@vtable @code
>>> +@item PROT_WRITE
>>> +@standards{POSIX, sys/mman.h}
>>
>> I skipped over this before since we haven't canonicalized standards yet,
>> but can you provide a specific POSIX version (e.g., POSIX.1-2001)?
>
> I can't check easily.  I don't have the paper standards, and I don't
> know where the only versions were changed retroactively.

OK.  I can't check right now either, but I don't think it should stop
the patch from going in.

>>> +@var{protection} is a combination of the @code{PROT_*} flags described
>>> +above.
>>> +
>>
>> The return values of the function are also missing:
>>
>> "@code{mprotect} returns @code{0} on success or @code{-1} on error."
>
> Do we use @code{-1} or @math{-1}?  Existing practice is incosistent.

@math is for mathematical expressions, and the Texinfo manual says: "In
essence, @math switches into plain TeX math mode." [1]  While the
Texinfo manual never explicitly states return values should be formatted
with @code, consider the example they use for how @code looks when it's
rendered: "The function returns @code{nil}." [2]

> Do you think this can go in as-is (with this and the return value
> documentation change)?

I do.  Thank you for taking the time to write it.

> PS: It is easier for me to response from my work mail account if you
> trim your response and leave out irrelevant parts.

Sorry about that.  I usually trim a fair amount, but was going back and
forth a lot in the message, so it was hard to remove the context.  That
was a lot of uncommented-on text in the beginning I should have removed,
though.

Thanks for putting up with the review!

Rical

[1]
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/Inserting-Math.html
[2]
https://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040code.html

Florian Weimer Nov. 19, 2017, 10:27 a.m. | #6
On 11/19/2017 12:49 AM, Rical Jasan wrote:

>>>> +@var{protection} is a combination of the @code{PROT_*} flags described
>>>> +above.
>>>> +
>>>
>>> The return values of the function are also missing:
>>>
>>> "@code{mprotect} returns @code{0} on success or @code{-1} on error."
>>
>> Do we use @code{-1} or @math{-1}?  Existing practice is incosistent.
>
> @math is for mathematical expressions, and the Texinfo manual says: "In
> essence, @math switches into plain TeX math mode." [1]  While the
> Texinfo manual never explicitly states return values should be formatted
> with @code, consider the example they use for how @code looks when it's
> rendered: "The function returns @code{nil}." [2]

$nil$ is a product of the variables $n$, $i$, $l$, so it's clearly
different.  In LaTeX, inside a mathematical expression, you would use
something like $\mathrm{nil}$, or more likely, a special symbol.

Thanks,
Florian

Rical Jasan Nov. 21, 2017, 12:42 a.m. | #7
On 11/19/2017 02:27 AM, Florian Weimer wrote:

>>>>> +@var{protection} is a combination of the @code{PROT_*} flags
>>>>> described
>>>>> +above.
>>>>> +
>>>>
>>>> The return values of the function are also missing:
>>>>
>>>> "@code{mprotect} returns @code{0} on success or @code{-1} on error."
>>>
>>> Do we use @code{-1} or @math{-1}?  Existing practice is incosistent.
>>
>> @math is for mathematical expressions, and the Texinfo manual says: "In
>> essence, @math switches into plain TeX math mode." [1]  While the
>> Texinfo manual never explicitly states return values should be formatted
>> with @code, consider the example they use for how @code looks when it's
>> rendered: "The function returns @code{nil}." [2]
>
> $nil$ is a product of the variables $n$, $i$, $l$, so it's clearly
> different.  In LaTeX, inside a mathematical expression, you would use
> something like $\mathrm{nil}$, or more likely, a special symbol.

I usually opt for @code because of the function context, but have always
held reservations because it is referring to an abstract value.  I think
you have a good argument for @math, if I'm understanding your point.

There are some interesting partially-numeric use cases for return
values, though.  mbrtowc is a good example.

Rical


## Patch

diff --git a/manual/llio.texi b/manual/llio.texi
index 10ad546723..e635c0eaa8 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -1411,19 +1411,11 @@  is created, which is not removed by closing the file.
address is automatically removed.  The address you give may still be
changed, unless you use the @code{MAP_FIXED} flag.

-@vindex PROT_READ
-@vindex PROT_WRITE
-@vindex PROT_EXEC
@var{protect} contains flags that control what kind of access is
permitted.  They include @code{PROT_READ}, @code{PROT_WRITE}, and
-@code{PROT_EXEC}, which permit reading, writing, and execution,
-respectively.  Inappropriate access will cause a segfault (@pxref{Program
-Error Signals}).
-
-Note that most hardware designs cannot support write permission without
-read permission, and many do not distinguish read and execute permission.
-Thus, you may receive wider permissions than you ask for, and mappings of
-write-only files may be denied even if you do not use @code{PROT_READ}.
+@code{PROT_EXEC}.  The special flag @code{PROT_NONE} reserve a region of
+address space for future use.  The @code{mprotect} function can be used
+to change the protection flags.  @xref{Memory Protection}.

@var{flags} contains flags that control the nature of the map.
One of @code{MAP_SHARED} or @code{MAP_PRIVATE} must be specified.
diff --git a/manual/memory.texi b/manual/memory.texi
index 51a5f4e83c..9a9340cace 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -17,6 +17,7 @@  and allocation of real memory.
* Memory Concepts::             An introduction to concepts and terminology.
* Memory Allocation::           Allocating storage for your program data
* Resizing the Data Segment::   @code{brk}, @code{sbrk}
+* Memory Protection::      	Controlling access to memory regions.
* Locking Pages::               Preventing page faults
@end menu

@@ -3047,7 +3048,120 @@  of the data segment is.

@end deftypefun

+@node Memory Protection
+@section Memory Protection
+@cindex memory protection
+@cindex page protection
+@cindex protection flags

+When a page is mapped using @code{mmap}, page protection flags can be
+specified using the protection flags argument.  @xref{Memory-mapped
+I/O}.
+
+The following flags are availabe.
+
+@table @code
+@item PROT_WRITE
+@vindex PROT_WRITE
+The memory can be written to.
+
+@item PROT_READ
+@vindex PROT_READ
+The memory can be read.  On some architectures, this flag implies that
+the memory can be executed as well (as if @code{PROT_EXEC} had been
+specified at the same time).
+
+@item PROT_EXEC
+@vindex PROT_EXEC
+The memory can be used store instructions which can then be executed.
+On most architectures, this flag implies that the memory can be read (as
+if @code{PROT_READ} had been specified).
+
+@item PROT_NONE
+@vindex PROT_NONE
+This flag must be specified on its own.
+
+The memory is reserved, but cannot be read, written, or executed.  If
+this flag is specified in a call to @code{mmap}, a virtual memory area
+will be set aside for future use in the process, and @code{mmap} calls
+without the @code{MAP_FIXED} flag will not use it for subsequent
+allocations.  For anonymous mappings, the kernel will not reserve any
+physical memory for the allocation, though.
+@end table
+
+The operating system may keep of these flags separately even if the
+underlying hardware treats them the same for the purposes of access
+checking (as it happens with @code{PROT_READ} and @code{PROT_EXEC} on
+some platforms).
+
+Inappropriate access will cause a segfault (@pxref{Program Error
+Signals}).
+
+After allocation, protection flags can be changed using the
+@code{mprotect} function.
+
+@deftypefun int mprotect (void *@var{address}, size_t @var{length}, int @var{protection})
+@standards{POSIX, sys/mman.h}
+@safety{@prelim{}@mtsafe{}@assafe{}@acsafe{}}
+
+A successful call to the @code{mprotect} function changes the protection
+flags of at least @var{length} bytes of memory, starting at
+@var{address}.
+
+@var{address} must be aligned to the page size for the mapping.  The
+system page size can be obtained using @samp{sysconf (_SC_PAGE_SIZE)}.
+@xref{Sysconf Definition}.  The system page size is the granularity in
+which the page protection of anonymous memory mappings and most file
+mappings can be changed.  Memory which is mapped from special files or
+devices may have larger page granularity than the system page size and
+may require larger alignment.
+
+@var{length} is the number of bytes whose protection flags must be
+changed.  It is automatically rounded up to the next multiple of the
+system page size.
+
+@var{protection} is a combination of the @code{PROT_}* described above.
+
+@table @code
+@item ENOMEM
+The system was not able to allocate resources to fulfill the request.
+This can happen if there is not enough physical memory in the system for
+the allocation of backing storage (if the memory was initially mapped
+with @code{PROT_NONE}).  The error can also occur if the new protection
+flags would cause the memory region to be split from its neighbors, and
+the process limit for the number of such distinct memory regions would
+be exceeded.
+
+@item EINVAL
+@var{address} is not properly aligned to a page boundary for the
+mapping, or @var{length} (after rounding up to the system page size) is
+not a multiple of the applicable page size for the mapping, or the
+combination of flags in @var{protection} is not valid.
+
+@item EACCES
+The file for a file-based mapping was not opened with flags which match
+@var{protection}.
+
+@item EPERM
+The system security policy does not allow a mapping with the specified
+flags.  For example, mappings which are both @code{PROT_EXEC} and
+@code{PROT_WRITE} at the same time might not be allowed.
+@end table
+@end deftypefun
+
+If the @code{mprotect} function is used to make a region of memory
+inaccessible by specifying the @code{PROT_NONE} protection flag and
+access is later restored, the memory retains its previous contents.
+
+On some systems, it may not be possible to specify additional flags
+which were not present when the mapping was first created.  For example,
+an attempt make a region of memory executable could fail if the initial
+protection flags where @samp{PROT_READ | PROT_WRITE}.
+
+In general, the @code{mprotect} function can be used to change any
+process memory, no matter how it was allocated.  However, portable use
+of the function requires that it is only used with memory regions
+returned by @code{mmap} or @code{mmap64}.

@node Locking Pages
@section Locking Pages

`