[11/13] ext4: import inode data fork chapter from wiki page

Message ID 153124307379.17949.10394970013817894612.stgit@magnolia
State Superseded, archived
Series ext4: major documentation surgery

Commit Message

Darrick Wong July 10, 2018, 5:17 p.m. UTC
From: Darrick J. Wong <darrick.wong@oracle.com>

Import the chapter about inode data fork from the on-disk format wiki
page into the kernel documentation.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 Documentation/filesystems/ext4/ondisk/blockmap.rst |   49 +++++
 Documentation/filesystems/ext4/ondisk/dynamic.rst  |    1 
 Documentation/filesystems/ext4/ondisk/ifork.rst    |  188 ++++++++++++++++++++
 3 files changed, 238 insertions(+)
 create mode 100644 Documentation/filesystems/ext4/ondisk/blockmap.rst
 create mode 100644 Documentation/filesystems/ext4/ondisk/ifork.rst

Comments

Theodore Ts'o July 20, 2018, 1:36 a.m. UTC | #1
On Tue, Jul 10, 2018 at 10:17:53AM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Import the chapter about inode data fork from the on-disk format wiki
> page into the kernel documentation.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>

It looks like blockmap.rst causes "make pdfdocs" to die with:

Markup is unsupported in LaTeX:
filesystems/ext4/ondisk/index:: nested tables are not yet implemented.

Is there some way we can skip including blockmap.rst if we are
translating the .rst file to a LaTeX output?

					- Ted

Darrick Wong July 20, 2018, 6:10 a.m. UTC | #2
On Thu, Jul 19, 2018 at 09:36:16PM -0400, Theodore Y. Ts'o wrote:
> On Tue, Jul 10, 2018 at 10:17:53AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Import the chapter about inode data fork from the on-disk format wiki
> > page into the kernel documentation.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> It looks like blockmap.rst causes "make pdfdocs" to die with:
> 
> Markup is unsupported in LaTeX:
> filesystems/ext4/ondisk/index:: nested tables are not yet implemented.
> 
> Is there some way we can skip including blockmap.rst if we are
> translating the .rst file to a LaTeX output?

Probably not?  Maybe it's easier to fix the nested table... will look
further tomorrow.

--D

> 
> 					- Ted

Darrick Wong July 20, 2018, 4:12 p.m. UTC | #3
On Thu, Jul 19, 2018 at 09:36:16PM -0400, Theodore Y. Ts'o wrote:
> On Tue, Jul 10, 2018 at 10:17:53AM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Import the chapter about inode data fork from the on-disk format wiki
> > page into the kernel documentation.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> It looks like blockmap.rst causes "make pdfdocs" to die with:
> 
> Markup is unsupported in LaTeX:
> filesystems/ext4/ondisk/index:: nested tables are not yet implemented.
> 
> Is there some way we can skip including blockmap.rst if we are
> translating the .rst file to a LaTeX output?

Ok, I figured out how to do that -- sphinx has a plugin to conditionally
include a child block based on some python boolean eval()uated condition.
This involves tweaking the kerneldoc conf.py a little but overall it's
not too gross.

--D

> 
> 					- Ted

Patch

diff --git a/Documentation/filesystems/ext4/ondisk/blockmap.rst b/Documentation/filesystems/ext4/ondisk/blockmap.rst
new file mode 100644
index 000000000000..30e25750d88a
--- /dev/null
+++ b/Documentation/filesystems/ext4/ondisk/blockmap.rst
@@ -0,0 +1,49 @@ 
+.. SPDX-License-Identifier: GPL-2.0
+
++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| i.i\_block Offset   | Where It Points                                                                                                                                                                                                              |
++=====================+==============================================================================================================================================================================================================================+
+| 0 to 11             | Direct map to file blocks 0 to 11.                                                                                                                                                                                           |
++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| 12                  | Indirect block: (file blocks 12 to (``$block_size`` / 4) + 11, or 12 to 1035 if 4KiB blocks)                                                                                                                                 |
+|                     |                                                                                                                                                                                                                              |
+|                     | +------------------------------+--------------------------------------------------------------------+                                                                                                                        |
+|                     | | Indirect Block Offset        | Where It Points                                                    |                                                                                                                        |
+|                     | +==============================+====================================================================+                                                                                                                        |
+|                     | | 0 to (``$block_size`` / 4)   | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks)   |                                                                                                                        |
+|                     | +------------------------------+--------------------------------------------------------------------+                                                                                                                        |
++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| 13                  | Double-indirect block: (file blocks ``$block_size``/4 + 12 to (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 11, or 1036 to 1049611 if 4KiB blocks)                                                                     |
+|                     |                                                                                                                                                                                                                              |
+|                     | +--------------------------------+---------------------------------------------------------------------------------------------------------+                                                                                 |
+|                     | | Double Indirect Block Offset   | Where It Points                                                                                         |                                                                                 |
+|                     | +================================+=========================================================================================================+                                                                                 |
+|                     | | 0 to (``$block_size`` / 4)     | Map to (``$block_size`` / 4) indirect blocks (1024 if 4KiB blocks)                                      |                                                                                 |
+|                     | |                                |                                                                                                         |                                                                                 |
+|                     | |                                | +------------------------------+--------------------------------------------------------------------+   |                                                                                 |
+|                     | |                                | | Indirect Block Offset        | Where It Points                                                    |   |                                                                                 |
+|                     | |                                | +==============================+====================================================================+   |                                                                                 |
+|                     | |                                | | 0 to (``$block_size`` / 4)   | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks)   |   |                                                                                 |
+|                     | |                                | +------------------------------+--------------------------------------------------------------------+   |                                                                                 |
+|                     | +--------------------------------+---------------------------------------------------------------------------------------------------------+                                                                                 |
++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+| 14                  | Triple-indirect block: (file blocks (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 12 to (``$block_size`` / 4) ^ 3 + (``$block_size`` / 4) ^ 2 + (``$block_size`` / 4) + 11, or 1049612 to 1074791435 if 4KiB blocks)   |
+|                     |                                                                                                                                                                                                                              |
+|                     | +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+                                          |
+|                     | | Triple Indirect Block Offset   | Where It Points                                                                                                                                |                                          |
+|                     | +================================+================================================================================================================================================+                                          |
+|                     | | 0 to (``$block_size`` / 4)     | Map to (``$block_size`` / 4) double indirect blocks (1024 if 4KiB blocks)                                                                      |                                          |
+|                     | |                                |                                                                                                                                                |                                          |
+|                     | |                                | +--------------------------------+---------------------------------------------------------------------------------------------------------+   |                                          |
+|                     | |                                | | Double Indirect Block Offset   | Where It Points                                                                                         |   |                                          |
+|                     | |                                | +================================+=========================================================================================================+   |                                          |
+|                     | |                                | | 0 to (``$block_size`` / 4)     | Map to (``$block_size`` / 4) indirect blocks (1024 if 4KiB blocks)                                      |   |                                          |
+|                     | |                                | |                                |                                                                                                         |   |                                          |
+|                     | |                                | |                                | +------------------------------+--------------------------------------------------------------------+   |   |                                          |
+|                     | |                                | |                                | | Indirect Block Offset        | Where It Points                                                    |   |   |                                          |
+|                     | |                                | |                                | +==============================+====================================================================+   |   |                                          |
+|                     | |                                | |                                | | 0 to (``$block_size`` / 4)   | Direct map to (``$block_size`` / 4) blocks (1024 if 4KiB blocks)   |   |   |                                          |
+|                     | |                                | |                                | +------------------------------+--------------------------------------------------------------------+   |   |                                          |
+|                     | |                                | +--------------------------------+---------------------------------------------------------------------------------------------------------+   |                                          |
+|                     | +--------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------+                                          |
++---------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
diff --git a/Documentation/filesystems/ext4/ondisk/dynamic.rst b/Documentation/filesystems/ext4/ondisk/dynamic.rst
index 7c5f5019b9d6..f090de8dd1c1 100644
--- a/Documentation/filesystems/ext4/ondisk/dynamic.rst
+++ b/Documentation/filesystems/ext4/ondisk/dynamic.rst
@@ -7,3 +7,4 @@  Dynamic metadata are created on the fly when files and blocks are
 allocated to files.
 
 .. include:: inodes.rst
+.. include:: ifork.rst
diff --git a/Documentation/filesystems/ext4/ondisk/ifork.rst b/Documentation/filesystems/ext4/ondisk/ifork.rst
new file mode 100644
index 000000000000..e6acb2a2cb59
--- /dev/null
+++ b/Documentation/filesystems/ext4/ondisk/ifork.rst
@@ -0,0 +1,188 @@ 
+.. SPDX-License-Identifier: GPL-2.0
+
+The Contents of inode.i\_block
+------------------------------
+
+Depending on the type of file an inode describes, the 60 bytes of
+storage in ``inode.i_block`` can be used in different ways. In general,
+regular files and directories will use it for file block indexing
+information, and special files will use it for special purposes.
+
+Symbolic Links
+~~~~~~~~~~~~~~
+
+The target of a symbolic link will be stored in this field if the target
+string is less than 60 bytes long. Otherwise, either extents or block
+maps will be used to allocate data blocks to store the link target.
+
+Direct/Indirect Block Addressing
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In ext2/3, file block numbers were mapped to logical block numbers by
+means of an (up to) three level 1-1 block map. To find the logical block
+that stores a particular file block, the code would navigate through
+this increasingly complicated structure. Notice that there is neither a
+magic number nor a checksum to provide any level of confidence that the
+block isn't full of garbage.
+
+.. include:: blockmap.rst
+
+Note that with this block mapping scheme, it is necessary to fill out a
+lot of mapping data even for a large contiguous file! This inefficiency
+led to the creation of the extent mapping scheme, discussed below.
+
+Notice also that a file using this mapping scheme cannot be placed
+higher than 2^32 blocks.
+
+Extent Tree
+~~~~~~~~~~~
+
+In ext4, the file to logical block map has been replaced with an extent
+tree. Under the old scheme, allocating a contiguous run of 1,000 blocks
+requires an indirect block to map all 1,000 entries; with extents, the
+mapping is reduced to a single ``struct ext4_extent`` with
+``ee_len = 1000``. If flex\_bg is enabled, it is possible to allocate
+very large files with a single extent, at a considerable reduction in
+metadata block use, and some improvement in disk efficiency. The inode
+must have the extents flag (0x80000) set for this feature to be in
+use.
+
+Extents are arranged as a tree. Each node of the tree begins with a
+``struct ext4_extent_header``. If the node is an interior node
+(``eh.eh_depth`` > 0), the header is followed by ``eh.eh_entries``
+instances of ``struct ext4_extent_idx``; each of these index entries
+points to a block containing more nodes in the extent tree. If the node
+is a leaf node (``eh.eh_depth == 0``), then the header is followed by
+``eh.eh_entries`` instances of ``struct ext4_extent``; these instances
+point to the file's data blocks. The root node of the extent tree is
+stored in ``inode.i_block``, which allows for the first four extents to
+be recorded without the use of extra metadata blocks.
+
+The extent tree header is recorded in ``struct ext4_extent_header``,
+which is 12 bytes long:
+
+.. list-table::
+   :widths: 1 1 1 77
+   :header-rows: 1
+
+   * - Offset
+     - Size
+     - Name
+     - Description
+   * - 0x0
+     - \_\_le16
+     - eh\_magic
+     - Magic number, 0xF30A.
+   * - 0x2
+     - \_\_le16
+     - eh\_entries
+     - Number of valid entries following the header.
+   * - 0x4
+     - \_\_le16
+     - eh\_max
+     - Maximum number of entries that could follow the header.
+   * - 0x6
+     - \_\_le16
+     - eh\_depth
+     - Depth of this extent node in the extent tree. 0 = this extent node
+       points to data blocks; otherwise, this extent node points to other
+       extent nodes. The extent tree can be at most 5 levels deep: a logical
+       block number can be at most ``2^32``, and the smallest ``n`` that
+       satisfies ``4*(((blocksize - 12)/12)^n) >= 2^32`` is 5.
+   * - 0x8
+     - \_\_le32
+     - eh\_generation
+     - Generation of the tree. (Used by Lustre, but not standard ext4).
+
+Internal nodes of the extent tree, also known as index nodes, are
+recorded as ``struct ext4_extent_idx``, and are 12 bytes long:
+
+.. list-table::
+   :widths: 1 1 1 77
+   :header-rows: 1
+
+   * - Offset
+     - Size
+     - Name
+     - Description
+   * - 0x0
+     - \_\_le32
+     - ei\_block
+     - This index node covers file blocks from 'block' onward.
+   * - 0x4
+     - \_\_le32
+     - ei\_leaf\_lo
+     - Lower 32-bits of the block number of the extent node that is the next
+       level lower in the tree. The tree node pointed to can be either another
+       internal node or a leaf node, described below.
+   * - 0x8
+     - \_\_le16
+     - ei\_leaf\_hi
+     - Upper 16-bits of the previous field.
+   * - 0xA
+     - \_\_u16
+     - ei\_unused
+     -
+
+Leaf nodes of the extent tree are recorded as ``struct ext4_extent``,
+and are also 12 bytes long:
+
+.. list-table::
+   :widths: 1 1 1 77
+   :header-rows: 1
+
+   * - Offset
+     - Size
+     - Name
+     - Description
+   * - 0x0
+     - \_\_le32
+     - ee\_block
+     - First file block number that this extent covers.
+   * - 0x4
+     - \_\_le16
+     - ee\_len
+     - Number of blocks covered by extent. If the value of this field is <=
+       32768, the extent is initialized. If the value of the field is > 32768,
+       the extent is uninitialized and the actual extent length is ``ee_len`` -
+       32768. Therefore, the maximum length of an initialized extent is 32768
+       blocks, and the maximum length of an uninitialized extent is 32767.
+   * - 0x6
+     - \_\_le16
+     - ee\_start\_hi
+     - Upper 16-bits of the block number to which this extent points.
+   * - 0x8
+     - \_\_le32
+     - ee\_start\_lo
+     - Lower 32-bits of the block number to which this extent points.
+
+Prior to the introduction of metadata checksums, the extent header +
+extent entries always left at least 4 bytes of unallocated space at the
+end of each extent tree data block (because (2^x % 12) >= 4). Therefore,
+the 32-bit checksum is inserted into this space. The 4 extents in the
+inode do not need checksumming, since the inode is already checksummed.
+The checksum is calculated against the FS UUID, the inode number, the
+inode generation, and the entire extent block leading up to (but not
+including) the checksum itself.
+
+``struct ext4_extent_tail`` is 4 bytes long:
+
+.. list-table::
+   :widths: 1 1 1 77
+   :header-rows: 1
+
+   * - Offset
+     - Size
+     - Name
+     - Description
+   * - 0x0
+     - \_\_le32
+     - eb\_checksum
+     - Checksum of the extent block, crc32c(uuid+inum+igeneration+extentblock)
+
+Inline Data
+~~~~~~~~~~~
+
+If the inline data feature is enabled for the filesystem and the flag is
+set for the inode, it is possible that the first 60 bytes of the file
+data are stored here.
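
Editorial note (not part of the patch): the extent tables in ifork.rst
can be exercised with a short C sketch. It decodes a leaf node laid out
as described above: a 12-byte ``struct ext4_extent_header`` followed by
12-byte ``struct ext4_extent`` records. The names ``extent_info``,
``decode_extent``, ``leaf_entries``, ``get_le16`` and ``get_le32`` are
made up for illustration, and the two macros only restate values from the
tables. This is a sketch under those assumptions, not the kernel's
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT_MAGIC        0xF30A  /* eh_magic value from the header table */
#define EXT_INIT_MAX_LEN 32768   /* ee_len above this marks an uninitialized extent */

/* Read little-endian fields byte by byte, so host endianness does not matter. */
static uint16_t get_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical decoded form of one extent; not a kernel structure. */
struct extent_info {
    uint32_t file_block;    /* ee_block: first file block covered */
    uint64_t disk_block;    /* ee_start_hi:ee_start_lo combined (48 bits) */
    uint16_t len;           /* length in blocks, uninitialized bias removed */
    int      uninitialized; /* nonzero if the raw ee_len was > 32768 */
};

/* Decode one 12-byte struct ext4_extent record. */
static struct extent_info decode_extent(const uint8_t *rec)
{
    struct extent_info e;
    uint16_t raw_len;

    assert(rec != NULL);
    raw_len = get_le16(rec + 0x4);                   /* ee_len */
    e.file_block = get_le32(rec + 0x0);              /* ee_block */
    e.uninitialized = raw_len > EXT_INIT_MAX_LEN;
    e.len = e.uninitialized ? (uint16_t)(raw_len - EXT_INIT_MAX_LEN) : raw_len;
    e.disk_block = ((uint64_t)get_le16(rec + 0x6) << 32) /* ee_start_hi */
                 | get_le32(rec + 0x8);                  /* ee_start_lo */
    return e;
}

/*
 * Return the number of valid extents in a leaf node, or -1 if the header
 * has the wrong magic, is not a leaf (eh_depth != 0), or claims more
 * entries than eh_max allows.
 */
static int leaf_entries(const uint8_t *node)
{
    uint16_t entries, max;

    assert(node != NULL);
    if (get_le16(node + 0x0) != EXT_MAGIC)  /* eh_magic */
        return -1;
    if (get_le16(node + 0x6) != 0)          /* eh_depth */
        return -1;
    entries = get_le16(node + 0x2);         /* eh_entries */
    max = get_le16(node + 0x4);             /* eh_max */
    return entries > max ? -1 : (int)entries;
}
```

To walk a small file, a caller could pass the 60 bytes of
``inode.i_block`` to ``leaf_entries()`` and, if the result is
non-negative, call ``decode_extent()`` on each record starting at byte
offset 12. Interior nodes (``eh_depth`` > 0) and the checksum in
``struct ext4_extent_tail`` are deliberately left out of this sketch.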