[e2fsprogs] initdir: Writing inode after the initial write?

Message ID 85A86E8F-EEB9-495C-AB10-EF3C871EE2B9@dilger.ca
State Rejected, archived
Commit Message

Andreas Dilger Dec. 1, 2012, 7:31 p.m. UTC
On 2012-11-30, at 10:08 PM, Darren Hart wrote:
> On 11/30/2012 08:23 PM, Andreas Dilger wrote:
>> On 2012-11-30, at 7:13 PM, Darren Hart wrote:
>>> I am working on creating some files after creating a filesystem in
>>> mke2fs. This is part of a larger project to add initial directory
>>> support to mke2fs.
>> 
>> Maybe some background on what you are trying to do would help us to
>> understand the problem?
> 
> Sure, a few are already aware, but I suppose some extra detail for
> the first post to this list is in order.
> 
> I work on the Yocto Project, and this particular effort is part of
> improving our deployment tooling. Specifically, the part of the build
> process that creates the root filesystem.
> 
> Most all filesystems have some mechanism to create prepopulated
> images without the need for root permissions. Many do this through
> a -r parameter to their corresponding mkfs.* tool. The exceptions to
> this are ext3 and ext4. Our current tooling relies on genext2fs and
> flipping some bits to "convert" the ext2 filesystem to ext3 and 4.
> Not ideal.
> 
> After exploring options like libguestfs and finding them to be
> considerably heavy weight for what we are trying to accomplish, I
> discussed the possibility of adding an argument to mke2fs which would
> populate a newly formatted filesystem from a specified directory. Ted
> suggested a clean set of patches implementing this were likely to be
> accepted.

Hmm, I wonder if libext2fs can itself create extent-mapped files,
or if these files will be block-mapped?  If they are small (< 1MB),
it is probably not a huge problem, but if your files are large it
may be that libext2fs also creates "ext2" files internally?

Maybe Ted can confirm whether that is true or not.  At least I recall
that the block allocator inside libext2fs was horrible, and creating
large files was problematic.

I guess the other question is why you don't use debugfs to create
the directory tree and copy the files into your new filesystem?
It already has "mkdir", "mknod" and "write" commands for use, and
it is a one-line patch to alias "write" to "cp" for easier use[*].

Then, it just needs a debugfs script to build your directory tree
and copy files over.  Possibly enhancing "cp" to call do_mknod() for
pipe/block/char devices would make this easier to use.

Something like the following, though it seems there isn't an "ln -s"
or "symlink" command for debugfs yet, that would need to be written.

#!/bin/bash
SRCDIR=$1
DEVICE=$2

{
	find $SRCDIR | while read FILE; do
		TGT=${FILE#$SRCDIR}
		case $(stat -c "%F" $FILE) in
		"directory")
			echo "mkdir $TGT"
			;;
		"regular file")
			echo "write $FILE $TGT"
			;;
		"symbolic link")
			LINK_TGT=$(ls -l $FILE | sed -e 's/.*-> //')
			echo "symlink $TGT $LINK_TGT"
			;;
		"block special file")
			DEVNO=$(stat -c "%t %T" $FILE)
			echo "mknod $TGT b $DEVNO"
			;;
		"character special file")
			DEVNO=$(stat -c "%t %T" $FILE)
			echo "mknod $TGT c $DEVNO"
			;;
		*)
			echo "Unknown file $FILE" 1>&2
			;;
		esac
	done
} | debugfs -w -f /dev/stdin $DEVICE

I would guess that implementing "symlink" support in debugfs will
be orders of magnitude less work, maintenance, and bugs than your
current patch.

This might be turned inside-out and just run a "find $SRCDIR" and
have the inner loop check the file type and call the appropriate
operation for it (mkdir, write/cp, mknod, symlink).  Note that
"find" will return the directories first, so this should be OK to
just consume the lines as they are output by find.
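
The ordering relied on here is find's default pre-order traversal: a
directory is printed before anything inside it, unless -depth is given.
A quick way to check this on a scratch tree (the paths below are
throwaway examples, not taken from the thread):

```shell
#!/bin/sh
# Build a scratch tree and show that find lists a directory
# before the entries it contains.
tmp=$(mktemp -d)
mkdir -p "$tmp/rootdir/dir1"
echo hello > "$tmp/rootdir/dir1/world.txt"

# Run find from inside the scratch dir so the output uses relative paths.
order=$(cd "$tmp" && find rootdir)
echo "$order"

rm -rf "$tmp"
```

Because each parent appears first, a "mkdir" for it is always emitted
before any command that creates an entry underneath it.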

> I don't have much filesystem experience - most of my experience is
> with core kernel mechanisms, ipc, locking, etc. - so I'm mostly
> hacking my way to some basic functionality before refactoring. The
> libext2fs library documentation gave me a good start, but I
> occasionally trip over things like the problem described below as
> there is no documentation for what I'm trying to do specifically
> (of course) and many of the required functions are only minimally
> documented, and sometimes only listed in the index.

Definitely, if the documentation is lacking and you've spent cycles
figuring something out, then a patch to improve the documentation is
most welcome.

> The specific instance below is the result of me trying to format and
> populate a filesystem image (in a file) from a root directory that looks like this:
> 
> $ tree rootdir/
> rootdir/
> |-- dir1
> |   |-- hello.lnk -> /hello.txt
> |   `-- world.txt
> |-- hello.lnk -> /hello.txt
> |-- hello.txt
> |-- sda
> `-- ttyS0
> 
> $ cat rootdir/hello.txt
> hello
> 
> In mke2fs.c I setup the new getopt argument and call nftw() with a
> callback called init_dir_cb() which checks the file type and takes
> the appropriate action to duplicate each entry. The exact code is at:

To be honest, nftw() will drag a bunch of bloat into e2fsprogs that
doesn't exist today, and isn't really portable.

> http://git.infradead.org/users/dvhart/e2fsprogs/blob/refs/heads/initialdir:/misc/mke2fs.c#l2319
> 
> As described below, when I update the inode.i_size after the initial
> write and copying of the file content, the above cat command fails to
> output anything when run on the loop mounted filesystem. If I just
> hack in the i_size prior to writing the inode for the first time and
> don't update it after copying the file content, then the cat command
> succeeds as above on the loop mounted image.

It probably makes sense to understand what is broken here, whether
it is the library or the program.  We definitely want to make sure
the API is usable and working correctly in any case.

> The commented out inode write is noted here:
> 
> http://git.infradead.org/users/dvhart/e2fsprogs/blob/refs/heads/initialdir:/misc/mke2fs.c#l2462
> 
> Does that help clarify the situation?
> 
> What I'm looking for is some insight into what it is I am not
> understanding about the filesystem structures that causes this behavior.

I hate to put a downer on your current work, but I think that you
are adding something overly complex that only has a very limited
usefulness, and your time could be better spent elsewhere.

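[*] add debugfs "cp" command as an alias to "write":

diff --git a/debugfs/debug_cmds.ct b/debugfs/debug_cmds.ct
index a799dd7..3789dcd 100644
--- a/debugfs/debug_cmds.ct
+++ b/debugfs/debug_cmds.ct
@@ -119,7 +119,7 @@ request do_undel, "Undelete file",
 	undelete, undel;
 
 request do_write, "Copy a file from your native filesystem",
-	write;
+	write, cp;
 
 request do_dump, "Dump an inode out to a file",
 	dump_inode, dump;
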
> Thanks,
> 
> Darren
> 
>> 
>> Cheers, Andreas
>> 
>>> To make it easy for people to see what I'm working
>>> on, I've pushed my dev tree here:
>>> 
>>> http://git.infradead.org/users/dvhart/e2fsprogs/shortlog/refs/heads/initialdir
>>> 
>>> Note: the code is still just in the prototyping state. It is inelegant
>>> to say the least. The git tree will most definitely rebase. I'm trying
>>> to get it functional; once that is understood, I will refactor
>>> appropriately.
>>> 
>>> I can create a simple directory structure and link in files and fast
>>> symlinks. I'm currently working on copying content from files in the
>>> initial directory. The process I'm using is as follows:
>>> 
>>> 
>>> ext2fs_new_inode(&ino)
>>> ext2fs_link()
>>> 
>>> ext2fs_read_inode(ino, &inode)
>>> /* some initial inode setup */
>>> ext2fs_write_new_inode(ino, &inode)
>>> 
>>> ext2fs_file_open2(&inode)
>>> ext2fs_write_file()
>>> ext2fs_file_close()
>>> 
>>> inode.i_size = bytes_written
>>> ext2fs_write_inode()
>>> 
>>> ext2fs_inode_alloc_stats2(ino)
>>> 
>>> 
>>> When I mount the image, the size for the file is correct, but catting it
>>> returns nothing. If I instead hack in the known size during the initial
>>> inode setup and drop the last ext2fs_write_inode() call, then the size
>>> is right and catting the file works as expected.
>>> 
>>> Is it incorrect to write the inode more than once? If not, am I doing
>>> something that is somehow decoupling the block where the data was
>>> written from the inode associated with the file?
>>> 
>>> Thanks,
>>> 
>>> -- 
>>> Darren Hart
>>> Intel Open Source Technology Center
>>> Yocto Project - Technical Lead - Linux Kernel
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Cheers, Andreas






Comments

Darren Hart Dec. 3, 2012, 7:46 p.m. UTC | #1
On 12/01/2012 11:31 AM, Andreas Dilger wrote:
> On 2012-11-30, at 10:08 PM, Darren Hart wrote:
>> On 11/30/2012 08:23 PM, Andreas Dilger wrote:
>>> On 2012-11-30, at 7:13 PM, Darren Hart wrote:
>>>> I am working on creating some files after creating a filesystem in
>>>> mke2fs. This is part of a larger project to add initial directory
>>>> support to mke2fs.
>>>
>>> Maybe some background on what you are trying to do would help us to
>>> understand the problem?
>>
>> Sure, a few are already aware, but I suppose some extra detail for
>> the first post to this list is in order.
>>
>> I work on the Yocto Project, and this particular effort is part of
>> improving our deployment tooling. Specifically, the part of the build
>> process that creates the root filesystem.
>>
>> Most all filesystems have some mechanism to create prepopulated
>> images without the need for root permissions. Many do this through
>> a -r parameter to their corresponding mkfs.* tool. The exceptions to
>> this are ext3 and ext4. Our current tooling relies on genext2fs and
>> flipping some bits to "convert" the ext2 filesystem to ext3 and 4.
>> Not ideal.
>>
>> After exploring options like libguestfs and finding them to be
>> considerably heavy weight for what we are trying to accomplish, I
>> discussed the possibility of adding an argument to mke2fs which would
>> populate a newly formatted filesystem from a specified directory. Ted
>> suggested a clean set of patches implementing this were likely to be
>> accepted.
>
> Hmm, I wonder if libext2fs can itself create extent-mapped files,
> or if these files will be block-mapped?  If they are small (< 1MB),
> it is probably not a huge problem, but if your files are large it
> may be that libext2fs also creates "ext2" files internally?
>
> Maybe Ted can confirm whether that is true or not.  At least I recall
> that the block allocator inside libext2fs was horrible, and creating
> large files was problematic.


Ted, can you confirm?


> I guess the other question is why you don't use debugfs to create
> the directory tree and copy the files into your new filesystem?
> It already has "mkdir", "mknod" and "write" commands for use, and
> it is a one-line patch to alias "write" to "cp" for easier use[*].


I just didn't know about it and it didn't come up in my polling :-)
(which would have been more fruitful had I done some of that here).


> Then, it just needs a debugfs script to build your directory tree
> and copy files over.  Possibly enhancing "cp" to call do_mknod() for
> pipe/block/char devices would make this easier to use.
>
> Something like the following, though it seems there isn't an "ln -s"
> or "symlink" command for debugfs yet, that would need to be written.
>
> #!/bin/bash
> SRCDIR=$1
> DEVICE=$2
>
> {
> 	find $SRCDIR | while read FILE; do
> 		TGT=${FILE#$SRCDIR}
> 		case $(stat -c "%F" $FILE) in
> 		"directory")
> 			echo "mkdir $TGT"
> 			;;
> 		"regular file")
> 			echo "write $FILE $TGT"
> 			;;
> 		"symbolic link")
> 			LINK_TGT=$(ls -l $FILE | sed -e 's/.*-> //')
> 			echo "symlink $TGT $LINK_TGT"
> 			;;
> 		"block special file")
> 			DEVNO=$(stat -c "%t %T" $FILE)
> 			echo "mknod $TGT b $DEVNO"
> 			;;
> 		"character special file")
> 			DEVNO=$(stat -c "%t %T" $FILE)
> 			echo "mknod $TGT c $DEVNO"
> 			;;
> 		*)
> 			echo "Unknown file $FILE" 1>&2
> 			;;
> 		esac
> 	done
> } | debugfs -w -f /dev/stdin $DEVICE


This is really promising. I've tweaked it a bit to use the basename and
cd into the directories as they are traversed by find so it doesn't try
and create filenames like "/dir1/hello.txt" in the root directory.

	#!/bin/sh
	SRCDIR=$1
	DEVICE=$2
	
	{
		find $SRCDIR | while read FILE; do
			#TGT=${FILE#$SRCDIR}
			TGT=$(basename ${FILE#$SRCDIR})
	
			# Skip the root dir
			if [ -z "$TGT" ]; then
				continue
			fi
	
			case $(stat -c "%F" $FILE) in
			"directory")
				echo "mkdir $TGT"
				echo "cd $TGT"
				;;
			"regular file")
				echo "write $FILE $TGT"
				;;
			"symbolic link")
				LINK_TGT=$(ls -l $FILE | sed -e 's/.*-> //')
				echo "symlink $TGT $LINK_TGT"
				;;
			"block special file")
				DEVNO=$(stat -c "%t %T" $FILE)
				echo "mknod $TGT b $DEVNO"
				;;
			"character special file")
				DEVNO=$(stat -c "%t %T" $FILE)
				echo "mknod $TGT c $DEVNO"
				;;
			*)
				echo "Unknown file $FILE" 1>&2
				;;
			esac
		done
	} | debugfs -w -f /dev/stdin $DEVICE
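
One caveat with the "cd" approach above: the script only ever descends,
so an entry that find prints after a subdirectory's contents would be
created inside that subdirectory. A minimal sketch of one way around
this, assuming debugfs accepts an absolute path for "cd", is to emit a
"cd" to each entry's parent before creating it. The names
gen_debugfs_cmds and devno are hypothetical helpers, "symlink" is still
the not-yet-written command discussed above, and paths containing
whitespace are not handled.

```shell
#!/bin/sh
# Sketch: print debugfs commands that recreate SRCDIR.
# gen_debugfs_cmds and devno are hypothetical names, not part of e2fsprogs.

# stat prints device major/minor in hex; reprint them as decimal.
devno() {
	printf '%d %d' "0x$(stat -c %t "$1")" "0x$(stat -c %T "$1")"
}

gen_debugfs_cmds() {
	srcdir=${1%/}
	find "$srcdir" | while IFS= read -r file; do
		rel=${file#"$srcdir"}
		# Skip the root dir itself
		[ -z "$rel" ] && continue
		name=$(basename "$rel")
		# Always return to the entry's parent, so siblings that follow
		# a subdirectory are not created inside it.
		echo "cd $(dirname "$rel")"
		case $(stat -c "%F" "$file") in
		"directory")
			echo "mkdir $name" ;;
		"regular file"|"regular empty file")
			echo "write $file $name" ;;
		"symbolic link")
			# Assumes a "symlink" command has been added to debugfs.
			echo "symlink $name $(readlink "$file")" ;;
		"block special file")
			echo "mknod $name b $(devno "$file")" ;;
		"character special file")
			echo "mknod $name c $(devno "$file")" ;;
		*)
			echo "Unknown file $file" 1>&2 ;;
		esac
	done
}

# Usage (assumed invocation, mirroring the script above):
#   gen_debugfs_cmds rootdir | debugfs -w -f /dev/stdin image.ext4
```

The extra "cd" per entry is redundant for consecutive siblings, but it
keeps the generator stateless. GNU stat reports zero-length files as
"regular empty file", hence the second pattern in that case branch.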


> I would guess that implementing "symlink" support in debugfs will
> be orders of magnitude less work, maintenance, and bugs than your
> current patch.


It needs symlink as you said, but I can relatively easily migrate my
code for that in mke2fs to debugfs.

Still needs permissions and such. Is that done with "modify_inode" ? If
so, how do I specify the new contents?
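
For what it's worth, "modify_inode" prompts interactively, so for a
script the non-interactive "set_inode_field" (alias "sif") command looks
like the better fit. A rough sketch, assuming the field names "mode",
"uid" and "gid" and that numeric values with a leading 0 are read as
octal (worth confirming with "set_inode_field -l" on the debugfs in
use). gen_perm_cmds is a hypothetical helper name.

```shell
#!/bin/sh
# gen_perm_cmds HOSTFILE FSPATH   (hypothetical helper)
# Prints debugfs commands that copy mode and ownership from HOSTFILE
# onto FSPATH inside the filesystem image.
gen_perm_cmds() {
	# %f is the raw st_mode in hex, file type bits included;
	# reprint it as a 0-prefixed octal constant.
	mode=$(printf '0%o' "0x$(stat -c %f "$1")")
	echo "set_inode_field $2 mode $mode"
	echo "set_inode_field $2 uid $(stat -c %u "$1")"
	echo "set_inode_field $2 gid $(stat -c %g "$1")"
}

# Usage (assumed): gen_perm_cmds rootdir/hello.txt /hello.txt
```

The mode is written with its file type bits because the inode field
holds both the type and the permission bits.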

I need to look into how to detect and support hard links.
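
One possible way to detect them, sketched under the assumption that GNU
find's -printf is available and that the source tree sits on a single
filesystem: list every non-directory with a link count above one, keyed
by inode number, and treat the first path seen for each inode as the
copy to "write", with the remaining paths becoming links to it.
list_hardlinks is a hypothetical name. debugfs does have a "link"
("ln") command, but its man page notes that it does not adjust the
inode reference count, so the count would need fixing up separately.

```shell
#!/bin/sh
# list_hardlinks SRCDIR   (hypothetical helper)
# Prints "first-path<TAB>other-path" for every additional name of a
# multiply-linked file found under SRCDIR.
list_hardlinks() {
	# %i = inode number, %p = path; -links +1 selects link count > 1.
	# The stable numeric sort groups names by inode and keeps find's
	# order within each group.
	find "$1" ! -type d -links +1 -printf '%i\t%p\n' |
	sort -s -k1,1n |
	awk -F '\t' '
		$1 == inode { print first "\t" $2; next }
		{ inode = $1; first = $2 }
	'
}

# Usage (assumed): list_hardlinks rootdir
```

Files whose other names live outside SRCDIR produce no output, since
only one name per inode is seen in that case.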


> This might be turned inside-out and just run a "find $SRCDIR" and
> have the inner loop check the file type and call the appropriate
> operation for it (mkdir, write/cp, mknod, symlink).  Note that
> "find" will return the directories first, so this should be OK to
> just consume the lines as they are output by find.


Yes, this seems to work just fine.


>> I don't have much filesystem experience - most of my experience is
>> with core kernel mechanisms, ipc, locking, etc. - so I'm mostly
>> hacking my way to some basic functionality before refactoring. The
>> libext2fs library documentation gave me a good start, but I
>> occasionally trip over things like the problem described below as
>> there is no documentation for what I'm trying to do specifically
>> (of course) and many of the required functions are only minimally
>> documented, and sometimes only listed in the index.
>
> Definitely, if the documentation is lacking and you've spent cycles
> figuring something out, then a patch to improve the documentation is
> most welcome.


I plan to update this as I go... although I'm going to have much less to
do if I use the debugfs approach. ;-)

I wonder if it would make sense to integrate the debugfs functionality
into libext2fs and enable both debugfs and mke2fs to use the same common
code. I think the "-r initialdir" option would still be nice to have for
mke2fs, and does make it more consistent with other FSs in this feature.


>
>> The specific instance below is the result of me trying to format and
>> populate a filesystem image (in a file) from a root directory that looks like this:
>>
>> $ tree rootdir/
>> rootdir/
>> |-- dir1
>> |   |-- hello.lnk -> /hello.txt
>> |   `-- world.txt
>> |-- hello.lnk -> /hello.txt
>> |-- hello.txt
>> |-- sda
>> `-- ttyS0
>>
>> $ cat rootdir/hello.txt
>> hello
>>
>> In mke2fs.c I setup the new getopt argument and call nftw() with a
>> callback called init_dir_cb() which checks the file type and takes
>> the appropriate action to duplicate each entry. The exact code is at:
>
> To be honest, nftw() will drag a bunch of bloat into e2fsprogs that
> doesn't exist today, and isn't really portable.


OK, well it could also be done with ftw to be more portable, but I guess
it's still marked obsolete in POSIX.1-2008 :/

Similar functionality could be implemented relatively easily.


>
>> http://git.infradead.org/users/dvhart/e2fsprogs/blob/refs/heads/initialdir:/misc/mke2fs.c#l2319
>>
>> As described below, when I update the inode.i_size after the initial
>> write and copying of the file content, the above cat command fails to
>> output anything when run on the loop mounted filesystem. If I just
>> hack in the i_size prior to writing the inode for the first time and
>> don't update it after copying the file content, then the cat command
>> succeeds as above on the loop mounted image.
>
> It probably makes sense to understand what is broken here, whether
> it is the library or the program.  We definitely want to make sure
> the API is usable and working correctly in any case.


I should be able to compare with debugfs "write" and see what the
difference is.
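
A baseline for that comparison could look like the sketch below: format
a scratch image, copy one file in with debugfs "write", and read it
back with debugfs "cat", all without mounting or root. The helper name
check_debugfs_write, the 8 MiB image size, the "-t ext4" choice and the
"/copy" target name are assumptions made for illustration.

```shell
#!/bin/sh
# check_debugfs_write IMAGE HOSTFILE   (hypothetical helper)
# Formats IMAGE, copies HOSTFILE in with debugfs "write", then prints
# the copied file back from the image.
# Returns 2 if e2fsprogs is not in PATH, 1 if formatting fails.
check_debugfs_write() {
	command -v mke2fs >/dev/null && command -v debugfs >/dev/null || {
		echo "e2fsprogs not found in PATH" 1>&2
		return 2
	}
	# 8 MiB scratch image; the size is arbitrary.
	dd if=/dev/zero of="$1" bs=1024 count=8192 2>/dev/null
	mke2fs -q -F -t ext4 "$1" || return 1
	debugfs -w -R "write $2 /copy" "$1" >/dev/null 2>&1
	debugfs -R "cat /copy" "$1" 2>/dev/null
}

# Usage (assumed): check_debugfs_write test.img rootdir/hello.txt
```

Running debugfs -R "stat /copy" against the same image shows the size
and block map debugfs produced, which can then be set against the inode
written by the mke2fs prototype.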


>
>> The commented out inode write is noted here:
>>
>> http://git.infradead.org/users/dvhart/e2fsprogs/blob/refs/heads/initialdir:/misc/mke2fs.c#l2462
>>
>> Does that help clarify the situation?
>>
>> What I'm looking for is some insight into what it is I am not
>> understanding about the filesystem structures that causes this behavior.
>
> I hate to put a downer on your current work, but I think that you
> are adding something overly complex that only has a very limited
> usefulness, and your time could be better spent elsewhere.

Not at all! I appreciate the tip. And it hasn't been wasted time, I've
learned quite a bit, and as I said above, perhaps the debugfs copies and
such can be pushed into libext2fs and used in both. ext2fs_mkdir()
exists after all, why not ext2fs_mksymlink(), ext2fs_mknod() and
ext2fs_writefile() ?

Thanks a lot for the insight, exactly what I needed!

--
Darren

>
> [*] add debugfs "cp" command as an alias to "write":
>
> diff --git a/debugfs/debug_cmds.ct b/debugfs/debug_cmds.ct
> index a799dd7..3789dcd 100644
> --- a/debugfs/debug_cmds.ct
> +++ b/debugfs/debug_cmds.ct
> @@ -119,7 +119,7 @@ request do_undel, "Undelete file",
>         undelete, undel;
>
>  request do_write, "Copy a file from your native filesystem",
> -       write;
> +       write, cp;
>
>  request do_dump, "Dump an inode out to a file",
>         dump_inode, dump;
Yongqiang Yang Dec. 4, 2012, 10:45 a.m. UTC | #2
Hi,

If the original images are in ext4 format, this can be done by writing the
image to a new device and then resizing the filesystem with resize2fs.

Thanks,
Yongqiang

On Tue, Dec 4, 2012 at 3:46 AM, Darren Hart <dvhart@infradead.org> wrote:
> On 12/01/2012 11:31 AM, Andreas Dilger wrote:
>> On 2012-11-30, at 10:08 PM, Darren Hart wrote:
>>> On 11/30/2012 08:23 PM, Andreas Dilger wrote:
>>>> On 2012-11-30, at 7:13 PM, Darren Hart wrote:
>>>>> I am working on creating some files after creating a filesystem in
>>>>> mke2fs. This is part of a larger project to add initial directory
>>>>> support to mke2fs.
>>>>
>>>> Maybe some background on what you are trying to do would help us to
>>>> understand the problem?
>>>
>>> Sure, a few are already aware, but I suppose some extra detail for
>>> the first post to this list is in order.
>>>
>>> I work on the Yocto Project, and this particular effort is part of
>>> improving our deployment tooling. Specifically, the part of the build
>>> process that creates the root filesystem.
>>>
>>> Most all filesystems have some mechanism to create prepopulated
>>> images without the need for root permissions. Many do this through
>>> a -r parameter to their corresponding mkfs.* tool. The exceptions to
>>> this are ext3 and ext4. Our current tooling relies on genext2fs and
>>> flipping some bits to "convert" the ext2 filesystem to ext3 and 4.
>>> Not ideal.
>>>
>>> After exploring options like libguestfs and finding them to be
>>> considerably heavy weight for what we are trying to accomplish, I
>>> discussed the possibility of adding an argument to mke2fs which would
>>> populate a newly formatted filesystem from a specified directory. Ted
>>> suggested a clean set of patches implementing this were likely to be
>>> accepted.
>>
>> Hmm, I wonder if libext2fs can itself create extent-mapped files,
>> or if these files will be block-mapped?  If they are small (< 1MB),
>> it is probably not a huge problem, but if your files are large it
>> may be that libext2fs also creates "ext2" files internally?
>>
>> Maybe Ted can confirm whether that is true or not.  At least I recall
>> that the block allocator inside libext2fs was horrible, and creating
>> large files was problematic.
>
>
> Ted, can you confirm?
>
>
>> I guess the other question is why you don't use debugfs to create
>> the directory tree and copy the files into your new filesystem?
>> It already has "mkdir", "mknod" and "write" commands for use, and
>> it is a one-line patch to alias "write" to "cp" for easier use[*].
>
>
> I just didn't know about it and it didn't come up in my polling :-)
> (which would have been more fruitful had I done some of that here).
>
>
>> Then, it just needs a debugfs script to build your directory tree
>> and copy files over.  Possibly enhancing "cp" to call do_mknod() for
>> pipe/block/char devices would make this easier to use.
>>
>> Something like the following, though it seems there isn't an "ln -s"
>> or "symlink" command for debugfs yet, that would need to be written.
>>
>> #!/bin/bash
>> SRCDIR=$1
>> DEVICE=$2
>>
>> {
>>       find $SRCDIR | while read FILE; do
>>               TGT=${FILE#$SRCDIR}
>>               case $(stat -c "%F" $FILE) in
>>               "directory")
>>                       echo "mkdir $TGT"
>>                       ;;
>>               "regular file")
>>                       echo "write $FILE $TGT"
>>                       ;;
>>               "symbolic link")
>>                       LINK_TGT=$(ls -l $FILE | sed -e 's/.*-> //')
>>                       echo "symlink $TGT $LINK_TGT"
>>                       ;;
>>               "block special file")
>>                       DEVNO=$(stat -c "%t %T" $FILE)
>>                       echo "mknod $F $DEVNO $TGT
>>                       ;;
>>               "character special file")
>>                       DEVNO=$(stat -c "%t %T" $FILE)
>>                       echo "mknod $TYPE $DEVNO $TGT
>>                       ;;
>>               *)
>>                       echo "Unknown file $FILE" 1>&2
>>                       ;;
>>               done
>>       done
>> } | debugfs -w -f /dev/stdin $device
>
>
> This is really promising. I've tweaked it a bit to use the basename and
> cd into the directories as they are traversed by find so it doesn't try
> and create filenames like "/dir1/hello.txt" in the root directory.
>
>         #!/bin/sh
>         SRCDIR=$1
>         DEVICE=$2
>
>         {
>                 find $SRCDIR | while read FILE; do
>                         #TGT=${FILE#$SRCDIR}
>                         TGT=$(basename ${FILE#$SRCDIR})
>
>                         # Skip the root dir
>                         if [ -z "$TGT" ]; then
>                                 continue
>                         fi
>
>                         case $(stat -c "%F" $FILE) in
>                         "directory")
>                                 echo "mkdir $TGT"
>                                 echo "cd $TGT"
>                                 ;;
>                         "regular file")
>                                 echo "write $FILE $TGT"
>                                 ;;
>                         "symbolic link")
>                                 LINK_TGT=$(ls -l $FILE | sed -e 's/.*-> //')
>                                 echo "symlink $TGT $LINK_TGT"
>                                 ;;
>                         "block special file")
>                                 DEVNO=$(stat -c "%t %T" $FILE)
>                                 echo "mknod $TGT b $DEVNO"
>                                 ;;
>                         "character special file")
>                                 DEVNO=$(stat -c "%t %T" $FILE)
>                                 echo "mknod $TGT c $DEVNO"
>                                 ;;
>                         *)
>                                 echo "Unknown file $FILE" 1>&2
>                                 ;;
>                         esac
>                 done
>         } | debugfs -w -f /dev/stdin $DEVICE
>
>
>> I would guess that implementing "symlink" support in debugfs will
>> be orders of magnitude less work, maintenance, and bugs than your
>> current patch.
>
>
> It needs symlink as you said, but I can relatively easily migrate my
> code for that in mke2fs to debugfs.
>
> Still needs permissions and such. Is that done with "modify_inode" ? If
> so, how do I specify the new contents?
>
> I need to look into how to detect and support hard links.
>
>
>> This might be turned inside-out and just run a "find $SRCDIR" and
>> have the inner loop check the file type and call the appropriate
>> operation for it (mkdir, write/cp, mknod, symlink).  Note that
>> "find" will return the directories first, so this should be OK to
>> just consume the lines as they are output by find.
>
>
> Yes, this seems to work just fine.
>
>
>>> I don't have much filesystem experience - most of my experience is
>>> with core kernel mechanisms, ipc, locking, etc. - so I'm mostly
>>> hacking my way to some basic functionality before refactoring. The
>>> libext2fs library documentation gave me a good start, but I
>>> occasionally trip over things like the problem described below as
>>> there is no documentation for what I'm trying to do specifically
>>> (of course) and many of the required functions are only minimally
>>> documented, and sometimes only listed in the index.
>>
>> Definitely, if the documentation is lacking and you've spent cycles
>> figuring something out, then a patch to improve the documentation is
>> most welcome.
>
>
> I plan to update this as I go... although I'm going to have much less to
> do if I use the debugfs approach. ;-)
>
> I wonder if it would make sense to integrate the debugfs functionality
> into libext2fs and enable both debugfs and mke2fs to use the same common
> code. I think the "-r initialdir" option would still be nice to have for
> mke2fs, and does make it more consistent with other FSs in this feature.
>
>
>>
>>> The specific instance below is the result of me trying to format and
>>> populate a filesystem image (in a file) from a root directory that looks like this:
>>>
>>> $ tree rootdir/
>>> rootdir/
>>> |-- dir1
>>> |   |-- hello.lnk -> /hello.txt
>>> |   `-- world.txt
>>> |-- hello.lnk -> /hello.txt
>>> |-- hello.txt
>>> |-- sda
>>> `-- ttyS0
>>>
>>> $ cat rootdir/hello.txt
>>> hello
>>>
>>> In mke2fs.c I setup the new getopt argument and call nftw() with a
>>> callback called init_dir_cb() which checks the file type and takes
>>> the appropriate action to duplicate each entry. The exact code is at:
>>
>> To be honest, ntfw() will drag a bunch of bloat into e2fsprogs that
>> doesn't exist today, and isn't really portable.
>
>
> OK, well it could also be done with ftw to be more portable, but I guess
> it's still marked obsolete in POSIX.1-2008 :/
>
> Similar functionality could be implemented relatively easily.
>
>
>>
>>> http://git.infradead.org/users/dvhart/e2fsprogs/blob/refs/heads/initialdir:/misc/mke2fs.c#l2319
>>>
>>> As described below, when I update the inode.i_size after the initial
>>> write and copying of the file content, the above cat command fails to
>>> output anything when run on the loop mounted filesystem. If I just
>>> hack in the i_size prior to writing the inode for the first time and
>>> don't update it after copying the file content, then the cat command
>>> succeeds as above on the loop mounted image.
>>
>> It probably makes sense to understand what is broken here, whether
>> it is the library or the program.  We definitely want to make sure
>> the API is usable and working correctly in any case.
>
>
> I should be able to compare with debugfs "write" and see what the
> difference is.
>
>
>>
>>> The commented out inode write is noted here:
>>>
>>> http://git.infradead.org/users/dvhart/e2fsprogs/blob/refs/heads/initialdir:/misc/mke2fs.c#l2462
>>>
>>> Does that help clarify the situation?
>>>
>>> What I'm looking for is some insight into what it is I am not
>>> understanding about the filesystem structures that causes this behavior.
>>
>> I hate to put a downer on your current work, but I think that you
>> are adding something overly complex that only has a very limited
>> usefulness, and your time could be better spent elsewhere.
>
> Not at all! I appreciate the tip. And it hasn't been wasted time, I've
> learned quite a bit, and as I said above, perhaps the debugfs copies and
> such can be pushed into libext2fs and used in both. ext2fs_mkdir()
> exists after all, why not ext2fs_mksymlink(), ext2fs_mknod() and
> ext2fs_writefile() ?
>
> Thanks a lot for the insight, exactly what I needed!
>
> --
> Darren
>
>>
>> [*] add debugfs "cp" command as an alias to "write":
>>
>> diff --git a/debugfs/debug_cmds.ct b/debugfs/debug_cmds.ct
>> index a799dd7..3789dcd 100644
>> --- a/debugfs/debug_cmds.ct
>> +++ b/debugfs/debug_cmds.ct
>> @@ -119,7 +119,7 @@ request do_undel, "Undelete file",
>>         undelete, undel;
>>
>>  request do_write, "Copy a file from your native filesystem",
>> -       write;
>> +       write, cp;
>>
>>  request do_dump, "Dump an inode out to a file",
>>         dump_inode, dump;
>>
>>> Thanks,
>>>
>>> Darren
>>>
>>>>
>>>> Cheers, Andreas
>>>>
>>>>> To make it easy for people to see what I'm working
>>>>> on, I've pushed my dev tree here:
>>>>>
>>>>> http://git.infradead.org/users/dvhart/e2fsprogs/shortlog/refs/heads/initialdir
>>>>>
>>>>> Note: the code is still just in the prototyping state. It is inelegant
>>>>> to say the least. The git tree will most definitely rebase. I'm trying
>>>>> to get it functional; once that is understood, I will refactor
>>>>> appropriately.
>>>>>
>>>>> I can create a simple directory structure and link in files and fast
>>>>> symlinks. I'm currently working on copying content from files in the
>>>>> initial directory. The process I'm using is as follows:
>>>>>
>>>>>
>>>>> ext2fs_new_inode(&ino)
>>>>> ext2fs_link()
>>>>>
>>>>> ext2fs_read_inode(ino, &inode)
>>>>> /* some initial inode setup */
>>>>> ext2fs_write_new_inode(ino, &inode)
>>>>>
>>>>> ext2fs_file_open2(&inode)
>>>>> ext2fs_write_file()
>>>>> ext2fs_file_close()
>>>>>
>>>>> inode.i_size = bytes_written
>>>>> ext2fs_write_inode()
>>>>>
>>>>> ext2fs_inode_alloc_stats2(ino)
>>>>>
>>>>>
>>>>> When I mount the image, the size for the file is correct, but catting it
>>>>> returns nothing. If I instead hack in the known size during the initial
>>>>> inode setup and drop the last ext2fs_write_inode() call, then the size
>>>>> is right and catting the file works as expected.
>>>>>
>>>>> Is it incorrect to write the inode more than once? If not, am I doing
>>>>> something that is somehow decoupling the block where the data was
>>>>> written from the inode associated with the file?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> --
>>>>> Darren Hart
>>>>> Intel Open Source Technology Center
>>>>> Yocto Project - Technical Lead - Linux Kernel
>>>>> --
>>>>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>>>>> the body of a message to majordomo@vger.kernel.org
>>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> Cheers, Andreas



--
Best Wishes
Yongqiang Yang
Andreas Dilger Dec. 4, 2012, 10:59 a.m. UTC | #3
On 2012-12-03, at 12:46, Darren Hart <dvhart@infradead.org> wrote:
> 
> It needs symlink as you said, but I can relatively easily migrate my
> code for that in mke2fs to debugfs.
> 
> Still needs permissions and such. Is that done with "modify_inode" ? If
> so, how do I specify the new contents?

"modify_inode" is not a terribly easy-to-use interface. Probably better to add something like "chmod" and "chown" for debugfs as well.

> I need to look into how to detect and support hard links.

I was wondering about that, and hoped you wouldn't need them.  Maybe just keep a list of any files with nlink > 1 as { inode, pathname } as you go. Any inode with nlink > 1 is looked up in that list first, and if it is already there, the new name is hard linked to the original inode.
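
A rough sketch of that bookkeeping in shell, for illustration only. It assumes GNU find (for -links and -printf), a source tree on a single filesystem (inode numbers are only unique per device), and path names without whitespace. emit_hardlinks is a made-up helper name. It relies on the existing debugfs "ln" (link) command, which, per the debugfs man page, does not adjust the inode reference count, so that would still need handling.

```shell
#!/bin/sh
# emit_hardlinks SRCDIR (hypothetical helper)
# Print one debugfs "ln" command for every additional name of a
# multiply-linked regular file. The first name seen (in sorted path
# order) is treated as the original and is expected to be copied by
# the normal "write" pass.
emit_hardlinks() {
    find "$1" -type f -links +1 -printf '%i /%P\n' | LC_ALL=C sort -k2 |
    awk '{
        ino = $1
        # Everything after the inode number and one space is the path
        path = substr($0, length($1) + 2)
        if (ino in first)
            print "ln " first[ino] " " path   # later name: hard link
        else
            first[ino] = path                 # first name: remember it
    }'
}
```

The output would be appended to the command stream after all the "write" commands, so that every link target already exists in the image.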

Cheers, Andreas

Theodore Ts'o Dec. 4, 2012, 3:22 p.m. UTC | #4
On Mon, Dec 03, 2012 at 11:46:07AM -0800, Darren Hart wrote:
> > Maybe Ted can confirm whether that is true or not.  At least I recall
> > that the block allocator inside libext2fs was horrible, and creating
> > large files was problematic.
> 
> Ted, can you confirm?

The block allocator inside libext2fs is primitive; it will find the
first free block and use it.  It should be OK for populating file
system images stored on flash devices (where seeks don't matter so
block group placement isn't a big deal), and especially for fixed root
file system images which are mounted read-only and which tend to be
updated only once in a while (e.g., in the case of Android system
updates), and so you don't really care about aligning file writes to
eMMC erase blocks.

It could certainly be made better, and for people who were trying to
use libext2fs with FUSE targeting hard drives, there are ample
opportunities for improvements.....

Creating large files shouldn't be a problem (unless what you mean is
ext4 huge files, à la the huge file feature, where the number of
512-byte blocks exceeds 2**32, in which case you should probably test
that case if you care about it), and it certainly will create
extents-based files.
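
To put a number on that threshold: 2**32 blocks of 512 bytes each is 2 TiB, so only files larger than that fall into the huge file case. A quick check of the arithmetic, assuming a shell with 64-bit arithmetic (bash, or dash on a 64-bit system):

```shell
#!/bin/sh
# Block count limit mentioned above, counted in 512-byte units
sectors=$((1 << 32))
bytes=$((sectors * 512))
# prints: 2199023255552 bytes = 2 TiB
echo "$bytes bytes = $((bytes >> 40)) TiB"
```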

Regards, 

						- Ted
Darren Hart Dec. 4, 2012, 5:42 p.m. UTC | #5
On 12/04/2012 02:45 AM, Yongqiang Yang wrote:
> Hi,
> 
> If original images are ext4 format, this can be done by writing the
> image to a new device and resizing the new device via resizefs.

I don't follow... what can be done using resizefs?

--
Darren

> 
> Yongqiang,
> Thanks,
> 
> On Tue, Dec 4, 2012 at 3:46 AM, Darren Hart <dvhart@infradead.org> wrote:
>> On 12/01/2012 11:31 AM, Andreas Dilger wrote:
>>> On 2012-11-30, at 10:08 PM, Darren Hart wrote:
>>>> On 11/30/2012 08:23 PM, Andreas Dilger wrote:
>>>>> On 2012-11-30, at 7:13 PM, Darren Hart wrote:
>>>>>> I am working on creating some files after creating a filesystem in
>>>>>> mke2fs. This is part of a larger project to add initial directory
>>>>>> support to mke2fs.
>>>>>
>>>>> Maybe some background on what you are trying to do would help us to
>>>>> understand the problem?
>>>>
>>>> Sure, a few are already aware, but I suppose some extra detail for
>>>> the first post to this list is in order.
>>>>
>>>> I work on the Yocto Project, and this particular effort is part of
>>>> improving our deployment tooling. Specifically, the part of the build
>>>> process that creates the root filesystem.
>>>>
>>>> Most all filesystems have some mechanism to create prepopulated
>>>> images without the need for root permissions. Many do this through
>>>> a -r parameter to their corresponding mkfs.* tool. The exceptions to
>>>> this are ext3 and ext4. Our current tooling relies on genext2fs and
>>>> flipping some bits to "convert" the ext2 filesystem to ext3 and 4.
>>>> Not ideal.
>>>>
>>>> After exploring options like libguestfs and finding them to be
>>>> considerably heavy weight for what we are trying to accomplish, I
>>>> discussed the possibility of adding an argument to mke2fs which would
>>>> populate a newly formatted filesystem from a specified directory. Ted
>>>> suggested a clean set of patches implementing this were likely to be
>>>> accepted.
>>>
>>> Hmm, I wonder if libext2fs can itself create extent-mapped files,
>>> or if these files will be block-mapped?  If they are small (< 1MB),
>>> it is probably not a huge problem, but if your files are large it
>>> may be that libext2fs also creates "ext2" files internally?
>>>
>>> Maybe Ted can confirm whether that is true or not.  At least I recall
>>> that the block allocator inside libext2fs was horrible, and creating
>>> large files was problematic.
>>
>>
>> Ted, can you confirm?
>>
>>
>>> I guess the other question is why you don't use debugfs to create
>>> the directory tree and copy the files into your new filesystem?
>>> It already has "mkdir", "mknod" and "write" commands for use, and
>>> it is a one-line patch to alias "write" to "cp" for easier use[*].
>>
>>
>> I just didn't know about it and it didn't come up in my polling :-)
>> (which would have been more fruitful had I done some of that here).
>>
>>
>>> Then, it just needs a debugfs script to build your directory tree
>>> and copy files over.  Possibly enhancing "cp" to call do_mknod() for
>>> pipe/block/char devices would make this easier to use.
>>>
>>> Something like the following, though it seems there isn't an "ln -s"
>>> or "symlink" command for debugfs yet, that would need to be written.
>>>
>>> #!/bin/bash
>>> SRCDIR=$1
>>> DEVICE=$2
>>>
>>> {
>>>       find $SRCDIR | while read FILE; do
>>>               TGT=${FILE#$SRCDIR}
>>>               case $(stat -c "%F" $FILE) in
>>>               "directory")
>>>                       echo "mkdir $TGT"
>>>                       ;;
>>>               "regular file")
>>>                       echo "write $FILE $TGT"
>>>                       ;;
>>>               "symbolic link")
>>>                       LINK_TGT=$(ls -l $FILE | sed -e 's/.*-> //')
>>>                       echo "symlink $TGT $LINK_TGT"
>>>                       ;;
>>>               "block special file")
>>>                       DEVNO=$(stat -c "%t %T" $FILE)
>>>                       echo "mknod $TGT b $DEVNO"
>>>                       ;;
>>>               "character special file")
>>>                       DEVNO=$(stat -c "%t %T" $FILE)
>>>                       echo "mknod $TGT c $DEVNO"
>>>                       ;;
>>>               *)
>>>                       echo "Unknown file $FILE" 1>&2
>>>                       ;;
>>>               esac
>>>       done
>>> } | debugfs -w -f /dev/stdin $DEVICE
>>
>>
>> This is really promising. I've tweaked it a bit to use the basename and
>> cd into the directories as they are traversed by find so it doesn't try
>> and create filenames like "/dir1/hello.txt" in the root directory.
>>
>>         #!/bin/sh
>>         # Usage: <script> SRCDIR DEVICE (SRCDIR without a trailing slash)
>>         SRCDIR=$1
>>         DEVICE=$2
>>
>>         {
>>                 find "$SRCDIR" | while read -r FILE; do
>>                         # Path relative to SRCDIR, with a leading slash
>>                         REL=${FILE#"$SRCDIR"}
>>
>>                         # Skip the root dir
>>                         if [ -z "$REL" ]; then
>>                                 continue
>>                         fi
>>
>>                         # Create each entry by basename inside its parent
>>                         # directory. An absolute cd per entry avoids staying
>>                         # in a subdirectory once find has moved past it.
>>                         TGT=$(basename "$REL")
>>                         echo "cd $(dirname "$REL")"
>>
>>                         case $(stat -c "%F" "$FILE") in
>>                         "directory")
>>                                 echo "mkdir $TGT"
>>                                 ;;
>>                         "regular file"|"regular empty file")
>>                                 echo "write $FILE $TGT"
>>                                 ;;
>>                         "symbolic link")
>>                                 LINK_TGT=$(readlink "$FILE")
>>                                 echo "symlink $TGT $LINK_TGT"
>>                                 ;;
>>                         "block special file")
>>                                 # Note: %t and %T print major/minor in hex
>>                                 DEVNO=$(stat -c "%t %T" "$FILE")
>>                                 echo "mknod $TGT b $DEVNO"
>>                                 ;;
>>                         "character special file")
>>                                 DEVNO=$(stat -c "%t %T" "$FILE")
>>                                 echo "mknod $TGT c $DEVNO"
>>                                 ;;
>>                         *)
>>                                 echo "Unknown file $FILE" 1>&2
>>                                 ;;
>>                         esac
>>                 done
>>         } | debugfs -w -f /dev/stdin "$DEVICE"
>>
>>
>>> I would guess that implementing "symlink" support in debugfs will
>>> be orders of magnitude less work, maintenance, and bugs than your
>>> current patch.
>>
>>
>> It needs symlink as you said, but I can relatively easily migrate my
>> code for that in mke2fs to debugfs.
>>
>> Still needs permissions and such. Is that done with "modify_inode" ? If
>> so, how do I specify the new contents?
>>
>> I need to look into how to detect and support hard links.
>>
>>
>>> This might be turned inside-out and just run a "find $SRCDIR" and
>>> have the inner loop check the file type and call the appropriate
>>> operation for it (mkdir, write/cp, mknod, symlink).  Note that
>>> "find" will return the directories first, so this should be OK to
>>> just consume the lines as they are output by find.
>>
>>
>> Yes, this seems to work just fine.
>>
>>
>>>> I don't have much filesystem experience - most of my experience is
>>>> with core kernel mechanisms, ipc, locking, etc. - so I'm mostly
>>>> hacking my way to some basic functionality before refactoring. The
>>>> libext2fs library documentation gave me a good start, but I
>>>> occasionally trip over things like the problem described below as
>>>> there is no documentation for what I'm trying to do specifically
>>>> (of course) and many of the required functions are only minimally
>>>> documented, and sometimes only listed in the index.
>>>
>>> Definitely, if the documentation is lacking and you've spent cycles
>>> figuring something out, then a patch to improve the documentation is
>>> most welcome.
>>
>>
>> I plan to update this as I go... although I'm going to have much less to
>> do if I use the debugfs approach. ;-)
>>
>> I wonder if it would make sense to integrate the debugfs functionality
>> into libext2fs and enable both debugfs and mke2fs to use the same common
>> code. I think the "-r initialdir" option would still be nice to have for
>> mke2fs, and does make it more consistent with other FSs in this feature.
>>
>>
>>>
>>>> The specific instance below is the result of me trying to format and
>>>> populate a filesystem image (in a file) from a root directory that looks like this:
>>>>
>>>> $ tree rootdir/
>>>> rootdir/
>>>> |-- dir1
>>>> |   |-- hello.lnk -> /hello.txt
>>>> |   `-- world.txt
>>>> |-- hello.lnk -> /hello.txt
>>>> |-- hello.txt
>>>> |-- sda
>>>> `-- ttyS0
>>>>
>>>> $ cat rootdir/hello.txt
>>>> hello
>>>>
>>>> In mke2fs.c I setup the new getopt argument and call nftw() with a
>>>> callback called init_dir_cb() which checks the file type and takes
>>>> the appropriate action to duplicate each entry. The exact code is at:
>>>
>>> To be honest, nftw() will drag a bunch of bloat into e2fsprogs that
>>> doesn't exist today, and isn't really portable.
>>
>>
Darren Hart Dec. 4, 2012, 5:43 p.m. UTC | #6
On 12/04/2012 02:59 AM, Andreas Dilger wrote:
> On 2012-12-03, at 12:46, Darren Hart <dvhart@infradead.org> wrote:
>>
>> It needs symlink as you said, but I can relatively easily migrate my
>> code for that in mke2fs to debugfs.
>>
>> Still needs permissions and such. Is that done with "modify_inode" ? If
>> so, how do I specify the new contents?
> 
> "modify_inode" is not a terribly easy-to-use interface. Probably better to add something like "chmod" and "chown" for debugfs as well.

I was thinking the same thing.

> 
>> I need to look into how to detect and support hard links.
> 
> I was wondering about that, and hoped you wouldn't need them.  Maybe just keep a list of any files with nlink > 1 as { inode, pathname } as you go. Any inode with nlink > 1 is looked up in that list first, and if it is already there, the new name is hard linked to the original inode.
>
> Cheers, Andreas

Right, my thoughts as well. Thanks for the confirmation!

I don't know that I need them, but I imagine a complete solution will be
more acceptable than one that fits only our needs. So while we're in
there...

--
Darren
Darren Hart Dec. 4, 2012, 5:46 p.m. UTC | #7
On 12/04/2012 07:22 AM, Theodore Ts'o wrote:
> On Mon, Dec 03, 2012 at 11:46:07AM -0800, Darren Hart wrote:
>>> Maybe Ted can confirm whether that is true or not.  At least I recall
>>> that the block allocator inside libext2fs was horrible, and creating
>>> large files was problematic.
>>
>> Ted, can you confirm?
> 
> The block allocator inside libext2fs is primitive; it will find the
> first free block and use it.  It should be OK for populating file
> system images stored on flash devices (where seeks don't matter so
> block group placement isn't a big deal), and especially for fixed root
> file system images which are mounted read-only and which tend to be
> updated only once in a while (e.g., in the case of Android system
> updates), and so you don't really care about aligning file writes to
> eMMC erase blocks.
> 
> It could certainly be made better, and for people who were trying to
> use libext2fs with FUSE targeting hard drives, there are ample
> opportunities for improvements.....
> 


I think what I'm reading here is that if you care about having a
filesystem that makes hardware specific optimizations, you're better off
mounting the device and copying the filesystem over. In that case, plan
on needing root access.


> Creating large files shouldn't be a problem (unless what you mean is
> ext4 huge files, à la the huge file feature, where the number of
> 512-byte blocks exceeds 2**32, in which case you should probably test
> that case if you care about it), and it certainly will create
> extents-based files.

Great, sounds like this approach is still viable. Thanks Ted!

--
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel
Theodore Ts'o Dec. 4, 2012, 7:08 p.m. UTC | #8
On Tue, Dec 04, 2012 at 09:43:21AM -0800, Darren Hart wrote:
> > "modify_inode" is not a terribly easy-to-use interface. Probably better to add something like "chmod" and "chown" for debugfs as well.
> 
> I was thinking the same thing.
> 

modify_inode is the old command, and it is indeed hard to use.  The
new one (and the one I use all the time) is set_inode_field
(abbreviation "sif"):

    sif /bin/su mode 0104755
    sif /bin/su uid 0

BTW, one of the things that I've always toyed with was to create a
shim layer between the libss API and the tcl embedding API, which
would add some scripting, aliases, and automation relatively easily to
debugfs.

What people who have cared about this have done in practice is to
create perl scripts which emit a series of commands, which they then
feed into a pipe which has debugfs on the other end.

						- Ted
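
The "script emits commands, debugfs consumes them" pattern described in this message might look like the following Python sketch. It assumes a `debugfs` binary on `PATH` and an image file you can write to; the helper names `sif_commands` and `run_debugfs` are hypothetical, and the only debugfs command used is `sif` with the `mode` and `uid` fields from the example above, plus `gid`.

```python
import subprocess


def sif_commands(path, mode=None, uid=None, gid=None):
    """Build set_inode_field ("sif") command lines for one file in the image.

    `mode` is the full i_mode value including the file-type bits, e.g.
    0o104755 for a setuid regular file, and is printed in octal with a
    leading zero, matching the example in the message.
    """
    cmds = []
    if mode is not None:
        cmds.append("sif %s mode 0%o" % (path, mode))
    if uid is not None:
        cmds.append("sif %s uid %d" % (path, uid))
    if gid is not None:
        cmds.append("sif %s gid %d" % (path, gid))
    return cmds


def run_debugfs(image, commands):
    """Feed the commands to `debugfs -w` on stdin (assumes debugfs is installed)."""
    script = "\n".join(commands) + "\n"
    return subprocess.run(
        ["debugfs", "-w", image],
        input=script,
        text=True,
        capture_output=True,
        check=False,
    )


if __name__ == "__main__":
    # Print the two commands from the message instead of running debugfs.
    for line in sif_commands("/bin/su", mode=0o104755, uid=0):
        print(line)
```

Paths containing spaces would need quoting before being passed to debugfs; the sketch does not handle that.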
Theodore Ts'o Dec. 4, 2012, 7:24 p.m. UTC | #9
On Tue, Dec 04, 2012 at 09:46:06AM -0800, Darren Hart wrote:
> 
> I think what I'm reading here is that if you care about having a
> filesystem that makes hardware specific optimizations, you're better off
> mounting the device and copying the filesystem over. In that case, plan
> on needing root access.

Well, ext4 currently doesn't optimize for erase block alignment
either.

If I had the free time, and it was something that I could work on on
$DAYJOB time, here are some projects that I've been thinking about:

1) Add support for erase block alignment using the same mechanism
we've been planning for RAID 5 stripe alignment.

2) Add either a superblock flag or a mount option which adds an eMMC
block allocation algorithm which would add support for more aggressive
optimizations.

3) Allow a zero length file to have its extent flag turned off (so it
would be using the old indirect block scheme).

4) If a file has the extent flag turned off, and the eMMC block
allocation algorithm is enabled, and the workload appears to be doing
random overwrites, implement data block copy-on-write.  (That is,
allocate a new block and then update the indirect block to point to
the new block.)

5) If the eMMC block allocation algorithm is enabled, teach the block
allocator to aggressively allocate contiguous physical blocks
(initially aligned on an erase block) regardless of what the logical
block number is, since with flash seeks are essentially free, and
with indirect blocks we don't care about extent fragmentation.
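
Items 4 and 5 can be pictured with a toy model: a per-file block map standing in for the indirect blocks, and an allocator that hands out physically contiguous blocks starting on an erase-block boundary. Everything here (the class names, a 4 KiB block size, a 4 MiB erase block) is an assumption for illustration; none of it is ext4 code.

```python
BLOCK_SIZE = 4096                      # assumed file system block size
ERASE_BLOCK_SIZE = 4 * 1024 * 1024     # assumed eMMC erase block size
BLOCKS_PER_ERASE_BLOCK = ERASE_BLOCK_SIZE // BLOCK_SIZE


def align_up(block, alignment=BLOCKS_PER_ERASE_BLOCK):
    """Round a block number up to the next multiple of `alignment`."""
    return (block + alignment - 1) // alignment * alignment


class SequentialAllocator:
    """Hands out consecutive physical blocks, starting erase-block aligned."""

    def __init__(self, first_free):
        self.next_block = align_up(first_free)

    def alloc(self):
        blk = self.next_block
        self.next_block += 1
        return blk


class ToyFile:
    """Logical-to-physical map, playing the role of the indirect blocks."""

    def __init__(self, allocator):
        self.allocator = allocator
        self.block_map = {}

    def write_block(self, logical):
        """Copy-on-write: take a fresh physical block and repoint the map.

        Returns the previously mapped physical block (now free), or None.
        """
        old = self.block_map.get(logical)
        self.block_map[logical] = self.allocator.alloc()
        return old
```

Random overwrites of logical blocks 7, 3, 7 land on three consecutive physical blocks, which is the point of item 5: the physical layout stays contiguous regardless of the logical order.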

The last two are a little bit complicated, but I'm certain we could
implement and stabilize it faster than f2fs can be stabilized.  (See
previous discussions regarding how confident btrfs people were that
they could stabilize it more quickly than all previous experience with
gpfs, jfs, advfs, zfs, etc., because, well, Open Source Is Different.)
If anyone at Linaro is interested in trying their hand at some kernel
file system work, they should contact me.  :-)

						- Ted

P.S.  I still think part of the right answer is to investigate replacing
sqlite with something like OpenLDAP's mdb --- which has a drop-in
replacement sqlite API shim layer BTW --- and which beats the pants
off of sqlite's performance without requiring kernel-level changes,
but given that people seem wedded to sqlite....
Darren Hart Dec. 4, 2012, 7:40 p.m. UTC | #10
On 12/04/2012 11:08 AM, Theodore Ts'o wrote:
> On Tue, Dec 04, 2012 at 09:43:21AM -0800, Darren Hart wrote:
>>> "modify_inode" is not a terribly easy-to-use interface. Probably better to add something like "chmod" and "chown" for debugfs as well.
>>
>> I was thinking the same thing.
>>
>
> modify_inode is the old command, and it is indeed hard to use.  The
> new one (and the one I use all the time) is set_inode_field
> (abbreviation "sif"):
>
>     sif /bin/su mode 0104755
>     sif /bin/su uid 0


Ah, excellent.


> BTW, one of the things that I've always toyed with was to create a
> shim layer between the libss API and the tcl embedding API, which
> would add some scripting, aliases, and automation relatively easily to
> debugfs.
>
> What people who have cared about this have done in practice is to
> create perl scripts which emit a series of commands, which they then
> feed into a pipe which has debugfs on the other end.

And this is what Andreas suggested with the example bash script. I may
convert this to a python script, but the concept is sound. That said, I
still feel that a logical approach would be the following:

1) Update debugfs to completely support our use-case
   [ ] Add symlink support
   [ ] Add hardlink support
   [ ] Add meta-data mirroring
   (some of this belongs in the script and not in e2fsprogs)
2) Refactor filesystem creation functions from debugfs and move them into
   libext2fs alongside existing filesystem creation functions like
   ext2fs_mkdir()
3) Add the "-r initialdir" option to mke2fs that leverages these new
   functions from libext2fs.

#3 makes for a more consistent filesystem creation process across
filesystems.

Does this seem like a reasonable approach? Any objection to the
migration of code into libext2fs?
Theodore Ts'o Dec. 4, 2012, 8 p.m. UTC | #11
On Tue, Dec 04, 2012 at 11:40:47AM -0800, Darren Hart wrote:
> 
> 1) Update debugfs to completely support our use-case
>    [ ] Add symlink support
>    [ ] Add hardlink support

This exists already.  See the "link" command.  It doesn't update the
inode i_links_count, but that's probably something I would add by
adding an option (-i) which did this.

>    [ ] Add meta-data mirroring

Not sure what you mean by meta-data mirroring?

> 2) Refactor filesystem creation functions from debugfs and move them into
>    libext2fs alongside existing filesystem creation functions like
>    ext2fs_mkdir()

I'm not sure that creating new libext2fs functions is necessarily
needed.  We'll need to discuss that on a case-by-case basis.  For
creating a new symlink, sure, I'll buy that --- there's enough
complexity involved that it makes sense to move that into the library.

For creating a new hard link, we already have ext2fs_link().  Whether
or not it makes sense to add an entirely new function just to bump
i_links_count seems very dubious to me.  We could add a flag which
reads in the target inode, bumps i_links_count, and then writes it
back out, but then we start asking whether or not we need to add an
option to set the ctime field or not, etc., etc., etc.

Taken to extremes that way lies the insanity of VMS's create_process
system call, with its huge number of parameters and options, as
opposed to separating fork() and exec(), and allowing the application
program to close file descriptors, dup them, edit environment
variables, etc., between the fork() and exec(), which was a much saner
way of doing things in Unix.

So it's actually quite deliberate that what we have are very low-level
primitives in libext2fs.  That being said, API design is a bit of an
art; as I said, for something as complicated as symlink creation,
where you need to do different things depending on whether the symlink
pathname can fit in the inode, etc., it does make sense to have a new
function.
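
The branching alluded to here can be sketched as follows. The sketch assumes the classic ext2 inode layout, where the 15 four-byte `i_block` slots give 60 bytes of in-inode space, and treats a target shorter than that as a "fast" symlink. The function name, return values, and default block size are hypothetical, and a real library function would also have to allocate the inode and add the directory entry.

```python
I_BLOCK_BYTES = 15 * 4  # 15 block-pointer slots of 4 bytes each


def symlink_storage(target, block_size=1024):
    """Decide where a symlink target would be stored in this simplified model."""
    length = len(target.encode("utf-8"))
    if length == 0:
        raise ValueError("empty symlink target")
    if length < I_BLOCK_BYTES:
        return "inode"       # fast symlink: target lives in the i_block area
    if length < block_size:
        return "data block"  # slow symlink: one data block holds the target
    raise ValueError("target too long for one block")
```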

> 3) Add the "-r initialdir" option to mke2fs that leverages these new
>    functions from libext2fs.

Assuming the code for -r can be well constrained in terms of cleanly
added to the mke2fs sources (probably the bulk of the code should go
in a new .c file), and assuming that it doesn't bloat the mke2fs
binary too badly, I have no objections to adding the -r option to
mke2fs.  (Worst case we can have a --disable-mke2fs-init-dir if there
are still people worried about making mke2fs fit on a boot/root 1.44M
floppy.  :-)

That being said, if it bloats mke2fs too badly (for example, because
you tried using libxml2 for some kind of control file to specify the
character/block devices, and it dragged in megabytes and megabytes of
compiled object code), I'd certainly object, both because of the
addition of the external dependency, and because at that point people
really would complain --- and of course, because XML is just
horrible.  :-)

       	     	      	      	 	 	 - Ted
Darren Hart Dec. 4, 2012, 8:10 p.m. UTC | #12
On 12/04/2012 12:00 PM, Theodore Ts'o wrote:
> On Tue, Dec 04, 2012 at 11:40:47AM -0800, Darren Hart wrote:
>>
>> 1) Update debugfs to completely support our use-case
>>    [ ] Add symlink support
>>    [ ] Add hardlink support
> 
> This exists already.  See the "link" command.  It doesn't update the
> inode i_links_count, but that's probably something I would add by
> adding an option (-i) which did this.

Right, this seems like it will need to be managed at a higher level (the
script for the debugfs approach).
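
Managing this in the script could look like the sketch below: it groups source paths by (device, inode), writes the first path of each group, links the rest, and then sets the link count by hand. It assumes the debugfs `write` and `link` commands and that `links_count` is accepted as a `sif` field name by your debugfs version; the function name is hypothetical, and parent directories are assumed to exist in the image already.

```python
import os
from collections import defaultdict


def hardlink_commands(src_root, rel_paths):
    """Emit debugfs commands that recreate hard links found under src_root.

    rel_paths are file paths relative to src_root. Files sharing a
    (st_dev, st_ino) pair are treated as hard links to one another.
    """
    groups = defaultdict(list)
    for rel in rel_paths:
        st = os.lstat(os.path.join(src_root, rel))
        groups[(st.st_dev, st.st_ino)].append(rel)

    cmds = []
    for paths in groups.values():
        first = "/" + paths[0]
        cmds.append("write %s %s" % (os.path.join(src_root, paths[0]), first))
        for extra in paths[1:]:
            cmds.append("link %s /%s" % (first, extra))
        if len(paths) > 1:
            # "link" leaves i_links_count alone, so set it explicitly.
            cmds.append("sif %s links_count %d" % (first, len(paths)))
    return cmds
```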

> 
>>    [ ] Add meta-data mirroring
> 
> Not sure what you mean by meta-data mirroring?


Sorry, just uid, gid, permissions... basically the inode. This could be
done by updating the script to use the sif command, but having a -i or
similar option to "write" (or maybe a new "copy" which did effectively
the same thing) by reading stat and copying the data across would be a
nice abstraction.
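
Until such an option exists, the script-side version of "read stat and copy the data across" might be as small as this sketch. It assumes the `sif` field names `mode`, `uid` and `gid`; the function name is hypothetical. Note that `st_mode` already includes the file-type bits, which matches the full `i_mode` value used in Ted's `sif` example.

```python
import os


def mirror_metadata_commands(src_path, dest_path):
    """Return sif commands copying mode, uid and gid from a source file's stat."""
    st = os.lstat(src_path)  # lstat so symlinks are not followed
    return [
        "sif %s mode 0%o" % (dest_path, st.st_mode),
        "sif %s uid %d" % (dest_path, st.st_uid),
        "sif %s gid %d" % (dest_path, st.st_gid),
    ]
```

When the build runs without root, the uid and gid read this way are those of the build user, so some override mechanism is still needed for files that must be owned by root.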


>> 2) Refactor filesystem creation functions from debugfs and move them into
>>    libext2fs alongside existing filesystem creation functions like
>>    ext2fs_mkdir()
> 
> I'm not sure that creating new libext2fs functions is necessarily
> needed.  We'll need to discuss that on a case-by-case basis.  For
> creating a new symlink, sure, I'll buy that --- there's enough
> complexity involved that it makes sense to move that into the library.
> 
> For creating a new hard link, we already have ext2fs_link().  Whether
> or not it makes sense to add an entirely new function just to bump
> i_links_count seems very dubious to me.  We could add a flag which
> reads in the target inode, bumps i_links_count, and then writes it
> back out, but then we start asking whether or not we need to add an
> option to set the ctime field or not, etc., etc., etc.


Agreed on the hard link. However, things like mknod and write from
debugfs seem like good candidates to be moved into libext2fs, then
other tools can make use of the code.


> Taken to extremes that way lies the insanity of VMS's create_process
> system call, with its huge number of parameters and options, as
> opposed to separating fork() and exec(), and allowing the application
> program to close file descriptors, dup them, edit environment
> variables, etc., between the fork() and exec(), which was a much saner
> way of doing things in Unix.
> 
> So it's actually quite deliberate that what we have are very low-level
> primitives in libext2fs.  That being said, API design is a bit of an
> art; as I said, for something as complicated as symlink creation,
> where you need to do different things depending on whether the symlink
> pathname can fit in the inode, etc., it does make sense to have a new
> function.
> 
>> 3) Add the "-r initialdir" option to mke2fs that leverages these new
>>    functions from libext2fs.
> 
> Assuming the code for -r can be well constrained in terms of cleanly
> added to the mke2fs sources (probably the bulk of the code should go
> in a new .c file), and assuming that it doesn't bloat the mke2fs
> binary too badly, I have no objections to adding the -r option to
> mke2fs.  (Worst case we can have a --disable-mke2fs-init-dir if there
> are still people worried about making mke2fs fit on a boot/root 1.44M
> floppy.  :-)
> 
> That being said, if it bloats mke2fs too badly (for example, because
> you tried using libxml2 for some kind of control file to specify the
> character/block devices, and it dragged in megabytes and megabytes of
> compiled object code), I'd certainly object, both because of the
> addition of the external dependency, and because at that point people
> really would complain --- and of course, because XML is just
> horrible.  :-)


Indeed! I understand the requirement and will keep a close eye on
dependencies and compiled size. I will include these numbers in any
patches that result from this.

So far the dependencies are minimal and I'll work to keep them that way.
Theodore Ts'o Dec. 4, 2012, 8:36 p.m. UTC | #13
On Tue, Dec 04, 2012 at 12:10:32PM -0800, Darren Hart wrote:
> > 
> >>    [ ] Add meta-data mirroring
> > 
> > Not sure what you mean by meta-data mirroring?
> 
> Sorry, just uid, gid, permissions... basically the inode. This could be
> done by updating the script to use the sif command, but having a -i or
> similar option to "write" (or maybe a new "copy" which did effectively
> the same thing) by reading stat and copying the data across would be a
> nice abstraction.

Oh, I see, grabbing the permissions from what's actually in the source
tree.  You may want to talk to some of the people who will actually be
using your script to see what they want.  I could easily imagine some
folks arguing that they don't want to use the mtimes from the source
code tree, but instead they want to set them all to some standard set
timestamp specified in some config file (in order to make the file
system image be stable across checkouts).

In addition, for setuid root programs, you won't be able to take the
uid/gid and permissions from the inode anyway --- so I had been
assuming that you would have some kind of control file, that would
perhaps be in the top-level of your initial_ directory, that would
list the default uid/gid/mod times to be used, and then exceptions on
a per-file basis.  (This was the control file where I was suggesting
that you _not_ use XML for your encoding; if you want a flexible, but
easy-to-use config file parser, I'll suggest you take a look at
e2fsck/profile.c, which is used for parsing /etc/e2fsck.conf and
/etc/mke2fs.conf.)
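
A control file of the kind described here (defaults plus per-file exceptions) might be handled as below. The sketch uses Python's `configparser` and an INI layout purely as a stand-in; the section and key names are invented for the example, and the format parsed by e2fsck/profile.c is not identical to this.

```python
import configparser

# Hypothetical control file: global defaults, then per-path exceptions.
EXAMPLE_CONTROL_FILE = """
[defaults]
uid = 0
gid = 0
mtime = 1354579200

[/bin/su]
mode = 0104755

[/home/build]
uid = 1000
gid = 1000
"""


def load_control(text):
    """Parse the control file text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def attributes_for(parser, path):
    """Merge the [defaults] section with any per-path section for `path`.

    All values are returned as strings, exactly as written in the file.
    """
    attrs = dict(parser["defaults"]) if parser.has_section("defaults") else {}
    if parser.has_section(path):
        attrs.update(parser[path])
    return attrs
```

A single fixed `mtime` in the defaults gives the stable-across-checkouts timestamps mentioned above, while the `/bin/su` entry covers the setuid case that cannot be read from the source tree.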

> Agreed on the hard link. However, things like mknod and write from
> debugfs seem like good candidates to be moved into libext2fs, then
> other tools can make use of the code.

For write from debugfs, you do realize we have lib/ext2fs/fileio.c,
right?  That's what debugfs uses already.

Maybe it would make sense to add an ext2fs_create_file() function which
takes a pathname, and it takes care of calling namei to get the target
directory, and then allocating the file, etc., and then leave it to
the calling application to fill in the inode structure with whatever
is necessary to create a new regular file, or a new block/character
device, etc.

So that one new function would kill multiple birds with one stone; it
would give you mknod, as well as refactoring most of what's in debugfs
today for writing a file to a filesystem that isn't in
lib/ext2fs/fileio.c.

The one gotcha with creating these convenience functions is that it
may make it much harder to properly report errors back to the user,
since a fundamental rule for functions in libext2fs is that they
should not use fprintf(stderr, ...) to report errors back to the user.
That's the other reason why I tended not to push too much complexity
into libext2fs.

(In general, use of stdio in libext2fs is strongly discouraged, since
that wouldn't be useful for an application written to use gtk or even
ncurses, such as that old EVMS front end program that some former
colleagues of ours at IBM's LTC had written.  :-)  It's also
problematic for the ppcboot folks, since they are trying to write a
bootloader which would use libext2fs in a restricted environment that
doesn't even have stdio available to programs; that was actually a
problem for them caused by the recently merged mmp code, and I had to
help them with that.)

						- Ted

Patch

diff --git a/debugfs/debug_cmds.ct b/debugfs/debug_cmds.ct
index a799dd7..3789dcd 100644
--- a/debugfs/debug_cmds.ct
+++ b/debugfs/debug_cmds.ct
@@ -119,7 +119,7 @@  request do_undel, "Undelete file",
        undelete, undel;
 
 request do_write, "Copy a file from your native filesystem",
-       write;
+       write, cp;
 
 request do_dump, "Dump an inode out to a file",
        dump_inode, dump;