Message ID | 1316803672-25550-1-git-send-email-cmaiolino@redhat.com |
---|---|
State | Rejected, archived |
On 2011-09-23, at 12:47 PM, Carlos Maiolino wrote:
> The current example in the man page uses bzip2 to compress
> the raw image file created by the e2image, but, bzip2 does
> not honors sparse files, which causes the image to have the
> same size of the filesystem.
> Using tar together with bzip2 will make the compressed file
> to honor the sparsed file, which makes it more transportable
> than the current one if the filesystem is large.
>
> Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> ---
>  misc/e2image.8.in | 2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/misc/e2image.8.in b/misc/e2image.8.in
> index 74d2a0b..fcf3d20 100644
> --- a/misc/e2image.8.in
> +++ b/misc/e2image.8.in
> @@ -115,7 +115,7 @@ as part of bug reports to e2fsprogs. When used in this capacity, the
>  recommended command is as follows (replace hda1 with the appropriate device):
>  .PP
>  .br
> -\ \fBe2image \-r /dev/hda1 \- | bzip2 > hda1.e2i.bz2\fR
> +\ \fBe2image \-r /dev/hda1 hda1.e2i && tar Sjcvf e2i.tar.bz2 hda1.e2i\fR

Even better would be the use of the QCOW2 format that Lukas added, if it
could also be operated on directly by the e2fsprogs utils (I don't know
if that is possible or not).

Cheers, Andreas

--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Sep 23, 2011 at 02:24:23PM -0600, Andreas Dilger wrote:
> On 2011-09-23, at 12:47 PM, Carlos Maiolino wrote:
> > The current example in the man page uses bzip2 to compress
> > the raw image file created by the e2image, but, bzip2 does
> > not honors sparse files, which causes the image to have the
> > same size of the filesystem.
> > Using tar together with bzip2 will make the compressed file
> > to honor the sparsed file, which makes it more transportable
> > than the current one if the filesystem is large.
> >
> > Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
> > ---
> >  misc/e2image.8.in | 2 +-
> >  1 files changed, 1 insertions(+), 1 deletions(-)
> >
> > diff --git a/misc/e2image.8.in b/misc/e2image.8.in
> > index 74d2a0b..fcf3d20 100644
> > --- a/misc/e2image.8.in
> > +++ b/misc/e2image.8.in
> > @@ -115,7 +115,7 @@ as part of bug reports to e2fsprogs. When used in this capacity, the
> >  recommended command is as follows (replace hda1 with the appropriate device):
> >  .PP
> >  .br
> > -\ \fBe2image \-r /dev/hda1 \- | bzip2 > hda1.e2i.bz2\fR
> > +\ \fBe2image \-r /dev/hda1 hda1.e2i && tar Sjcvf e2i.tar.bz2 hda1.e2i\fR
>
> Even better would be the use of the QCOW2 format that Lukas added, if it
> could also be operated on directly by the e2fsprogs utils (I don't know
> if that is possible or not).
>

The QCOW2 format can only be operated with qcow2 capable tools like
qemu-img =[ to use directly with e2fsprogs tools, we still need to use
raw images.
On Fri, Sep 23, 2011 at 05:51:24PM -0300, Carlos Maiolino wrote:
> On Fri, Sep 23, 2011 at 02:24:23PM -0600, Andreas Dilger wrote:
> > On 2011-09-23, at 12:47 PM, Carlos Maiolino wrote:
> > > The current example in the man page uses bzip2 to compress
> > > the raw image file created by the e2image, but, bzip2 does
> > > not honors sparse files, which causes the image to have the
> > > same size of the filesystem.
> > > Using tar together with bzip2 will make the compressed file
> > > to honor the sparsed file, which makes it more transportable
> > > than the current one if the filesystem is large.

The problem with using tar is that it requires extra disk space by the
user --- somewhere a bit more than double the extra disk space
(because you need to have space for the hda1.e2i file before it gets
compressed). For very large file systems, this can be quite
significant. My general philosophy has been to make things easy as
possible for the users as being more important for the developers.

For the developers, we do have contrib/make-sparse.c. All we have to do is:

    bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i

... and this creates a sparse file in hda1.e2i.

> > Even better would be the use of the QCOW2 format that Lukas added,
> > if it could also be operated on directly by the e2fsprogs utils (I
> > don't know if that is possible or not).
> >
> The QCOW2 format can only be operated with qcow2 capable tools like
> qemu-img to use directly with e2fsprogs tools, we still need to
> use raw images.

Yeah, it would be nice if we had an io_manager implementation that
understood qcow2, which could then be used by dumpe2fs and debugfs.
Hopefully at some point someone will implement it.

- Ted
> The problem with using tar is that it requires extra disk space by the
> user --- somewhere a bit more than double the extra disk space
> (because you need to have space for the hda1.e2i file before it gets
> compressed). For very large file systems, this can be quite
> significant. My general philosophy has been to make things easy as
> possible for the users as being more important for the developers.
>
> For the developers, we do have contrib/make-sparse.c. All we have to do is:
>
>     bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i
>
> ... and this creates a sparse file in hda1.e2i.
>

Nice to know this Ted. I'll use this instead.

Thanks
On 09/24/2011 06:31 PM, Ted Ts'o wrote:
> On Fri, Sep 23, 2011 at 05:51:24PM -0300, Carlos Maiolino wrote:
>> On Fri, Sep 23, 2011 at 02:24:23PM -0600, Andreas Dilger wrote:
>>> On 2011-09-23, at 12:47 PM, Carlos Maiolino wrote:
>>>> The current example in the man page uses bzip2 to compress
>>>> the raw image file created by the e2image, but, bzip2 does
>>>> not honors sparse files, which causes the image to have the
>>>> same size of the filesystem.
>>>> Using tar together with bzip2 will make the compressed file
>>>> to honor the sparsed file, which makes it more transportable
>>>> than the current one if the filesystem is large.
>
> The problem with using tar is that it requires extra disk space by the
> user --- somewhere a bit more than double the extra disk space
> (because you need to have space for the hda1.e2i file before it gets
> compressed). For very large file systems, this can be quite
> significant. My general philosophy has been to make things easy as
> possible for the users as being more important for the developers.
>
> For the developers, we do have contrib/make-sparse.c. All we have to do is:
>
>     bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i
>
> ... and this creates a sparse file in hda1.e2i.

The problem is that the bzip2 run will take a huge amount of time to
compress all the zeros. In 2009 (with a recent CPU of that time) I
aborted such a run for a 8TiB file system after a couple of days,
then stored the e2image directly on disk and compressed it with tar
and sparse support, which finished after only 12 hours... I don't
think more modern CPUs are much faster for single threaded runs as
bzip2 does it. So IMHO the man page should at least warn about that
issue and suggest to use a similar tar command.

Cheers,
Bernd
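As a rough illustration of the point about compressing zeros (an editorial sketch, not something posted in the thread), the script below counts how many bytes a compressor would have to read under each recipe. It assumes GNU tar and GNU coreutils (`truncate`); the 64 MiB size and the file name `fake.e2i` are invented stand-ins for a raw e2image, so no real filesystem is involved.

```shell
#!/bin/sh
# Sketch only: a mostly-empty 64 MiB file stands in for a raw e2image.
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
img="$work/fake.e2i"   # hypothetical file name

# 64 MiB apparent size, with a few bytes of real data at the 1 MiB mark.
truncate -s 64M "$img"
printf 'metadata' | dd of="$img" bs=4096 seek=256 conv=notrunc 2>/dev/null

# Recipe 1: the image is piped straight into the compressor, so the
# compressor has to read every byte, zeros included.
plain_bytes=$(wc -c < "$img")

# Recipe 2: tar -S (--sparse) records where the holes are and emits only
# the data blocks, so the compressor sees a much shorter stream.
tar_bytes=$(tar -C "$work" -S -cf - fake.e2i | wc -c)

echo "bytes fed to the compressor without tar: $plain_bytes"
echo "bytes fed to the compressor with tar -S: $tar_bytes"
```

The exact second number depends on the tar version and blocking factor; the gap between the two is what matters for multi-terabyte images.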
On 9/24/11 11:31 AM, Ted Ts'o wrote:
> On Fri, Sep 23, 2011 at 05:51:24PM -0300, Carlos Maiolino wrote:
>> On Fri, Sep 23, 2011 at 02:24:23PM -0600, Andreas Dilger wrote:
>>> On 2011-09-23, at 12:47 PM, Carlos Maiolino wrote:
>>>> The current example in the man page uses bzip2 to compress
>>>> the raw image file created by the e2image, but, bzip2 does
>>>> not honors sparse files, which causes the image to have the
>>>> same size of the filesystem.
>>>> Using tar together with bzip2 will make the compressed file
>>>> to honor the sparsed file, which makes it more transportable
>>>> than the current one if the filesystem is large.
>
> The problem with using tar is that it requires extra disk space by the
> user --- somewhere a bit more than double the extra disk space
> (because you need to have space for the hda1.e2i file before it gets
> compressed). For very large file systems, this can be quite
> significant. My general philosophy has been to make things easy as
> possible for the users as being more important for the developers.
>
> For the developers, we do have contrib/make-sparse.c. All we have to do is:
>
>     bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i
>
> ... and this creates a sparse file in hda1.e2i.

or | cp --sparse=always /dev/stdin sparse.img works too.

But have you ever tried this with a multi-terabyte image?

It takes -forever- to process all those 0s, with cpus pegged.

The tar command seems to actually annotate the sparseness efficiently.

Ted, your concern about space - it doesn't take the full fs size worth
of space, right, just the metadata space? So in general it should not
be THAT much ...

-Eric

>>> Even better would be the use of the QCOW2 format that Lukas added,
>>> if it could also be operated on directly by the e2fsprogs utils (I
>>> don't know if that is possible or not).
>>>
>> The QCOW2 format can only be operated with qcow2 capable tools like
>> qemu-img to use directly with e2fsprogs tools, we still need to
>> use raw images.
>
> Yeah, it would be nice if we had an io_manager implementation that
> understood qcow2, which could then be used by dumpe2fs and debugfs.
> Hopefully at some point someone will implement it.
>
> - Ted
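The `cp --sparse=always /dev/stdin` variant can be tried on a small stand-in file. This is a minimal editorial sketch, assuming GNU coreutils and gzip are installed; gzip replaces the thread's bzip2 only to keep the dependencies small, and the file names and the 32 MiB size are made up for illustration.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Stand-in for a raw e2image: 32 MiB of holes with a little data in the middle.
truncate -s 32M orig.img
printf 'inode table' | dd of=orig.img bs=4096 seek=1024 conv=notrunc 2>/dev/null

# Compress as a plain stream; the hole information is lost at this point.
gzip -c < orig.img > orig.img.gz

# Restore 1: a plain redirect writes every zero byte back to disk.
gunzip -c < orig.img.gz > dense.img

# Restore 2: cp turns runs of zeros read from the pipe back into holes.
gunzip -c < orig.img.gz | cp --sparse=always /dev/stdin sparse.img

# The contents are identical either way; only the allocated size should differ.
cmp orig.img dense.img
cmp orig.img sparse.img
sparse_size=$(wc -c < sparse.img)

du -k --apparent-size dense.img sparse.img   # logical size
du -k dense.img sparse.img                   # blocks actually allocated
```

Note that the decompressor still has to produce, and `cp` still has to scan, every zero byte, which is the cost being complained about for multi-terabyte images.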
On 9/26/11 7:22 AM, Bernd Schubert wrote:
> On 09/24/2011 06:31 PM, Ted Ts'o wrote:
>> On Fri, Sep 23, 2011 at 05:51:24PM -0300, Carlos Maiolino wrote:
>>> On Fri, Sep 23, 2011 at 02:24:23PM -0600, Andreas Dilger wrote:
>>>> On 2011-09-23, at 12:47 PM, Carlos Maiolino wrote:
>>>>> The current example in the man page uses bzip2 to compress
>>>>> the raw image file created by the e2image, but, bzip2 does
>>>>> not honors sparse files, which causes the image to have the
>>>>> same size of the filesystem.
>>>>> Using tar together with bzip2 will make the compressed file
>>>>> to honor the sparsed file, which makes it more transportable
>>>>> than the current one if the filesystem is large.
>>
>> The problem with using tar is that it requires extra disk space by the
>> user --- somewhere a bit more than double the extra disk space
>> (because you need to have space for the hda1.e2i file before it gets
>> compressed). For very large file systems, this can be quite
>> significant. My general philosophy has been to make things easy as
>> possible for the users as being more important for the developers.
>>
>> For the developers, we do have contrib/make-sparse.c. All we have to do is:
>>
>>     bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i
>>
>> ... and this creates a sparse file in hda1.e2i.
>
> The problem is that the bzip2 run will take a huge amount of time to
> compress all the zeros. In 2009 (with a recent CPU of that time) I
> aborted such a run for a 8TiB file system after a couple of days,
> then stored the e2image directly on disk and compressed it with tar
> and sparse support, which finished after only 12 hours... I don't
> think more modern CPUs are much faster for single threaded runs as
> bzip2 does it. So IMHO the man page should at least warn about that
> issue and suggest to use a similar tar command.

Agreed! I think they both have their place.

passing images around in qcow format may be best in the long run though.

-Eric

>
> Cheers,
> Bernd
On Mon, Sep 26, 2011 at 11:24:31AM -0500, Eric Sandeen wrote:
> >
> >     bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i
> >
> > ... and this creates a sparse file in hda1.e2i.
>
> or | cp --sparse=always /dev/stdin sparse.img works too.
>
> But have you ever tried this with a multi-terabyte image?
>
> It takes -forever- to process all those 0s, with cpus pegged.

Yeah, I didn't realize until I read another message on this thread
that bzip2's CPU problems were causing problems. Is gzip sufficiently
better, I wonder, or is it still problematic?

> Ted, your concern about space - it doesn't take the full fs size worth
> of space, right, just the metadata space? So in general it should not
> be THAT much ...

Yes, it's just the metadata space that I was worried about. So it's
not *that* much, but it still adds up on large systems. But then
again, on large systems we precisely have the problem of bzip2 taking
forever.

If we decide that we're OK with not compressing qcow2, we could use
qcow2. But note that the qcow2 format is still very compressible ---
it looks like it could do a better job removing zero blocks. (I had a
256meg qcow2 e2image file compress down to 9 megs.) Unfortunately we
can't do stream compression with qcow2.

Long run I think we should make the qcow2 support better (by dropping
all-zero blocks, and adding support for qcow2 to
debugfs/dumpe2fs/e2fsck, and perhaps adding support for native
compression). Anyone looking for a project? :-)

- Ted
Hi,

> If we decide that we're OK with not compressing qcow2, we could use
> qcow2. But note that the qcow2 format is still very compressible ---
> it looks like it could do a better job removing zero blocks. (I had a
> 256meg qcow2 e2image file compress down to 9 megs.) Unfortunately we
> can't do stream compression with qcow2.
>
> Long run I think we should make the qcow2 support better (by dropping
> all-zero blocks, and adding support for qcow2 to
> debugfs/dumpe2fs/e2fsck, and perhaps adding support for native
> compression). Anyone looking for a project? :-)
>
> - Ted

Actually, I'm quite interested in contributing more to ExtFS, although
I'm not very familiar with all the Ext internals; so far I've only
contributed to GFS2. If you're not in a hurry to get it done and don't
mind receiving some (maybe lots of) questions, I can take this project
(I'll need to learn more about ExtFS first :).

It may take me some time to implement, given my limited knowledge of
the ExtFS internals, but if you're OK with the above, I'm in.
On Mon, Sep 26, 2011 at 10:23 PM, Ted Ts'o <tytso@mit.edu> wrote:
> On Mon, Sep 26, 2011 at 11:24:31AM -0500, Eric Sandeen wrote:
>> >
>> >     bunzip2 < hda1.e2i.bz2 | make-sparse hda1.e2i
>> >
>> > ... and this creates a sparse file in hda1.e2i.
>>
>> or | cp --sparse=always /dev/stdin sparse.img works too.
>>
>> But have you ever tried this with a multi-terabyte image?
>>
>> It takes -forever- to process all those 0s, with cpus pegged.
>
> Yeah, I didn't realize until I read another message on this thread
> that bzip2's CPU problems were causing problems. Is gzip sufficiently
> better, I wonder, or is it still problematic?
>
>> Ted, your concern about space - it doesn't take the full fs size worth
>> of space, right, just the metadata space? So in general it should not
>> be THAT much ...
>
> Yes, it's just the metadata space that I was worried about. So it's
> not *that* much, but it still adds up on large systems. But then
> again, on large systems we precisely have the problem of bzip2 taking
> forever.
>
> If we decide that we're OK with not compressing qcow2, we could use
> qcow2. But note that the qcow2 format is still very compressible ---
> it looks like it could do a better job removing zero blocks. (I had a
> 256meg qcow2 e2image file compress down to 9 megs.) Unfortunately we
> can't do stream compression with qcow2.

I wasn't sure if I should bring this up, but what the heck...

Shardul, one of our GSoC students, has implemented e4send/e4receive
for streaming an ext4 snapshot image (with data) to a remote machine.
The code can be found in his github repo:
https://github.com/shardulmangade/e2fsprogs-snapshots

His code mostly reuses the e2image code and uses the "LVM snapshot
store" format for streaming block numbers + block contents to
e4receive, which writes to a sparse file or block device.

The LVM snapshot store format was chosen simply because we needed
something quick and the code had already been implemented by another
GSoC student for his own project (revert to ext4 snapshot using LVM
merge).

So without trying to promote upstream inclusion of this implementation,
just so you know:
1. the code works, although not configured for exporting metadata at the moment
2. it's simple
3. no intermediate files needed
4. output can be streamed compressed
5. Shardul would be happy to help with further questions

Amir.

>
> Long run I think we should make the qcow2 support better (by dropping
> all-zero blocks, and adding support for qcow2 to
> debugfs/dumpe2fs/e2fsck, and perhaps adding support for native
> compression). Anyone looking for a project? :-)
>
> - Ted
diff --git a/misc/e2image.8.in b/misc/e2image.8.in
index 74d2a0b..fcf3d20 100644
--- a/misc/e2image.8.in
+++ b/misc/e2image.8.in
@@ -115,7 +115,7 @@ as part of bug reports to e2fsprogs. When used in this capacity, the
 recommended command is as follows (replace hda1 with the appropriate device):
 .PP
 .br
-\ \fBe2image \-r /dev/hda1 \- | bzip2 > hda1.e2i.bz2\fR
+\ \fBe2image \-r /dev/hda1 hda1.e2i && tar Sjcvf e2i.tar.bz2 hda1.e2i\fR
 .PP
 This will only send the metadata information, without any data blocks.
 However, the filenames in the directory blocks can still reveal
The current example in the man page uses bzip2 to compress
the raw image file created by the e2image, but, bzip2 does
not honors sparse files, which causes the image to have the
same size of the filesystem.
Using tar together with bzip2 will make the compressed file
to honor the sparsed file, which makes it more transportable
than the current one if the filesystem is large.

Signed-off-by: Carlos Maiolino <cmaiolino@redhat.com>
---
 misc/e2image.8.in | 2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
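The effect described in the patch description can be checked without a real block device. The sketch below is an editorial illustration under stated assumptions: GNU tar and GNU coreutils are available, a 64 MiB file created with `truncate` stands in for the output of `e2image -r`, and the archive name `e2i.archive` is made up. It uses bzip2 as in the patch when it is installed and falls back to gzip otherwise.

```shell
#!/bin/sh
set -eu
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# Stand-in for "e2image -r /dev/hda1 hda1.e2i": 64 MiB, almost all holes.
truncate -s 64M hda1.e2i
printf 'superblock' | dd of=hda1.e2i bs=1024 seek=1 conv=notrunc 2>/dev/null

# bzip2 as in the patch when available, otherwise gzip (portability assumption).
if command -v bzip2 >/dev/null 2>&1; then zflag=j; else zflag=z; fi

# -S (--sparse) stores a map of the holes instead of the zeros themselves.
tar -S -c -"$zflag" -f e2i.archive hda1.e2i
archive_bytes=$(wc -c < e2i.archive)

# Extract elsewhere; GNU tar detects the compression and recreates the holes.
mkdir restored
tar -x -f e2i.archive -C restored
cmp hda1.e2i restored/hda1.e2i
restored_bytes=$(wc -c < restored/hda1.e2i)

echo "archive size: $archive_bytes bytes"
du -k --apparent-size hda1.e2i restored/hda1.e2i   # logical size
du -k hda1.e2i restored/hda1.e2i                   # blocks actually allocated
```

On a filesystem that supports sparse files, the two `du` listings should show a large logical size next to a small allocated size for both the original and the restored copy. The trade-off raised in the thread still applies: the uncompressed image has to exist on disk before tar can read it.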