diff mbox

[U-Boot] tools: mkimage can read input on /dev/stdin

Message ID CADF714agpUedt_o3iXcGBRBc5CaV6TBUn=1oGEqJj3h2b+f-Vg@mail.gmail.com
State Deferred
Delegated to: Wolfgang Denk
Headers show

Commit Message

Julien Castets Sept. 26, 2014, 7:33 p.m. UTC
Hi,

I would like to give /dev/stdin to the flag -d of mkimage. The only
thing that prevent doing it is the function copy_file of mkimage.c,
which:
- calls stat(2) on the file to get the input file size
- calls mmap(2) with this size as length

When the file is a pipe, its size is set to 0 and mmap(2) fails.

This patch replaces the use of mmap(2) with read(2). If accepted, I
could give a look to accept /dev/stdout as output file (which is
currently also required to be a file).

From e107bdc73ee7b2159956cfc753328f9f03c058e8 Mon Sep 17 00:00:00 2001
From: Julien Castets <castets.j@gmail.com>
Date: Fri, 26 Sep 2014 11:28:49 +0200
Subject: [PATCH] tools: mkimage can read input on /dev/stdin

Use a sequential read(2) instead of a mmap(2) in copy_file.

Signed-off-by: Julien Castets <castets.j@gmail.com>
---
 tools/mkimage.c |   64 +++++++++++++++++++++++++++----------------------------
 1 file changed, 31 insertions(+), 33 deletions(-)

Comments

Julien Castets Sept. 27, 2014, 1:11 p.m. UTC | #1
On Sat, Sep 27, 2014 at 2:54 PM, Wolfgang Denk <wd@denx.de> wrote:
> What would be the benefit of doing so?  Do you have an example for a
> practical use case where this makes sense?

In my case, I have a TFTP server that dyncamically generates uboot
bootfiles when a specific file is requested. The input template file
being generated, I need to create a temporary file to store it before
calling mkimage. Except the mmap, there's no technical restriction for
mkimage to be able to read on a pipe.

> What is the size and performance impact of the suggested change for
> typical use cases?

None. The behaviour is exactly the same.
Wolfgang Denk Sept. 27, 2014, 1:25 p.m. UTC | #2
Dear Julien,

In message <CADF714bJ6-WvjBSd1u8UA-zT+H0TUO5vqO3_-6Q34yUjvbkVxw@mail.gmail.com> you wrote:
> On Sat, Sep 27, 2014 at 2:54 PM, Wolfgang Denk <wd@denx.de> wrote:
> > What would be the benefit of doing so?  Do you have an example for a
> > practical use case where this makes sense?
> 
> In my case, I have a TFTP server that dyncamically generates uboot
> bootfiles when a specific file is requested. The input template file
> being generated, I need to create a temporary file to store it before
> calling mkimage. Except the mmap, there's no technical restriction for
> mkimage to be able to read on a pipe.

Sorry, but I don't understand this.   Where are the image(s) coming
from, then?  Who or what is feeding the pipe?

> > What is the size and performance impact of the suggested change for
> > typical use cases?
> 
> None. The behaviour is exactly the same.

I don't believe you.  Sizes rare certainly not identical, and neither
is the performance.  Did you do any real measurements?

Best regards,

Wolfgang Denk
Julien Castets Sept. 27, 2014, 5:28 p.m. UTC | #3
On Sat, Sep 27, 2014 at 3:25 PM, Wolfgang Denk <wd@denx.de> wrote:
> Sorry, but I don't understand this.   Where are the image(s) coming
> from, then?  Who or what is feeding the pipe?

Sorry, I wrote quickly.

In my situation:
- I have a Python implementation of a TFTP server
- when the file named "uboot.bootscript" is requested, a template is
rendered. Currently, this template is stored in a temporary file
- A new process is created to call mkimage, with the temporary file as
input and another temporary file as output (mkimage -A arm -O linux -a
0 -e 0 -T script -C none -n 'comment' -d tmp_input tmp_output)
- Finally, the output file content is returned to the client

Instead of creating two temporary file, two pipes could be used (-d
/dev/stdin /dev/stdout), which, in my case, would be more efficient
(even if I don't really have performance issues).

>> > What is the size and performance impact of the suggested change for
>> > typical use cases?
>>
>> None. The behaviour is exactly the same.
>
> I don't believe you.  Sizes rare certainly not identical, and neither
> is the performance.  Did you do any real measurements?

mmap is useful when you need to make random access in a file, or to
optimize memory: when a file is mmapped, the kernel only loads the
parts of the file that are accessed. In the case of mkimage, the file
is read sequentially (meaning all of it will be retrieved from the
disk).
Wolfgang Denk Sept. 27, 2014, 6:24 p.m. UTC | #4
Dear Julien,

In message <CADF714bpHv6x8wUxs=7H-st3-ThPCeBWbZvO_-uoQcdWS6WqJQ@mail.gmail.com> you wrote:
>
> > I don't believe you.  Sizes rare certainly not identical, and neither
> > is the performance.  Did you do any real measurements?
> 
> mmap is useful when you need to make random access in a file, or to
> optimize memory: when a file is mmapped, the kernel only loads the
> parts of the file that are accessed. In the case of mkimage, the file
> is read sequentially (meaning all of it will be retrieved from the
> disk).

OK, so you don't have any real data, but make a very specific
statement: "exactly the same."

Sorry, but this is not an answer I'm going to buy.

Best regards,

Wolfgang Denk
Julien Castets Sept. 27, 2014, 7:06 p.m. UTC | #5
On Sep 27, 2014 8:24 PM, "Wolfgang Denk" > OK, so you don't have any real
data, but make a very specific
> statement: "exactly the same."
>
> Sorry, but this is not an answer I'm going to buy.

I'm not sure to understand what you mean. In both cases, the file is copied.

What is bothering you?
Wolfgang Denk Sept. 27, 2014, 9:56 p.m. UTC | #6
Dear Julien,

In message <CADF714bLG21jobjTnbUbWqsyj3xbzL+Fb+WBfcrq8YH4-ugm5Q@mail.gmail.com> you wrote:
> 
> I'm not sure to understand what you mean. In both cases, the file is copied.
> 
> What is bothering you?

I asked:

| What is the size and performance impact of the suggested change for
| typical use cases?

Can you please provide values for the size of the binary and the
execution time?

It's not really critical, but I'd like to understand the impact of
your changes.  You use case is pretty exotic, so it seems a valid
question to me to try to understand what the extended functionality
costs.

Best regards,

Wolfgang Denk
Marek Vasut Sept. 27, 2014, 10:01 p.m. UTC | #7
On Saturday, September 27, 2014 at 11:56:55 PM, Wolfgang Denk wrote:

Hello Wolfgang,

> Dear Julien,
> 
> In message <CADF714bLG21jobjTnbUbWqsyj3xbzL+Fb+WBfcrq8YH4-
ugm5Q@mail.gmail.com> you wrote:
> > I'm not sure to understand what you mean. In both cases, the file is
> > copied.
> > 
> > What is bothering you?
> 
> I asked:
> | What is the size and performance impact of the suggested change for
> | typical use cases?
> 
> Can you please provide values for the size of the binary and the
> execution time?
> 
> It's not really critical, but I'd like to understand the impact of
> your changes.  You use case is pretty exotic, so it seems a valid
> question to me to try to understand what the extended functionality
> costs.

Won't it be better to focus on the overall concept first and dig in the finer 
details later ?

I think right now, the question is -- do we want to support stdin as a source of 
payload for mkimage or not at all?

Best regards,
Marek Vasut
Wolfgang Denk Sept. 27, 2014, 10:21 p.m. UTC | #8
Dear Marek,

In message <201409280001.26383.marex@denx.de> you wrote:
> 
> > Can you please provide values for the size of the binary and the
> > execution time?
> > 
> > It's not really critical, but I'd like to understand the impact of
> > your changes.  You use case is pretty exotic, so it seems a valid
> > question to me to try to understand what the extended functionality
> > costs.
> 
> Won't it be better to focus on the overall concept first and dig in the finer 
> details later ?
> 
> I think right now, the question is -- do we want to support stdin as a source of 
> payload for mkimage or not at all?

The general approach to new features in U-Boot is: 1) is it useful at
least to some? and 2) does it not hurt others?

Re 1), I think the use case is pretty exostic, but apparently there is
at least one user for that.

Re 2), we need some numbers.  Plain mmap() on a regular file is
supposed to be the fastest possible I/O method in a Unix OS, so we
should understand how much a change costs, or if it makes sense to
provide different implementations depending on input type (read() for
stdin vs. mmap() for regular files).  Or if the differences are so
small that this is all not worth the time we spend here.

Best regards,

Wolfgang Denk
Julien Castets Sept. 28, 2014, 12:16 a.m. UTC | #9
Thanks for your comments,

On Sat, Sep 27, 2014 at 11:56 PM, Wolfgang Denk <wd@denx.de> wrote:
> Can you please provide values for the size of the binary and the
> execution time?

Without the patch, with mmap:
$> dd if=/dev/zero of=test bs=1M count=10
$> time ./mkimage -A arm -O linux -a 0 -e 0 -T script -C none -n
'test' -d test test_out
...
real    0m0.168s
user    0m0.027s
sys     0m0.023s

With the patch, with read:
$> dd if=/dev/zero of=test bs=1M count=10
$> time ./mkimage -A arm -O linux -a 0 -e 0 -T script -C none -n
'test' -d test test_out
...
real    0m0.160s
user    0m0.025s
sys     0m0.016s

In both cases, the binary mkimage size is 144k bytes (147333 with
mmap, 147421 with read). Compiled with make sandbox_defconfig and
make_tools.

> It's not really critical, but I'd like to understand the impact of
> your changes.  You use case is pretty exotic, so it seems a valid
> question to me to try to understand what the extended functionality
> costs.

I understand the use case, in its globality, is pretty exotic.
However, I don't really get why giving /dev/stdin as input is.


Regards,
Wolfgang Denk Sept. 28, 2014, 6:49 a.m. UTC | #10
Dear Julien,

In message <CADF714btdiJT=ccVRBuJk7xUJCgapA-ued=E1J1C0v-Bxa2h=A@mail.gmail.com> you wrote:
> 
> Without the patch, with mmap:

Thanks for the numbers.  So these indeed make no real difference.

> I understand the use case, in its globality, is pretty exotic.
> However, I don't really get why giving /dev/stdin as input is.

The case where mkimage is taking a single input file is quickly
becoming a rare corner case.

The recommended way to build U-Boot images is (and has been for years,
even though marketing for this has been poor) to build FIT images.  In
this case you typically have several inout files, which are not even
listed on the mkimage command line, but referenced in the fit-image.its
file you use.  OK, in this case you could feed the fit-image.its
through stdin.  But there are other file arguments - like when using
signed images.

Even if you use the (deprecated) legacy image format, the case of using
a single input file is not the generic one.  mkimage syntax is:

	mkimage ... -d data_file[:data_file...] ...

and allows to provide several input files - pretty often you need the
kernel image and the DT blob.  It is also not uncommon to have a
ramdisk image included.


So if we add support to read from stdin instead from a file where we
pass the file name as an argument, we should probably do this in a
consistent way.  It would be a frustrating experience to the end user
to learn that he can use stdin here but not there - so we would
probably have to rework all these use cases?  And how should we
implement this - would a file name "-" mean stdin (1), or should we
simply pass "/dev/stdin" as file argument (2)?

With (1), we need to change more code, while (2) could probably be
pretty transparent.


You see, even such a simple change like your suggestion needs some
deeper thought if you want to keep the design consistent.  This is why
I am asking about your use case, and how it would fit into other
situations.

Best regards,

Wolfgang Denk
Julien Castets Sept. 28, 2014, 10:49 a.m. UTC | #11
On Sun, Sep 28, 2014 at 8:49 AM, Wolfgang Denk <wd@denx.de> wrote:
> The case where mkimage is taking a single input file is quickly
> becoming a rare corner case.
>
> The recommended way to build U-Boot images is (and has been for years,
> even though marketing for this has been poor) to build FIT images.  In
> this case you typically have several inout files, which are not even
> listed on the mkimage command line, but referenced in the fit-image.its
> file you use.  OK, in this case you could feed the fit-image.its
> through stdin.  But there are other file arguments - like when using
> signed images.
>
> Even if you use the (deprecated) legacy image format, the case of using
> a single input file is not the generic one.  mkimage syntax is:
>
>         mkimage ... -d data_file[:data_file...] ...
>
> and allows to provide several input files - pretty often you need the
> kernel image and the DT blob.  It is also not uncommon to have a
> ramdisk image included.

Thank you so much for you explanations. To tell the truth, I don't
even know what a FIT image is, and even if I saw the usage accepted
several files as input, I didn't search to understand why.


> So if we add support to read from stdin instead from a file where we
> pass the file name as an argument, we should probably do this in a
> consistent way.  It would be a frustrating experience to the end user
> to learn that he can use stdin here but not there - so we would
> probably have to rework all these use cases?  And how should we
> implement this - would a file name "-" mean stdin (1), or should we
> simply pass "/dev/stdin" as file argument (2)?
>
> With (1), we need to change more code, while (2) could probably be
> pretty transparent.

If I understand well, your proposition for (1) would be to use mmap(2)
for everything, but use read(2) for the special case "-".

I'm not sure it is a good idea. The standard input can be handled like
any other file. And note the input could also be a named pipe, that
you won't be able to mmap. With the patch proposed, it would work just
fine.

Also, in the case you're having several files as input, they will be
consumed one after the other. So if the input is "-d
/dev/stdin:/dev/stdin:/dev/stdin", you can give the three files
through stdin.

> You see, even such a simple change like your suggestion needs some
> deeper thought if you want to keep the design consistent.  This is why
> I am asking about your use case, and how it would fit into other
> situations.

Indeed!
Wolfgang Denk Sept. 28, 2014, 5:48 p.m. UTC | #12
Dear Julien,

In message <CADF714Y4ufr4SuoF83tgyJxyhx1gdKYKw9E6JWtYhHfM3NxzUg@mail.gmail.com> you wrote:
>
> > So if we add support to read from stdin instead from a file where we
> > pass the file name as an argument, we should probably do this in a
> > consistent way.  It would be a frustrating experience to the end user
> > to learn that he can use stdin here but not there - so we would
> > probably have to rework all these use cases?  And how should we
> > implement this - would a file name "-" mean stdin (1), or should we
> > simply pass "/dev/stdin" as file argument (2)?
> >
> > With (1), we need to change more code, while (2) could probably be
> > pretty transparent.
> 
> If I understand well, your proposition for (1) would be to use mmap(2)
> for everything, but use read(2) for the special case "-".

I did not mean to suggest this.  I probably makes more sense to use
the same code everywhere.

> I'm not sure it is a good idea. The standard input can be handled like
> any other file. And note the input could also be a named pipe, that
> you won't be able to mmap. With the patch proposed, it would work just
> fine.

But the patch would only be a part of the implementation.  I think we
should see it all together to be able to compare approaches.

> Also, in the case you're having several files as input, they will be
> consumed one after the other. So if the input is "-d
> /dev/stdin:/dev/stdin:/dev/stdin", you can give the three files
> through stdin.

Ouch.  That would be error prone as hell.  Not all things that can be
done should be done ;-)

Best regards,

Wolfgang Denk
diff mbox

Patch

diff --git a/tools/mkimage.c b/tools/mkimage.c
index c70408c..bb35110 100644
--- a/tools/mkimage.c
+++ b/tools/mkimage.c
@@ -522,14 +522,14 @@  static void
 copy_file (int ifd, const char *datafile, int pad)
 {
     int dfd;
-    struct stat sbuf;
-    unsigned char *ptr;
     int tail;
     int zero = 0;
     uint8_t zeros[4096];
-    int offset = 0;
-    int size;
     struct image_type_params *tparams = mkimage_get_type (params.type);
+    unsigned char buf[4096];
+    ssize_t nbytes;
+    ssize_t i;
+    ssize_t size = 0;

     if (pad >= sizeof(zeros)) {
         fprintf(stderr, "%s: Can't pad to %d\n",
@@ -549,63 +549,62 @@  copy_file (int ifd, const char *datafile, int pad)
         exit (EXIT_FAILURE);
     }

-    if (fstat(dfd, &sbuf) < 0) {
-        fprintf (stderr, "%s: Can't stat %s: %s\n",
-            params.cmdname, datafile, strerror(errno));
-        exit (EXIT_FAILURE);
-    }
-
-    ptr = mmap(0, sbuf.st_size, PROT_READ, MAP_SHARED, dfd, 0);
-    if (ptr == MAP_FAILED) {
-        fprintf (stderr, "%s: Can't read %s: %s\n",
-            params.cmdname, datafile, strerror(errno));
-        exit (EXIT_FAILURE);
-    }
-
     if (params.xflag) {
-        unsigned char *p = NULL;
         /*
          * XIP: do not append the image_header_t at the
          * beginning of the file, but consume the space
          * reserved for it.
          */
+        nbytes = read(dfd, buf, tparams->header_size);
+        if (nbytes == -1) {
+            fprintf (stderr, "%s: Can't read XIP header of %s: %s\n",
+                 params.cmdname, datafile, strerror(errno));
+            exit (EXIT_FAILURE);
+        }

-        if ((unsigned)sbuf.st_size < tparams->header_size) {
+        if (nbytes < tparams->header_size) {
             fprintf (stderr,
                 "%s: Bad size: \"%s\" is too small for XIP\n",
                 params.cmdname, datafile);
             exit (EXIT_FAILURE);
         }

-        for (p = ptr; p < ptr + tparams->header_size; p++) {
-            if ( *p != 0xff ) {
+        for (i = 0; i < nbytes; ++i) {
+            if (buf[i] != 0xff) {
                 fprintf (stderr,
                     "%s: Bad file: \"%s\" has invalid buffer for XIP\n",
                     params.cmdname, datafile);
                 exit (EXIT_FAILURE);
             }
         }
+    }

-        offset = tparams->header_size;
+    while ((nbytes = read(dfd, buf, sizeof(buf))) > 0) {
+        if (write(ifd, buf, nbytes) != nbytes) {
+            fprintf (stderr, "%s: Write error on %s: %s\n",
+                 params.cmdname, params.imagefile, strerror(errno));
+            exit (EXIT_FAILURE);
+        }
+        size += nbytes;
     }

-    size = sbuf.st_size - offset;
-    if (write(ifd, ptr + offset, size) != size) {
-        fprintf (stderr, "%s: Write error on %s: %s\n",
-            params.cmdname, params.imagefile, strerror(errno));
-        exit (EXIT_FAILURE);
+    if (nbytes == -1) {
+        fprintf (stderr, "%s: Read error on %s: %s\n",
+             params.cmdname, params.imagefile, strerror(errno));
+             exit (EXIT_FAILURE);
     }

     tail = size % 4;
     if ((pad == 1) && (tail != 0)) {
-
-        if (write(ifd, (char *)&zero, 4-tail) != 4-tail) {
+        if (write(ifd, (char *)&zero, 4 - tail) != 4 - tail) {
             fprintf (stderr, "%s: Write error on %s: %s\n",
-                params.cmdname, params.imagefile,
-                strerror(errno));
+                 params.cmdname, params.imagefile,
+                 strerror(errno));
             exit (EXIT_FAILURE);
         }
-    } else if (pad > 1) {
+    }
+
+    else if (pad > 1) {
         if (write(ifd, (char *)&zeros, pad) != pad) {
             fprintf(stderr, "%s: Write error on %s: %s\n",
                 params.cmdname, params.imagefile,
@@ -614,7 +613,6 @@  copy_file (int ifd, const char *datafile, int pad)
         }
     }

-    (void) munmap((void *)ptr, sbuf.st_size);
     (void) close (dfd);
 }