[OpenWrt-Devel] Fwd: [PATCH] uhttpd: serve precompressed files

Message ID: CALs5MW_tmXXnjCL2vzjGEnwygMXmwFaFoVfr090ZSNnEWGYJhA@mail.gmail.com
State: Changes Requested

Commit Message

Adrian Kotelba Sept. 6, 2015, 1:02 p.m. UTC
Serving precompressed content with uhttpd.

Signed-off-by: Adrian Kotelba <adrian.kotelba@gmail.com>
---

Comments

Bastian Bittorf Sept. 6, 2015, 6:14 p.m. UTC | #1
* Adrian Kotelba <adrian.kotelba@gmail.com> [06.09.2015 20:05]:
> Serving precompressed content with uhttpd.

please write more about the usecase. it looks useless?!

bye, bastian
Adrian Kotelba Sept. 7, 2015, 6:47 a.m. UTC | #2
Well, it could be useful for low-end devices with small storage
capacity and slow networks, see https://dev.openwrt.org/ticket/14333.
The static content, mostly text-based HTML, JavaScript, and CSS files, could
be stored as gzip or zopfli precompressed files. Thus, one reduces
storage requirements and improves page loading times.

Adrian

2015-09-06 21:14 GMT+03:00 Bastian Bittorf <bittorf@bluebottle.com>:
> * Adrian Kotelba <adrian.kotelba@gmail.com> [06.09.2015 20:05]:
>> Serving precompressed content with uhttpd.
>
> please write more about the usecase. it looks useless?!
>
> bye, bastian
Felix Fietkau Sept. 7, 2015, 8:08 a.m. UTC | #3
On 2015-09-07 08:47, Adrian Kotelba wrote:
> Well, it could be useful for low-end devices with small storage
> capacity and slow networks, see https://dev.openwrt.org/ticket/14333.
> The static content, mostly text-based HTML, JavaScript, and CSS files, could
> be stored as gzip or zopfli precompressed files. Thus, one reduces
> storage requirements and improves page loading times.
Serving compressed files requires a fallback codepath for browsers that
can't accept gzip encoding. Also, precompressed files that ship with the
firmware image waste storage instead of saving it, because the
filesystem is compressed using LZMA (which has a much better ratio than
gzip).

- Felix
Bastian Bittorf Sept. 7, 2015, 9:40 a.m. UTC | #4
* Adrian Kotelba <adrian.kotelba@gmail.com> [07.09.2015 11:36]:
> Well, it could be useful for low-end devices with small storage
> capacity and slow networks, see https://dev.openwrt.org/ticket/14333.
> The static content, mostly text-based HTML, JavaScript, and CSS files, could
> be stored as gzip or zopfli precompressed files. Thus, one reduces
> storage requirements and improves page loading times.

ok, as far as i understand the patch, if somebody
tries to http_get file 'xy.html' and the webserver
finds 'xy.html.gz' it serves this file, right?

the storage-argument is not really valid, because the
files are usually compressed anyway (squashfs or jffs2).
the "slow networks" argument is of course valid.

bye, bastian
Adrian Kotelba Sept. 7, 2015, 11:39 a.m. UTC | #5
2015-09-07 12:40 GMT+03:00 Bastian Bittorf <bittorf@bluebottle.com>:
> * Adrian Kotelba <adrian.kotelba@gmail.com> [07.09.2015 11:36]:
>> Well, it could be useful for low-end devices with small storage
>> capacity and slow networks, see https://dev.openwrt.org/ticket/14333.
>> The static content, mostly text-based HTML, JavaScript, and CSS files, could
>> be stored as gzip or zopfli precompressed files. Thus, one reduces
>> storage requirements and improves page loading times.
>
> ok, as far as i understand the patch, if somebody
> tries to http_get file 'xy.html' and the webserver
> finds 'xy.html.gz' it serves this file, right?

Right. More precisely, if somebody tries to http_get file 'xy.html'
and the webserver does find it, then it serves it.
If 'xy.html' is not found, then the webserver tries 'xy.html.gz'. If
'xy.html.gz' is found, then it is served.

> the storage-argument is not really valid, because the
> files are usually compressed anyway (squashfs or jffs2).
> the "slow networks" argument is of course valid.

Agreed.

> bye, bastian
Bastian Bittorf Sept. 7, 2015, 1:45 p.m. UTC | #6
* Adrian Kotelba <adrian.kotelba@gmail.com> [07.09.2015 15:18]:
> > ok, as far as i understand the patch, if somebody
> > tries to http_get file 'xy.html' and the webserver
> > finds 'xy.html.gz' it serves this file, right?
> 
> Right. More precisely, if somebody tries to http_get file 'xy.html'
> and the webserver does find it, then it serves it.
> If 'xy.html' is not found, then webserver tries 'xy.html.gz'. If
> 'xy.html.gz' is found, then it is served.

ok, what you can do is this:

check if HTTP_ACCEPT_ENCODING has some of
gzip, deflate, sdch or whatever you think makes sense
and live-compress if enabled via uci and it _is_ a
static file. (i think the cgi-case is more complicated)
also take care of the headers, you must set something like:

'Content-Encoding: gzip'

unsure what to do with 'content length' because you
only know this after compression, but you should do it
chunkwise...

bye, bastian
Adrian Kotelba Sept. 7, 2015, 4:28 p.m. UTC | #7
2015-09-07 16:45 GMT+03:00 Bastian Bittorf <bittorf@bluebottle.com>:
> * Adrian Kotelba <adrian.kotelba@gmail.com> [07.09.2015 15:18]:
>> > ok, as far as i understand the patch, if somebody
>> > tries to http_get file 'xy.html' and the webserver
>> > finds 'xy.html.gz' it serves this file, right?
>>
>> Right. More precisely, if somebody tries to http_get file 'xy.html'
>> and the webserver does find it, then it serves it.
>> If 'xy.html' is not found, then webserver tries 'xy.html.gz'. If
>> 'xy.html.gz' is found, then it is served.
>
> ok, what you can do is this:
>
> check if HTTP_ACCEPT_ENCODING has some of
> gzip, deflate, sdch or whatever you think makes sense
> and live-compress if enabled via uci and it _is_ a
> static file. (i think the cgi-case is more complicated)
> also take care of the headers, you must set something like:
>
> 'Content-Encoding: gzip'
>
> unsure what to do with 'content length' because you
> only know this after compression, but you should do it
> chunkwise...
>
> bye, bastian

Thanks for the hints. So, in other words, you propose to compress the
files on the fly, right?
I am afraid that low-end devices may not have enough CPU power to do that.

Furthermore, we would probably need to use the zlib library. I will,
nevertheless, think about it.

Adrian
Bastian Bittorf Sept. 7, 2015, 5:14 p.m. UTC | #8
* Adrian Kotelba <adrian.kotelba@gmail.com> [07.09.2015 18:57]:
> Thanks for the hints. So, in other words, you propose to compress the
> files on the fly, right?
> I am afraid that low-end devices may not have enough CPU power to do that.

this is really fast. gzip compresses at ~1 megabyte/sec on a typical router.
this means 1/10 sec for a typical large 100k html-file (which compresses to
only 1/8 of the original size) - that's ok for your usecase.

bye, bastian

Patch

diff --git a/file.c b/file.c
index 480c40b..81f8186 100644
--- a/file.c
+++ b/file.c
@@ -136,6 +136,7 @@  uh_path_lookup(struct client *cl, const char *url)
  int docroot_len = strlen(docroot);
  char *pathptr = NULL;
  bool slash;
+ bool precompressed = 0;

  int i = 0;
  int len;
@@ -191,11 +192,26 @@  uh_path_lookup(struct client *cl, const char *url)
  continue;

  /* test current path */
- if (stat(path_phys, &p.stat))
+ if (stat(path_phys, &p.stat) == 0) {
+ snprintf(path_info, sizeof(path_info), "%s", uh_buf + i);
+ break;
+ }
+
+ pathptr = path_phys + strlen(path_phys);
+
+ /* try to locate precompressed file */
+ len = path_phys + sizeof(path_phys) - pathptr - 1;
+ if (strlen(".gz") > len)
  continue;

- snprintf(path_info, sizeof(path_info), "%s", uh_buf + i);
- break;
+ strcpy(pathptr, ".gz");
+ if (stat(path_phys, &p.stat) == 0) {
+ snprintf(path_info, sizeof(path_info), "%s", uh_buf + i);
+ precompressed = 1;
+ break;
+ }
+
+ *pathptr = 0;
  }

  /* check whether found path is within docroot */
@@ -210,6 +226,7 @@  uh_path_lookup(struct client *cl, const char *url)
  p.phys = path_phys;
  p.name = &path_phys[docroot_len];
  p.info = path_info[0] ? path_info : NULL;
+ p.compressed = precompressed;
  return &p;
  }

@@ -258,9 +275,27 @@  uh_path_lookup(struct client *cl, const char *url)
  *pathptr = 0;
  }

+ /* try to locate precompressed index file */
+ len = path_phys + sizeof(path_phys) - pathptr - 1;
+ list_for_each_entry(idx, &index_files, list) {
+ if (strlen(idx->name) + strlen(".gz") > len)
+ continue;
+
+ strcpy(pathptr, idx->name);
+ strcpy(pathptr + strlen(idx->name), ".gz");
+ if (!stat(path_phys, &s) && (s.st_mode & S_IFREG)) {
+ memcpy(&p.stat, &s, sizeof(p.stat));
+ precompressed = 1;
+ break;
+ }
+
+ *pathptr = 0;
+ }
+
  p.root = docroot;
  p.phys = path_phys;
  p.name = &path_phys[docroot_len];
+ p.compressed = precompressed;

  return p.phys ? &p : NULL;
 }
@@ -561,6 +596,8 @@  static void uh_file_free(struct client *cl)

 static void uh_file_data(struct client *cl, struct path_info *pi, int fd)
 {
+ static char name[PATH_MAX];
+
  /* test preconditions */
  if (!uh_file_if_modified_since(cl, &pi->stat) ||
  !uh_file_if_match(cl, &pi->stat) ||
@@ -576,8 +613,15 @@  static void uh_file_data(struct client *cl,
struct path_info *pi, int fd)
  /* write status */
  uh_file_response_200(cl, &pi->stat);

+ strcpy(name, pi->name);
+
+ if (pi->compressed) {
+ name[strlen(name) - strlen(".gz")] = 0;
+ ustream_printf(cl->us, "Content-Encoding: gzip\r\n");
+ }
+
  ustream_printf(cl->us, "Content-Type: %s\r\n",
-   uh_file_mime_lookup(pi->name));
+   uh_file_mime_lookup(name));

  ustream_printf(cl->us, "Content-Length: %" PRIu64 "\r\n\r\n",
    pi->stat.st_size);
diff --git a/uhttpd.h b/uhttpd.h
index fbcb1ed..7b580e4 100644
--- a/uhttpd.h
+++ b/uhttpd.h
@@ -140,6 +140,7 @@  struct path_info {
  const char *query;
  const char *auth;
  bool redirected;
+ bool compressed;
  struct stat stat;
  const struct interpreter *ip;
 };