Documents/uHTTP Lib Uploads


MicroHTTP Library: Uploading Files

The ability to upload files can significantly improve the maintainability of your webserver-based application. You may, for example, allow users to upload new firmware versions. Unfortunately, compared to other features of the MicroHTTP library, file uploads require considerably more application code.

Web Content

The following HTML code presents a form that lets the user replace an image displayed on the same page.

<html>
<head>
<title>File Upload</title>
</head>
<body>
<h1>Upload Sample</h1>
<p>Upload PNG image</p>
<form action="upload.cgi" method="post" enctype="multipart/form-data">
  <table cellspacing="10" width="100%">
    <tr><td><input type="file" size="64" name="image"></td></tr>
    <tr><td><input type="submit" name="upload" value="Send" onClick="return confirm('Are you sure?');"></td></tr>
  </table>
</form>
<img src="image.png" />
</body>
</html>

When clicking the Send button, the browser will query the user for confirmation (see onClick) and, if confirmed, transfer the data with the POST method to the CGI script upload.cgi.

Application Code

The application must process the form data with a CGI function, registered under the name upload.cgi. The code required to start the webserver is similar to what we used when explaining the common gateway interface in general.

StreamInit();
MediaTypeInitDefaults();
HttpRegisterCgiFunction("upload.cgi", CgiUpload);
HttpRegisterMediaType("cgi", NULL, NULL, HttpCgiFunctionHandler);
StreamClientAccept(HttpdClientHandler, NULL);

The CGI function is kept simple:

#define HTTP_ROOT   (http_root ? http_root : HTTP_DEFAULT_ROOT)
extern char *http_root;

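int CgiUpload(HTTPD_SESSION *hs)
{
    char *upname;
    char *lclname;

    /* sizeof() already counts the terminating NUL of the file name. */
    lclname = malloc(strlen(HTTP_ROOT) + sizeof("image.png"));
    if (lclname) {
        strcat(strcpy(lclname, HTTP_ROOT), "image.png");
        /* UploadFile returns the allocated upload file name or NULL. */
        upname = UploadFile(hs, lclname);
        free(upname);
        free(lclname);
    }
    HttpSendRedirection(hs, 303, "/index.html", NULL);

    return 0;
}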

It calls UploadFile to do all the work, and this is where things become a bit more complicated. When using a form with an input field of type file, we must also declare the encoding type of the form as multipart/form-data. This enables the browser to send both the data entered by the user and the contents of the given file.

Multipart form data is, as its name implies, sent in parts, which are separated by a boundary string. The real trouble is that the browser only informs the server about the overall content length, so the server does not know the size of each part. As long as text lines are transferred, it is not too hard to detect the next boundary string that marks the end of each part. When binary data is transferred, as in our case with images, detecting the next boundary without significantly degrading the transfer speed can become tricky.
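To see why ordinary string functions are not enough for binary parts, consider the following minimal sketch. The helper find_boundary and the sample data are made up for illustration and are not part of the MicroHTTP library. The helper scans a buffer of known length for the delimiter, so it still works when the data contains NUL bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the library: return the offset of
 * the first occurrence of delim within buf[0..len), or -1 if absent.
 * The buffer is treated as raw bytes, so embedded NULs are harmless.
 */
long find_boundary(const unsigned char *buf, size_t len, const char *delim)
{
    size_t dlen;
    size_t i;

    assert(buf != NULL && delim != NULL);
    dlen = strlen(delim);
    if (dlen == 0 || dlen > len) {
        return -1;
    }
    for (i = 0; i + dlen <= len; i++) {
        if (memcmp(buf + i, delim, dlen) == 0) {
            return (long) i;
        }
    }
    return -1;
}

/* Made-up part content: binary bytes, then CRLF and a boundary line. */
static const unsigned char sample_part[] = "PNG\0\x01\x02\r\n--XyZ--\r\n";
```

Calling strstr on sample_part would stop at the NUL byte and miss the boundary, while find_boundary locates it. A real stream reader has the additional problem that the boundary may be split across two reads, which this sketch does not address.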

I will first present the full code of the upload function before turning to a few details.

#define MAX_UPSIZE  1460
#define MIN(a, b) ((a) < (b) ? (a) : (b))

char *UploadFile(HTTPD_SESSION *hs, char *path)
{
    char *rp = NULL;
    char *upname = NULL;
    long avail;
    char *line;
    char *delim;
    const char *sub_ptr;
    int sub_len;
    int fd = -1;
    int got = 0;
    HTTP_STREAM *stream = hs->s_stream;
    HTTP_REQUEST *req = &hs->s_req;

    /* Retrieve the boundary string. */
    delim = GetMultipartBoundary(req);
    if (delim == NULL) {
        return NULL;
    }

    avail = req->req_length;
    line = malloc(MIN(avail, MAX_UPSIZE) + 1);
    if (line == NULL) {
        /* No memory. */
        free(delim);
        return NULL;
    }

    /* If we have a delimiter, then process the boundary content. */
    while (avail > 0) {
        /* Parse the next boundary header. */
        if (HttpParseMultipartHeader(hs, delim, &avail)) {
            /* Broken connection. */
            break;
        }
        /* Ignore headers without content disposition line. */
        if (req->req_bnd_dispo) {
            /* Retrieve the name of the form item. */
            sub_ptr = HttpArgValueSub(req->req_bnd_dispo, "name", &sub_len);
            if (sub_ptr) {
                /* The item named 'image' contains the binary data of the file. */
                if (strncasecmp(sub_ptr, "image", sub_len) == 0) {
                    int eol = 0;

                    /* Get the upload file name. */
                    sub_ptr = HttpArgValueSub(req->req_bnd_dispo, "filename", &sub_len);
                    if (sub_ptr && sub_len) {
                        upname = malloc(sub_len + 1);
                        if (upname) {
                            memcpy(upname, sub_ptr, sub_len);
                            upname[sub_len] = 0;
                            /* Open the local file that the caller has provided. */
#ifdef NUT_OS
                            fd = _open(path, _O_CREAT | _O_TRUNC | _O_RDWR | _O_BINARY);
#else
                            fd = _open(path, _O_CREAT | _O_TRUNC | _O_RDWR | _O_BINARY, _S_IREAD | _S_IWRITE);
#endif
                            if (fd == -1) {
                                printf("Error %d opening %s\n", errno, path);
                            } else {
                                printf("Uploading %s\n", upname);
                            }
                        }
                    }
                    /* Receive the binary data. */
                    while (avail) {
                        /* Read until the next boundary line. */
                        got = StreamReadUntilString(stream, delim, line, MIN(avail, MAX_UPSIZE));
                        if (got <= 0) {
                            break;
                        }
                        avail -= got;
                        /* Write data to the local file, if one had been opened. */
                        if (fd != -1) {
                            /* A CRLF held back from the previous chunk was
                               file data after all, so write it now. */
                            if (eol) {
                                _write(fd, "\r\n", 2);
                                eol = 0;
                            }
                            /* Hold back a trailing CRLF. It may belong to
                               the boundary line that follows. */
                            if (got >= 2 && line[got - 2] == '\r' && line[got - 1] == '\n') {
                                eol = 1;
                                got -= 2;
                            }
                            _write(fd, line, got);
                        }
                    }
                    /* Close the file and mark the descriptor as unused. */
                    if (fd != -1) {
                        _close(fd);
                        fd = -1;
                    }
                    if (got < 0) {
                        /* Broken connection. */
                        break;
                    }
                    rp = upname;
                }
                else if (strncasecmp(sub_ptr, "upload", sub_len) == 0) {
                    got = StreamReadUntilChars(hs->s_stream, "\n", "\r", line, MIN(avail, MAX_UPSIZE));
                    if (got <= 0) {
                        break;
                    }
                }
            }
        }
    }
    if (fd != -1) {
        _close(fd);
    }
    free(delim);
    free(line);

    return rp;
}

Note that this function reads directly from the data stream transmitted by the web browser. First, it calls GetMultipartBoundary to retrieve the boundary string that delimits each MIME part. It then allocates a read buffer whose size is limited to either the total content length or the maximum defined by MAX_UPSIZE, whichever is smaller. This prevents the application from running out of memory on tiny embedded systems.
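The buffer size calculation can be tried in isolation with the small sketch below. The helper upload_buffer_size is hypothetical and only mirrors the malloc argument used in UploadFile. The value 1460 happens to equal the TCP payload of a standard Ethernet frame (1500 bytes minus 20 bytes each for the IP and TCP headers), which may be why it was chosen, although the text does not say so.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_UPSIZE  1460
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical helper: the number of bytes UploadFile would request
 * from malloc for a given content length. The extra byte is taken
 * over from the original code, presumably as room for a terminator.
 */
size_t upload_buffer_size(long content_length)
{
    assert(content_length >= 0);
    return (size_t) MIN(content_length, (long) MAX_UPSIZE) + 1;
}
```

A small form post of 100 bytes gets a buffer of 101 bytes, while a firmware image of several megabytes is still handled with only 1461 bytes of heap.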

After the buffer has been allocated, the function loops until all data from the browser has been consumed. At the start of each iteration, HttpParseMultipartHeader is called to parse the MIME header of the next part. This routine stores the result in the HTTP_REQUEST structure. Most notably, the Content-Disposition header is stored in a buffer to which req->req_bnd_dispo points. The application can request specific items from this header by calling the library function

const char *HttpArgValueSub(const char *str, const char *name, int *len);

If an item named image appears, HttpArgValueSub() is called to retrieve the filename. To receive the binary data (uploaded image file), the function

int StreamReadUntilString(HTTP_STREAM *sp, const char *delim, char *buf, int siz);

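is called, which delivers data up to the next multipart boundary string. Because the CRLF that precedes a boundary line belongs to the delimiter rather than to the file, the loop holds back a trailing CRLF and writes it only when more data follows. Once the boundary is reached, the outer loop is repeated, looking for the next MIME header.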
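To make the role of the item lookup more concrete, here is a minimal stand-in that can be run without the library. The function demo_arg_value is hypothetical and is not the library's HttpArgValueSub. It only assumes the same calling convention as the prototype shown above and handles nothing but the simple name="value" form.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for illustration only. Returns a pointer to
 * the value of the item called name inside str and stores the length
 * of the value in *len. Returns NULL if the item is missing.
 */
const char *demo_arg_value(const char *str, const char *name, int *len)
{
    size_t nlen;
    const char *p;
    const char *end;

    assert(str != NULL && name != NULL && len != NULL);
    nlen = strlen(name);
    if (nlen == 0) {
        return NULL;
    }
    for (p = strstr(str, name); p != NULL; p = strstr(p + 1, name)) {
        /* Reject matches inside longer names, e.g. "name" in "filename". */
        int at_start = (p == str || p[-1] == ' ' || p[-1] == ';');

        if (at_start && p[nlen] == '=' && p[nlen + 1] == '"') {
            p += nlen + 2;
            end = strchr(p, '"');
            if (end == NULL) {
                return NULL;
            }
            *len = (int) (end - p);
            return p;
        }
    }
    return NULL;
}
```

For a header value such as form-data; name="image"; filename="logo.png", asking for name yields a pointer to image with a length of 5. The result is not NUL-terminated, which is why UploadFile copies the file name with memcpy and appends the terminator itself.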

I mentioned above that GetMultipartBoundary is called to determine the boundary string. This routine must be provided by the application code too.

char *GetMultipartBoundary(HTTP_REQUEST *req)
{
    char *rp = NULL;
    const char *bptr;
    int blen;

    /* Make sure this is a multipart post. */
    if (CheckForPost(req, 1) == 0) {
        /* Retrieve the boundary string. */
        bptr = HttpArgValueSub(req->req_type, "boundary", &blen);
        if (bptr) {
            /* Build a delimiter string. */
            rp = malloc(blen + 3);
            if (rp) {
                rp[0] = '-';
                rp[1] = '-';
                memcpy(rp + 2, bptr, blen);
                rp[blen + 2] = '\0';
            }
        }
    }
    return rp;
}
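The delimiter construction can be illustrated with a concrete header value. The following sketch is self-contained and does not use the library. The helper delimiter_from_type is hypothetical, matches the boundary parameter case-sensitively, and ignores quoted boundary values, so it is simpler than what a real parser needs.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the boundary parameter from a
 * Content-Type value and prepend "--", as every delimiter line in
 * the body does. The caller must free() the result.
 */
char *delimiter_from_type(const char *content_type)
{
    static const char key[] = "boundary=";
    const char *bptr;
    size_t blen;
    char *rp;

    assert(content_type != NULL);
    bptr = strstr(content_type, key);
    if (bptr == NULL) {
        return NULL;
    }
    bptr += sizeof(key) - 1;
    /* The value ends at the next separator or at the end of the string. */
    blen = strcspn(bptr, "; \t\r\n");
    if (blen == 0) {
        return NULL;
    }
    rp = malloc(blen + 3);
    if (rp) {
        memcpy(rp, "--", 2);
        memcpy(rp + 2, bptr, blen);
        rp[blen + 2] = '\0';
    }
    return rp;
}
```

Given multipart/form-data; boundary=----abc123, the result is ------abc123. The last delimiter line in the body carries two additional dashes at its end, which marks the end of the multipart data.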

Before reading any data, GetMultipartBoundary calls CheckForPost to make sure that POST data is indeed available.

int CheckForPost(HTTP_REQUEST * req, int type)
{
    if (req->req_method != HTTP_METHOD_POST || req->req_type == NULL) {
        /* Bad method, POST expected. */
        return -1;
    }
    if (type == 0 && strncasecmp(req->req_type, "application/x-www-form-urlencoded", 33) == 0) {
        return 0;
    }
    if (strncasecmp(req->req_type, "multipart/form-data", 19) == 0) {
        return 0;
    }
    /* Bad content. */
    return -1;
}
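The meaning of the type parameter is easiest to see by trying a few combinations. The sketch below repeats the logic of CheckForPost with simplified stand-in types, so that it compiles without the library. DEMO_REQUEST, the DEMO_METHOD constants and DemoCheckForPost are hypothetical names, and strncasecmp is assumed to come from the POSIX header strings.h.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>    /* strncasecmp (POSIX) */

/* Simplified stand-ins for the library's request structure. */
enum { DEMO_METHOD_GET, DEMO_METHOD_POST };

typedef struct {
    int req_method;
    const char *req_type;   /* Content-Type value or NULL */
} DEMO_REQUEST;

/* Same checks as CheckForPost, applied to the stand-in structure. */
int DemoCheckForPost(const DEMO_REQUEST *req, int type)
{
    assert(req != NULL);
    if (req->req_method != DEMO_METHOD_POST || req->req_type == NULL) {
        return -1;
    }
    /* URL encoded forms are accepted only if type is 0. */
    if (type == 0 && strncasecmp(req->req_type, "application/x-www-form-urlencoded", 33) == 0) {
        return 0;
    }
    /* Multipart forms are accepted for any value of type. */
    if (strncasecmp(req->req_type, "multipart/form-data", 19) == 0) {
        return 0;
    }
    return -1;
}
```

With type set to 1, as in GetMultipartBoundary, only multipart posts pass the check. With type set to 0, both encodings are accepted.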

Next Step

So far we have used CGI functions to deliver content to the web browser for immediate display. Another technique is to send data asynchronously to JavaScript.