html - working - multipart/form-data header
What does enctype='multipart/form-data' mean, and when should we use it?
Quentin's answer is right: use multipart/form-data if the form contains a file upload, and application/x-www-form-urlencoded otherwise, which is the default if you omit enctype.
I'm going to:
- add some more HTML5 references
- explain why he is right with a form submit example
There are three possibilities for enctype:
- application/x-www-form-urlencoded
- multipart/form-data (spec points to RFC7578)
- text/plain. This is "not reliably interpretable by computer", so it should never be used in production, and we will not look further into it.
How to generate the examples
Once you see an example of each method, it becomes obvious how they work, and when you should use each one.
You can produce examples using:
- nc -l or an ECHO server: HTTP test server accepting GET/POST requests
- a user agent like a browser or cURL
Save the form to a minimal .html file:
    <!DOCTYPE html>
    <html lang="en">
    <head>
      <meta charset="utf-8"/>
      <title>upload</title>
    </head>
    <body>
    <form action="http://localhost:8000" method="post" enctype="multipart/form-data">
      <p><input type="text" name="text1" value="text default">
      <p><input type="text" name="text2" value="aωb">
      <p><input type="file" name="file1">
      <p><input type="file" name="file2">
      <p><input type="file" name="file3">
      <p><button type="submit">Submit</button>
    </form>
    </body>
    </html>
We set the default text value to aωb, where ω is U+03C9, so the whole value is the bytes 61 CF 89 62 in UTF-8.
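A quick way to check those bytes yourself, sketched here in Python (the variable names are only illustrative):

```python
# The text field value: 'a', GREEK SMALL LETTER OMEGA (U+03C9), 'b'.
text = "a\u03c9b"

# Encode to UTF-8 and show each byte in hex.
encoded = text.encode("utf-8")
hex_bytes = " ".join(f"{byte:02X}" for byte in encoded)
print(hex_bytes)  # 61 CF 89 62
```

The single character ω becomes the two bytes CF 89, which is why the three-character string is four bytes long.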
Create files to upload:
    echo 'Content of a.txt.' > a.txt
    echo '<!DOCTYPE html><title>Content of a.html.</title>' > a.html
    # Binary file containing 4 bytes: 'a', 0xCF, 0x89 and 'b'.
    printf 'a\xCF\x89b' > binary
Run our little echo server:
    while true; do printf '' | nc -l localhost 8000; done
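If nc is not available, a rough Python stand-in for that loop might look like the following. This is only a sketch under assumptions: serve_once and serve_forever are hypothetical names, the 0.5 second idle timeout is arbitrary, and unlike the nc loop it replies with an empty 200 response so the browser does not keep waiting.

```python
import socket


def serve_once(server: socket.socket, timeout: float = 0.5) -> bytes:
    """Accept one connection, return the raw bytes it sent, then reply 200 OK."""
    conn, _addr = server.accept()
    chunks = []
    with conn:
        conn.settimeout(timeout)
        try:
            while True:
                chunk = conn.recv(65536)
                if not chunk:  # the client closed its sending side
                    break
                chunks.append(chunk)
        except socket.timeout:
            pass  # nothing arrived for `timeout` seconds: assume the request is complete
        conn.sendall(b"HTTP/1.1 200 OK\r\nContent-Length: 0\r\nConnection: close\r\n\r\n")
    return b"".join(chunks)


def serve_forever(port: int = 8000) -> None:
    """Print every request received on localhost:port, like the nc loop."""
    with socket.create_server(("localhost", port)) as server:
        while True:
            raw = serve_once(server)
            print(raw.decode("utf-8", errors="replace"))
            print("-" * 40)
```

Calling serve_forever() from a script or a Python prompt and then submitting the form should print the same kind of request dump that nc shows.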
Open the HTML file in your browser, select the files, click Submit, and check the terminal.
nc prints the request received.
Tested on: Ubuntu 14.04.3, nc BSD 1.105, Firefox 40.
    POST / HTTP/1.1
    [[ Less interesting headers ... ]]
    Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150
    Content-Length: 834

    -----------------------------735323031399963166993862150
    Content-Disposition: form-data; name="text1"

    text default
    -----------------------------735323031399963166993862150
    Content-Disposition: form-data; name="text2"

    aωb
    -----------------------------735323031399963166993862150
    Content-Disposition: form-data; name="file1"; filename="a.txt"
    Content-Type: text/plain

    Content of a.txt.

    -----------------------------735323031399963166993862150
    Content-Disposition: form-data; name="file2"; filename="a.html"
    Content-Type: text/html

    <!DOCTYPE html><title>Content of a.html.</title>

    -----------------------------735323031399963166993862150
    Content-Disposition: form-data; name="file3"; filename="binary"
    Content-Type: application/octet-stream

    aωb
    -----------------------------735323031399963166993862150--
For the binary file and text field, the bytes 61 CF 89 62 (aωb in UTF-8) are sent literally. You could verify that with nc -l localhost 8000 | hd, which says that the bytes 61 CF 89 62 were sent (61 == 'a' and 62 == 'b').
Therefore it is clear that:
- Content-Type: multipart/form-data; boundary=---------------------------735323031399963166993862150 sets the content type to multipart/form-data and says that the fields are separated by the given boundary string. Note that each delimiter line in the body is that boundary prefixed with two extra dashes (--), and the last one also ends with two dashes.
- every field gets some sub headers before its data: Content-Disposition: form-data;, the field name, the filename, followed by the data.
The server reads the data until the next boundary string. The browser must choose a boundary that will not appear in any of the fields, so this is why the boundary may vary between requests.
Because we have the unique boundary, no encoding of the data is necessary: binary data is sent as is.
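The structure described above can be sketched in Python as follows. This is an illustration, not what any browser actually runs: choose_boundary and build_multipart are hypothetical helpers, the boundary format (27 dashes plus random hex) only imitates the Firefox output shown earlier, and field names or filenames containing quotes are not escaped.

```python
import secrets


def choose_boundary(payloads: list[bytes]) -> str:
    """Pick a random boundary that does not occur in any payload."""
    while True:
        boundary = "-" * 27 + secrets.token_hex(14)  # assumed format, for illustration
        if not any(boundary.encode("ascii") in payload for payload in payloads):
            return boundary


def build_multipart(fields: dict[str, str],
                    files: dict[str, tuple[str, str, bytes]]) -> tuple[str, bytes]:
    """Return (Content-Type header value, request body).

    `files` maps field name -> (filename, content type, raw bytes).
    """
    payloads = [value.encode("utf-8") for value in fields.values()]
    payloads += [data for _, _, data in files.values()]
    boundary = choose_boundary(payloads)
    delimiter = b"--" + boundary.encode("ascii")  # two extra dashes in the body

    body = bytearray()
    for name, value in fields.items():
        body += delimiter + b"\r\n"
        body += f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode("utf-8")
        body += value.encode("utf-8") + b"\r\n"
    for name, (filename, content_type, data) in files.items():
        body += delimiter + b"\r\n"
        body += (f'Content-Disposition: form-data; name="{name}"; '
                 f'filename="{filename}"\r\n').encode("utf-8")
        body += f"Content-Type: {content_type}\r\n\r\n".encode("ascii")
        body += data + b"\r\n"  # raw bytes, copied without any escaping
    body += delimiter + b"--\r\n"  # closing delimiter
    return f"multipart/form-data; boundary={boundary}", bytes(body)


content_type, body = build_multipart(
    {"text1": "text default", "text2": "a\u03c9b"},
    {"file3": ("binary", "application/octet-stream", b"a\xcf\x89b")},
)
print(content_type)
print(body.decode("utf-8"))
```

The retry loop in choose_boundary is the simplest possible strategy: with a long random boundary a collision is extremely unlikely, so the loop almost always runs once.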
TODO: what is the optimal boundary size (log(N) I bet), and name / running time of the algorithm that finds it? Asked at: https://cs.stackexchange.com/questions/39687/find-the-shortest-sequence-that-is-not-a-sub-sequence-of-a-set-of-sequences
Content-Type is automatically determined by the browser. How it is determined exactly was asked at: How is mime type of an uploaded file determined by browser?
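As a loose analogy only, an extension-based guess with Python's mimetypes module reproduces the three values seen in the request above. Browsers use their own logic, which may differ, and guess_part_content_type is a hypothetical helper name.

```python
import mimetypes


def guess_part_content_type(filename: str) -> str:
    """Guess a Content-Type from the file name, falling back to a generic binary type."""
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or "application/octet-stream"


for name in ("a.txt", "a.html", "binary"):
    print(name, "->", guess_part_content_type(name))
```

The file named binary has no extension, so nothing can be guessed and the generic application/octet-stream is used, matching what Firefox sent for it.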
Now change the enctype to application/x-www-form-urlencoded, reload the browser, and resubmit.
    POST / HTTP/1.1
    [[ Less interesting headers ... ]]
    Content-Type: application/x-www-form-urlencoded
    Content-Length: 51

    text1=text+default&text2=a%CF%89b&file1=a.txt&file2=a.html&file3=binary
Clearly the file data was not sent, only the basenames. So this cannot be used for files.
As for the text field, we see that usual printable characters like a and b were sent in one byte, while non-printable ones like 0xCF and 0x89 took up 3 bytes each: %CF%89.
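The same body can be reproduced with Python's urllib.parse, which is a convenient way to experiment with the encoding. The form dictionary below simply mirrors the field values from the example; for the file inputs only the names are included, as in the request above.

```python
from urllib.parse import parse_qs, urlencode

form = {
    "text1": "text default",
    "text2": "a\u03c9b",
    "file1": "a.txt",
    "file2": "a.html",
    "file3": "binary",
}

# Spaces become '+', and each non-ASCII UTF-8 byte becomes %XX.
body = urlencode(form)
print(body)
# text1=text+default&text2=a%CF%89b&file1=a.txt&file2=a.html&file3=binary

# Decoding reverses the process.
decoded = parse_qs(body)
print(decoded["text2"])  # ['aωb']
```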
File uploads often contain lots of non-printable characters (e.g. images), while text forms almost never do.
From the examples we have seen that:
- multipart/form-data: adds a few bytes of boundary overhead to the message, and must spend some time calculating it, but sends each byte in one byte.
- application/x-www-form-urlencoded: has a single byte boundary per field (&), but adds a linear overhead factor of 3x for every non-printable character.
Therefore, even if we could send files with application/x-www-form-urlencoded, we wouldn't want to, because it is so inefficient.
But for printable characters found in text fields, it does not matter and generates less overhead, so we just use it.
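To put rough numbers on that comparison, the sketch below percent-encodes a few sample payloads and compares sizes. The samples and the urlencoded_size helper are made up for illustration, and quote_from_bytes writes a space as %20 where a form would use +, so the text figure is slightly pessimistic.

```python
from urllib.parse import quote_from_bytes


def urlencoded_size(data: bytes) -> int:
    """Length of `data` after percent-encoding every byte that is not unreserved."""
    return len(quote_from_bytes(data, safe=""))


samples = {
    "ascii text": b"text default",
    "the 4-byte test file": b"a\xcf\x89b",  # raw=4, urlencoded=8
    "every byte value once": bytes(range(256)),
}

for label, data in samples.items():
    print(f"{label}: raw={len(data)} urlencoded={urlencoded_size(data)}")
```

For data that looks like arbitrary binary, most bytes need escaping, so the encoded size approaches three times the raw size, whereas the multipart overhead is a fixed number of bytes per field.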
What does enctype='multipart/form-data' mean in an HTML form and when should we use it?
enctype='multipart/form-data' means that no characters will be encoded, which is why this type is used when uploading files to the server.
multipart/form-data is used when a form requires binary data, like the contents of a file, to be uploaded.
Set the method attribute to POST because file content can't be put inside a URL parameter using a form.
Set the value of enctype to multipart/form-data because the data will be split into multiple parts, one for each file plus one for the text of the form body that may be sent with them.
Usually this is for a POST form that needs to take a file upload as data. It tells the server how the transferred data is encoded; in this case the data is not encoded at all, because the files are simply transferred to the server as they are, for example when uploading an image or a PDF.
When you make a POST request, you have to encode the data that forms the body of the request in some way.
HTML forms provide three methods of encoding.
Work was being done on adding application/json, but that has been abandoned.
The specifics of the formats don't matter to most developers. The important points are:
When you are writing client-side code, all you need to know is to use multipart/form-data when your form includes any <input type="file"> elements.
When you are writing server-side code: Use a prewritten form handling library (e.g. Perl's CGI->param or the one exposed by PHP's $_POST superglobal) and it will take care of the differences for you. Don't bother trying to parse the raw input received by the server.
If you are writing (or debugging) a library for parsing or generating the raw data, then you need to start worrying about the format. You might also want to know about it for interest's sake.
application/x-www-form-urlencoded is more or less the same as a query string on the end of the URL.
multipart/form-data is significantly more complicated but it allows entire files to be included in the data. An example of the result can be found in the HTML 4 specification.
text/plain is introduced by HTML 5 and is useful only for debugging (from the spec: "They are not reliably interpretable by computer"), and I'd argue that the others combined with tools (like the Net tab in the developer tools of most browsers) are better for that.
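To illustrate the advice about leaning on existing parsers rather than splitting the raw input by hand, here is a sketch that delegates to Python's standard library. It rests on assumptions: parse_form is a hypothetical helper, the boundary and body are made up, and the email package is a general MIME parser rather than a dedicated form library, so a real application would normally use its web framework's form handling instead.

```python
from email.parser import BytesParser
from urllib.parse import parse_qs


def parse_form(content_type: str, body: bytes) -> dict:
    """Return {field name: value}; file fields map to (filename, bytes)."""
    if content_type.startswith("application/x-www-form-urlencoded"):
        # Same syntax as a URL query string.
        return {key: values[0] for key, values in parse_qs(body.decode("ascii")).items()}

    # multipart/form-data: let the MIME parser find the boundaries.
    # It expects headers first, so prepend the request's Content-Type.
    message = BytesParser().parsebytes(
        b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    )
    fields = {}
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition")
        data = part.get_payload(decode=True)
        filename = part.get_filename()
        fields[name] = (filename, data) if filename else data.decode("utf-8")
    return fields


# Made-up request body using a hypothetical boundary.
multipart_body = (
    b"--example-boundary-123\r\n"
    b'Content-Disposition: form-data; name="text2"\r\n'
    b"\r\n"
    b"a\xcf\x89b\r\n"
    b"--example-boundary-123\r\n"
    b'Content-Disposition: form-data; name="file1"; filename="a.txt"\r\n'
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"Content of a.txt.\r\n"
    b"--example-boundary-123--\r\n"
)

print(parse_form("multipart/form-data; boundary=example-boundary-123", multipart_body))
print(parse_form("application/x-www-form-urlencoded", b"text1=text+default&text2=a%CF%89b"))
```

Either way the calling code ends up with the same field names and values, which is the point: the difference between the two encodings stays inside the parser.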
- The enctype (ENCode TYPE) attribute specifies how the form data should be encoded when submitting it to the server.
- multipart/form-data is one of the values of the enctype attribute, and is used in form elements that have a file upload. Multi-part means the form data is divided into multiple parts and sent to the server.
- A metaphor for "part": an HTML document has two parts: a head and a body.