Web Servers and Concurrency

A journey to understand how web servers handle concurrency with limited resources, using simple, well-known, yet clever techniques.

As we climb the engineering ladder, we need to understand the internals of the software, tools, and libraries that we use in our day-to-day work. This improves our ability to use them efficiently and also helps us debug things in the face of production issues. A web server is one such piece of software: we use it in most of the projects we work on and simply expect it to work (and scale) automatically.

Also, this understanding will help you answer some of those frequently asked interview questions. 😉

In technical terms, a web server is a computer program that serves files (usually web pages) to clients over the internet. In simpler terms, a web server is like a waiter in a restaurant: it takes orders (requests) from clients and brings the requested dishes (responses) from the restaurant kitchen.

You have probably used one or more web servers, such as Tomcat, Jetty, or Node.js, at some point.

With this basic idea in place, let's get to the point and see how a web server handles concurrent requests from multiple clients. Imagine this as the waiter handling multiple orders from multiple tables in the restaurant.

The first step is to understand how a single request is handled (a code sketch follows this list):

  • The incoming request is received by the server: The server listens on a specific port for incoming connections. When a request arrives, the server accepts the connection, which gives it a new socket dedicated to handling that request.

  • The request is parsed: The server reads the incoming request data and parses it to extract important information like the HTTP method, headers, and request body.

  • The request is processed: Once the request is parsed, the server determines the appropriate action to take based on the request data. This might involve routing the request to the appropriate handler, retrieving data from a database, or performing some other operation.

  • The response is generated: After processing the request, the server generates an HTTP response that includes a status code, headers, and a response body.

  • The response is sent: Finally, the server sends the response back to the client over the socket connection. This involves writing the response data to the socket and flushing the socket to ensure that all data is sent.
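
To make those five steps concrete, here is a minimal, hypothetical sketch in plain Java (standard library only). This is not how production servers like Tomcat are implemented internally, but it walks through the same lifecycle: listen, accept, parse, process, generate, and send.

```java
import java.io.*;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class MinimalHttpServer {
    public static void main(String[] args) throws IOException {
        // Step 1: listen on a fixed port and accept incoming connections.
        try (ServerSocket listener = new ServerSocket(8080)) {
            while (true) {
                try (Socket connection = listener.accept()) {
                    handle(connection);
                } catch (IOException e) {
                    // a misbehaving client shouldn't take the whole server down
                }
            }
        }
    }

    static void handle(Socket connection) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.US_ASCII));

        // Step 2: parse the request line, e.g. "GET /hello HTTP/1.1" ...
        String requestLine = in.readLine();
        if (requestLine == null) return;
        String[] parts = requestLine.split(" ");
        String method = parts[0];
        String path = parts.length > 1 ? parts[1] : "/";

        // ... and the headers, which end at the first blank line.
        String header;
        while ((header = in.readLine()) != null && !header.isEmpty()) {
            // A real server would store these; we just skip them.
        }

        // Step 3: process the request (here, trivial routing on method + path).
        boolean found = method.equals("GET") && path.equals("/hello");
        String body = found ? "Hello, client!" : "Not found";
        int status = found ? 200 : 404;

        // Steps 4 and 5: generate the response and write it back over the socket.
        byte[] bytes = body.getBytes(StandardCharsets.UTF_8);
        OutputStream out = connection.getOutputStream();
        out.write(("HTTP/1.1 " + status + (status == 200 ? " OK" : " Not Found") + "\r\n"
                + "Content-Type: text/plain\r\n"
                + "Content-Length: " + bytes.length + "\r\n"
                + "Connection: close\r\n\r\n").getBytes(StandardCharsets.US_ASCII));
        out.write(bytes);
        out.flush(); // ensure all data is sent before the socket is closed
    }
}
```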

Oh, did it say the server creates a new socket for every request it receives? What exactly happens under the hood?

  • For the server to handle the request, it must establish a two-way communication channel with the client. This is typically done using a network socket, which is an endpoint for sending and receiving data over a network.

  • So, when a request hits a web server, the operating system hands the server a new socket for that connection (via the accept() system call). The listening socket stays bound to the server's fixed port and keeps waiting for new connections; the new socket represents just this one client's connection and shares that same local port.

  • Once the connection is established, the client and server can exchange data over the socket. The client sends the request to the server over the socket, and the server reads the request data from the socket. Similarly, when the server generates a response, it sends the response data back to the client over the same socket connection.
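
For the other side of that exchange, here is a hypothetical client sketch: it opens a socket to the server above (the host and port are assumptions matching the earlier example), writes a raw HTTP request over it, and reads the response back over the same connection.

```java
import java.io.*;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class MinimalHttpClient {
    public static void main(String[] args) throws IOException {
        // Connect to the server; the OS picks an ephemeral local port for us.
        try (Socket socket = new Socket("localhost", 8080)) {
            // Send the request over the socket ...
            OutputStream out = socket.getOutputStream();
            out.write(("GET /hello HTTP/1.1\r\n"
                    + "Host: localhost\r\n"
                    + "Connection: close\r\n\r\n").getBytes(StandardCharsets.US_ASCII));
            out.flush();

            // ... and read the response back over the same socket.
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(socket.getInputStream(), StandardCharsets.US_ASCII));
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}
```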

But I heard that a web server generally runs on a single port. How can it bind a new port for every request?

  • It doesn't. The web server binds one fixed port and listens on it; every accepted connection gets its own socket, but all of those sockets share that same local port. The operating system tells connections apart by the full four-tuple of (client IP, client port, server IP, server port), so thousands of connections can coexist on port 80 or 443 as long as each client endpoint is unique. No extra server-side port is involved, and all of this is transparent to the client.
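
You can see this for yourself with a few lines of Java: every socket returned by accept() reports the same local port as the listener; only the remote (client) address differs. A minimal, hypothetical demo:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class SamePortDemo {
    public static void main(String[] args) throws IOException {
        try (ServerSocket listener = new ServerSocket(8080)) {
            while (true) {
                try (Socket connection = listener.accept()) {
                    // Always prints 8080: accepted sockets share the listening port.
                    // The OS distinguishes connections by the client's IP and port.
                    System.out.println("local port: " + connection.getLocalPort()
                            + ", client: " + connection.getRemoteSocketAddress());
                }
            }
        }
    }
}
```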

Okay, that makes sense. That means a web server can handle concurrent requests by giving each connection its own socket, with all of those sockets multiplexed over the same listening port.

In addition to this, web servers apply certain well-known techniques to handle a huge number of concurrent requests:

  • Web servers typically handle concurrent requests using a combination of multi-threading and/or event-driven programming (sketches of both follow this list).

    • Multi-threading is a technique that allows a single process to execute multiple threads of code simultaneously. Each thread can handle a separate request, allowing the server to process multiple requests concurrently. When a web server receives a new request, it typically spawns a new thread (or, more commonly, borrows one from a pre-created thread pool) to handle the request, while the main thread continues to listen for new incoming requests.

    • Event-driven programming is another technique used to handle concurrent requests in web servers. In an event-driven system, the server waits for events to occur and responds to them as they happen. When a new request arrives, the server generates an event and registers a callback function to handle the event. This allows the server to handle multiple requests concurrently, without having to create a new thread for each request.
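
A hypothetical thread-per-request version of the earlier sketch needs only a small change: the main thread does nothing but accept connections, and each accepted socket is handed off to a worker (here a fixed pool of 200 threads, an arbitrary number chosen for illustration).

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadedServer {
    public static void main(String[] args) throws IOException {
        ExecutorService workers = Executors.newFixedThreadPool(200);
        try (ServerSocket listener = new ServerSocket(8080)) {
            while (true) {
                Socket connection = listener.accept();    // main thread only accepts
                workers.submit(() -> handle(connection)); // a worker serves the request
            }
        }
    }

    static void handle(Socket connection) {
        try (Socket c = connection) {
            // Parse, process, and respond as in the single-request sketch above;
            // here we just return a canned response.
            c.getOutputStream().write(("HTTP/1.1 200 OK\r\n"
                    + "Content-Length: 2\r\nConnection: close\r\n\r\nOK").getBytes());
        } catch (IOException e) {
            // log and drop this connection; other requests are unaffected
        }
    }
}
```

An event-driven server, by contrast, can multiplex many connections on a single thread. This hypothetical sketch uses Java NIO's Selector as the event loop: it blocks until some socket is ready, then reacts to each readiness event, never blocking on any single connection. (A real server would buffer partial reads and writes; that bookkeeping is omitted here.)

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.*;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

public class EventLoopServer {
    public static void main(String[] args) throws IOException {
        Selector selector = Selector.open();
        ServerSocketChannel listener = ServerSocketChannel.open();
        listener.bind(new InetSocketAddress(8080));
        listener.configureBlocking(false);
        listener.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select(); // block until at least one channel is ready
            Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
            while (keys.hasNext()) {
                SelectionKey key = keys.next();
                keys.remove();
                if (key.isAcceptable()) {
                    // New connection: register it for read events on the same loop.
                    SocketChannel client = listener.accept();
                    if (client == null) continue;
                    client.configureBlocking(false);
                    client.register(selector, SelectionKey.OP_READ);
                } else if (key.isReadable()) {
                    // Data arrived: consume the request, write a canned response.
                    SocketChannel client = (SocketChannel) key.channel();
                    ByteBuffer buffer = ByteBuffer.allocate(4096);
                    if (client.read(buffer) == -1) {
                        client.close();
                        continue;
                    }
                    String body = "Hello from the event loop!";
                    client.write(ByteBuffer.wrap(("HTTP/1.1 200 OK\r\n"
                            + "Content-Length: " + body.length() + "\r\n"
                            + "Connection: close\r\n\r\n" + body)
                            .getBytes(StandardCharsets.US_ASCII)));
                    client.close();
                }
            }
        }
    }
}
```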

But ports are limited in number, right? Does that affect concurrent request handling?

  • Port numbers are indeed limited (at most 65,535 per IP address), but as we saw above, a server does not consume a new port per connection; its listening port is shared by all accepted sockets. On the server side, the practical limits are file descriptors and memory, since each open connection consumes both. Port exhaustion matters mainly for clients, which use one ephemeral port per outgoing connection, and for servers that themselves open many outbound connections. Modern operating systems also provide mechanisms to manage port allocation and make the most of the available range.

  • For example, HTTP servers and networking frameworks like Netty lean on a technique called "connection reuse" (HTTP keep-alive, often paired with client-side "connection pooling") to minimize the number of connections, and thus client-side ports, consumed by a stream of requests.

    • With connection reuse, subsequent requests travel over an already-established socket instead of a new socket connection being created for each request. This reduces connection churn and the number of client-side ports in use for a given number of concurrent requests, and helps to optimize resource utilization.

  • In addition, modern operating systems provide socket options for port reuse: SO_REUSEADDR lets a server rebind its listening port even while connections from a previous run still linger in the TIME_WAIT state (useful for fast restarts), and SO_REUSEPORT (on Linux and the BSDs) even allows multiple sockets to listen on the same port so the kernel can spread incoming connections across them. Both improve the scalability and restartability of web servers; a sketch follows below.
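
As a hypothetical sketch of both ideas in Java: setReuseAddress(true) below corresponds to the SO_REUSEADDR socket option, and the inner loop keeps serving requests over one connection (keep-alive) instead of opening a fresh socket per request. The helper name serveOneRequest is my own, not a standard API.

```java
import java.io.*;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

public class KeepAliveServer {
    public static void main(String[] args) throws IOException {
        ServerSocket listener = new ServerSocket();
        // SO_REUSEADDR: allow rebinding the port even while old connections
        // from a previous run linger in TIME_WAIT (handy when restarting).
        listener.setReuseAddress(true);
        listener.bind(new InetSocketAddress(8080));

        while (true) {
            try (Socket connection = listener.accept()) {
                BufferedReader in = new BufferedReader(new InputStreamReader(
                        connection.getInputStream(), StandardCharsets.US_ASCII));
                // Keep-alive: keep serving requests on this one connection
                // until the client closes it or sends "Connection: close".
                while (serveOneRequest(in, connection.getOutputStream())) {
                    // the next request reuses the same socket, hence the same port
                }
            } catch (IOException e) {
                // client went away mid-request; move on to the next connection
            }
        }
    }

    // Reads one bodyless request and writes one response; returns false
    // when the connection should be closed.
    static boolean serveOneRequest(BufferedReader in, OutputStream out) throws IOException {
        String requestLine = in.readLine();
        if (requestLine == null) return false; // client closed the connection

        boolean close = false;
        String header;
        while ((header = in.readLine()) != null && !header.isEmpty()) {
            if (header.equalsIgnoreCase("Connection: close")) close = true;
        }

        String body = "Hello again!";
        out.write(("HTTP/1.1 200 OK\r\n"
                + "Content-Length: " + body.length() + "\r\n"
                + "Connection: " + (close ? "close" : "keep-alive") + "\r\n\r\n" + body)
                .getBytes(StandardCharsets.US_ASCII));
        out.flush();
        return !close;
    }
}
```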

I hope this explanation helps you understand some of the internal workings of a web server and sparks more curiosity to dig into such things.

Thank you for reading this one and stay tuned for many such insightful posts coming your way soon.