Evolution of Server Architectures

Nov 4, 2020

The last few decades have seen major improvements in computer technology. CPU clock speeds have increased from a few megahertz (MHz, 10⁶ cycles per second) to the order of gigahertz (GHz, 10⁹ cycles per second), and memory sizes from a few kilobytes (KB) to the order of gigabytes (GB). In recent years, however, clock speeds have not increased much; instead, the number of CPU cores has been increasing.

According to Wikipedia:
“Multi-core processors are widely used across many application domains, including general-purpose, embedded, network, digital signal processing (DSP), and graphics (GPU). Core count goes up to even dozens, and for specialized chips over 10,000, and in supercomputers (i.e. clusters of chips) the count can go over 10 million.”

Server Architectures

To make use of multi-core CPUs and to solve scalability problems, server architectures have evolved over time. The two primary server architectures in wide use are the thread-based and the event-driven architecture.

Thread-based server architecture

This server architecture has been widely adopted and is based on the thread-per-connection model, in which each request/connection is handled in a separate thread. Usually, the web server runs as a single process, and this process receives requests from web clients.

Internals of thread-based server

As shown in Fig. 1, a thread-based server internally comprises the following:

Acceptor thread

This is a single dispatcher thread that blocks on the socket the server process is listening on, waiting for new connections. Once a connection is established, it is passed on to the connection queue.

Connection queue

This queue holds connections while all the request processing threads are busy executing prior requests. The queue is bounded, and its size determines the maximum number of connections that can wait at a time. Connection requests beyond that size are rejected.

Worker threads (a.k.a. request processing threads)

This is a thread pool that does the actual request processing. Each worker waits for new connections in the queue, picks one up the moment it is available, and executes the request. Once request processing is complete, the thread becomes available for new requests.
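The three pieces above can be sketched in a few lines of Python. This is a toy model, not how Tomcat or any particular server is implemented: connections are plain values rather than sockets, and the names (`ThreadBasedServer`, `accept`, `handled`, `rejected`) are made up for illustration. In a real server, the acceptor thread would call `accept` in a loop after each `socket.accept()`.

```python
import queue
import threading


class ThreadBasedServer:
    """Toy model: acceptor -> bounded connection queue -> worker pool."""

    def __init__(self, workers=2, backlog=4):
        self.connections = queue.Queue(maxsize=backlog)  # bounded connection queue
        self.handled = []    # (worker name, connection) pairs, for inspection
        self.rejected = []   # connections dropped because the queue was full
        self._lock = threading.Lock()
        self._workers = [
            threading.Thread(target=self._work, name=f"worker-{i}", daemon=True)
            for i in range(workers)
        ]

    def start(self):
        for t in self._workers:
            t.start()

    def accept(self, conn):
        """Acceptor role: queue a new connection, or reject it if the queue is full."""
        try:
            self.connections.put_nowait(conn)
            return True
        except queue.Full:
            self.rejected.append(conn)
            return False

    def _work(self):
        while True:
            conn = self.connections.get()   # blocks until a connection is queued
            if conn is None:                # sentinel: shut this worker down
                self.connections.task_done()
                return
            with self._lock:                # "process" the request
                self.handled.append((threading.current_thread().name, conn))
            self.connections.task_done()

    def stop(self):
        for _ in self._workers:
            self.connections.put(None)
        for t in self._workers:
            t.join()
```

With `backlog=4` and no worker running yet, a fifth call to `accept` returns `False`, which mirrors the rejection of connection requests that exceed the queue size.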

How does it work?

During server startup, the number of worker threads created equals the configured minimum thread pool size, which by default is the number of CPU cores. The worker thread pool size at any given time is determined by the backlog in the connection queue: as more requests queue up, more worker threads are instantiated.
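A rough sketch of this growth behaviour follows. The class `GrowingPool`, its `max_threads` cap, and the "grow when a request is still waiting" heuristic are assumptions made for illustration; real servers use their own, more careful policies.

```python
import os
import queue
import threading


class GrowingPool:
    """Worker pool that starts at min_threads and grows with the backlog."""

    def __init__(self, min_threads=None, max_threads=8):
        # Default the minimum to the CPU core count, as described in the text.
        self.min_threads = min_threads or os.cpu_count() or 1
        # Assumption: an upper bound on the pool size (not mentioned in the text).
        self.max_threads = max(max_threads, self.min_threads)
        self.backlog = queue.Queue()
        self.threads = []
        for _ in range(self.min_threads):
            self._spawn()

    def _spawn(self):
        t = threading.Thread(target=self._run, daemon=True)
        self.threads.append(t)
        t.start()

    def _run(self):
        while True:
            task = self.backlog.get()
            if task is None:   # sentinel: stop this worker
                return
            task()

    def submit(self, task):
        """Queue a request; meant to be called from a single (acceptor) thread."""
        self.backlog.put(task)
        # Heuristic: if something is still waiting, assume no worker was free
        # and add one (an idle worker may simply not have woken up yet).
        if self.backlog.qsize() > 0 and len(self.threads) < self.max_threads:
            self._spawn()

    def shutdown(self):
        for _ in self.threads:
            self.backlog.put(None)
        for t in self.threads:
            t.join()
```

If every worker is stuck in a long request, each further `submit` adds a thread until `max_threads` is reached; after that, requests simply wait in the backlog.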

Scalability Limitations

Concurrency is determined by the number of worker threads, although ultimately it depends on the number of CPU cores and on preemptive scheduling by the OS. Constant context switching wastes considerable CPU time.
Worker threads block on synchronous I/O operations. Each blocking operation also triggers rescheduling and a context switch.
Under high traffic, the worker threads consume a significant amount of memory (a fixed stack allocation per thread), since a larger number of worker threads is created.
A lower number of threads improves the performance of each thread but reduces overall scalability, because fewer connections can be served concurrently.
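To put rough numbers on these limits, here is a back-of-the-envelope calculation. It assumes that every request spends a fixed time blocked on I/O, that CPU time is negligible, and that each thread reserves 1 MiB of stack. The stack size is only an illustrative value (the real figure depends on the OS and runtime), and the function name is made up.

```python
def thread_pool_estimates(workers, io_wait_ms, stack_kib=1024):
    """Rough capacity numbers for a thread-per-connection server."""
    # Each worker finishes at most 1000 / io_wait_ms requests per second.
    max_requests_per_s = workers * 1000 / io_wait_ms
    # Every worker thread reserves its own stack, busy or not.
    stack_memory_mib = workers * stack_kib / 1024
    return max_requests_per_s, stack_memory_mib


print(thread_pool_estimates(workers=200, io_wait_ms=100))  # (2000.0, 200.0)
```

Under these assumptions, 200 workers that each wait 100 ms per request top out at about 2,000 requests per second while reserving about 200 MiB of stack. Raising the thread count raises both numbers together.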

Example: Tomcat

Event-driven server architecture

This server architecture is based on an event-driven approach with asynchronous, non-blocking call semantics. It avoids the synchronous blocking I/O used in threaded servers. In this architecture, a single thread is mapped to multiple connections, instead of each connection being mapped to a separate thread.

Internals of event-driven server

As shown in Fig. 2, an event-driven server internally comprises the following:

Event Queue

Incoming requests are broken down into small events (which together represent the handling of the whole request) and are queued up in the event queue.

Event Loop

This is a single thread that picks up events sequentially from the queue and hands them over to the respective event handlers.

Event Handlers

These are responsible for the execution of each event queued up in the event queue. All I/O calls by the event handlers are done asynchronously and are non-blocking.
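Here is a minimal sketch of how the event queue, the event loop, and the handlers relate. The `EventLoop` class, the event names `"request"` and `"respond"`, and the `demo` function are all hypothetical; no real I/O happens here, the point is only the queue-and-dispatch structure.

```python
from collections import deque


class EventLoop:
    """Single-threaded loop: take the next event, call its handler, repeat."""

    def __init__(self):
        self.queue = deque()   # the event queue
        self.handlers = {}     # event type -> handler function

    def on(self, event_type, handler):
        self.handlers[event_type] = handler

    def emit(self, event_type, payload):
        self.queue.append((event_type, payload))

    def run(self):
        while self.queue:
            event_type, payload = self.queue.popleft()
            self.handlers[event_type](self, payload)


def demo():
    log = []

    def on_request(loop, req):
        log.append(f"parse {req}")
        loop.emit("respond", req)   # the next step becomes a new event

    def on_respond(loop, req):
        log.append(f"respond {req}")

    loop = EventLoop()
    loop.on("request", on_request)
    loop.on("respond", on_respond)
    loop.emit("request", "A")
    loop.emit("request", "B")
    loop.run()
    return log
```

Calling `demo()` returns `['parse A', 'parse B', 'respond A', 'respond B']`: the two requests are interleaved on one thread because each is split into small events.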

How does it work?

When a request arrives at the server to fetch data from a remote DB, the event handler dealing with the request makes an asynchronous call to the DB. When the data arrives, the OS notifies the server that the I/O has completed, which results in a new event. This new event is queued up in the event queue, and the event loop processes the events queued ahead of it until its turn comes. Note that the event-driven model works only when the entire pipeline is asynchronous: events are executed as non-blocking I/O, and a notification is sent when the execution is complete.
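The flow above can be tried out with Python's built-in `asyncio` event loop. In this sketch the remote DB is simulated with `asyncio.sleep`; `fetch_from_db`, `handle_request`, and `serve` are made-up names, and a real application would use an asynchronous DB driver instead.

```python
import asyncio
import threading


async def fetch_from_db(key, delay=0.05):
    # Stand-in for a non-blocking DB call: the sleep yields control to the loop.
    await asyncio.sleep(delay)
    return f"row-for-{key}"


async def handle_request(key, log):
    log.append(("start", key, threading.get_ident()))
    row = await fetch_from_db(key)   # handler is suspended; the loop runs other events
    log.append(("done", key, threading.get_ident()))
    return row


async def serve(keys):
    log = []
    # All handlers are driven concurrently by the same event loop thread.
    rows = await asyncio.gather(*(handle_request(k, log) for k in keys))
    return rows, log
```

Running `asyncio.run(serve(["a", "b", "c"]))` records all three "start" entries before any "done" entry, and every entry carries the same thread id. If `fetch_from_db` used a blocking call such as `time.sleep` instead, the whole loop would stall, which is why the entire pipeline has to be asynchronous.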

Scalability Advantages

Having a single event-loop thread, along with OS kernel threads for I/O, eliminates the overhead of excessive context switching and considerably reduces the memory footprint.
Moving away from the thread-per-connection model (as in thread-based servers) saves CPU cycles because of reduced context switching.
Event-driven servers can scale easily under heavy load, with the CPU being the only bottleneck.
With this model, servers can scale until the system resources are completely utilized.
To make use of all the CPU cores, multiple separate server processes are instantiated on a single machine. These instances might share the same server socket. This approach is called “N-Copy” (N instances on a system with N CPU cores).
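The “N-Copy” idea from the last point can be sketched as follows. The sketch assumes a Unix-like OS where `fork` is available, so that child processes inherit the listening socket. Each copy here is a trivial blocking loop rather than a real event loop, and the function names are made up.

```python
import multiprocessing as mp
import os
import socket


def serve_forever(listener):
    """One server copy: accept a connection, reply with our PID, repeat."""
    while True:
        conn, _addr = listener.accept()
        with conn:
            conn.recv(1024)                          # read the (tiny) request
            conn.sendall(str(os.getpid()).encode())  # tell the client who served it


def start_n_copies(n=None, host="127.0.0.1"):
    """Bind one listening socket and fork n processes that all accept on it."""
    n = n or os.cpu_count() or 1
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))       # port 0: let the OS pick a free port
    listener.listen(128)
    ctx = mp.get_context("fork")   # assumption: fork is available (Unix-like OS)
    procs = [ctx.Process(target=serve_forever, args=(listener,), daemon=True)
             for _ in range(n)]
    for p in procs:
        p.start()
    port = listener.getsockname()[1]
    listener.close()               # the parent does not accept; children keep their copies
    return procs, port


def ask(port, host="127.0.0.1"):
    """Client helper: send a request and return the PID of the process that answered."""
    with socket.create_connection((host, port), timeout=5) as s:
        s.sendall(b"ping")
        return int(s.recv(64).decode())
```

After `procs, port = start_n_copies()`, repeated calls to `ask(port)` return PIDs of the child processes, and the kernel decides which copy accepts each connection. Call `p.terminate()` on each process to stop the copies.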

Examples: Netty, Tornado

Threaded and evented architectures are the basic, widely used server architectures. The need for greater scalability has led to further improvements on them and to the creation of hybrid architectures in many frameworks and libraries.

Thanks for reading! Feel free to comment or message me if you have questions or suggestions.

Written by Arijeet Saha