@backstreetbrogrammer
--------------------------------------------------------------------------------
Chapter 02 - Limitation of current servers and Little's Law
--------------------------------------------------------------------------------
- Limitation of current server applications
Server applications generally handle concurrent user requests that are independent of each other, so it makes sense for an application to handle a request by dedicating a thread to that request for its entire duration.
This thread-per-request style is easy to understand, easy to program, and easy to debug and profile because it uses the platform's unit of concurrency to represent the application's unit of concurrency.
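As a rough illustration of the thread-per-request style, here is a minimal, hypothetical sketch (not part of the original notes) of a socket server that dedicates a new platform thread to every accepted connection; the port number, class name, and canned HTTP response are assumptions made only for this example:

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;

public class ThreadPerRequestServer {

    public static void main(final String[] args) throws IOException {
        try (final ServerSocket serverSocket = new ServerSocket(8080)) {
            while (true) {
                final Socket socket = serverSocket.accept();   // wait for the next request
                new Thread(() -> handle(socket)).start();      // dedicate one thread per request
            }
        }
    }

    private static void handle(final Socket socket) {
        try (socket) {
            // read the request, do the work, write the response -
            // the thread is occupied for the entire duration of the request
            socket.getOutputStream().write("HTTP/1.1 200 OK\r\n\r\nOK\r\n".getBytes());
        } catch (final IOException e) {
            e.printStackTrace();
        }
    }
}
```

The point is not the I/O details but that the handling thread is held for the full lifetime of the request, which is exactly what makes the style easy to reason about, and what Little's Law below turns into a scaling constraint.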
- Little's Law
In mathematical queueing theory, Little's law is a theorem by John Little which states that the long-term average number L of customers in a stationary system is equal to the long-term average effective arrival rate λ multiplied by the average time W that a customer spends in the system.
L = λ * W
For example,
Little's Law tells us that the average number of customers in the store, L, is the effective arrival rate, λ, times the average time a customer spends in the store, W.
Assume customers arrive at a rate of 10 per hour and stay for an average of 0.5 hours. Little's Law then says the average number of customers in the store at any time is 5.
L = 10 * 0.5 = 5
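The store example maps directly to the formula. Below is a tiny, hypothetical Java snippet (class and method names are illustrative, not from the original notes) that evaluates L = λ * W for the numbers above:

```java
public class LittlesLawDemo {

    // L = λ * W : average number of customers in the system
    static double averageInSystem(final double arrivalRatePerHour, final double averageTimeInHours) {
        return arrivalRatePerHour * averageTimeInHours;
    }

    public static void main(final String[] args) {
        final double lambda = 10D;  // customers arriving per hour
        final double w = 0.5D;      // average time spent in the store, in hours
        System.out.printf("L = %.1f * %.1f = %.1f customers in the store on average%n",
                lambda, w, averageInSystem(lambda, w));
        // prints: L = 10.0 * 0.5 = 5.0 customers in the store on average
    }
}
```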
The scalability of server applications is governed by Little's Law, which relates latency, concurrency, and throughput:
For a given request-processing duration (i.e., latency), the number of requests an application handles at the same time (i.e., concurrency) must grow in proportion to the rate of arrival (i.e., throughput).
For example, suppose an application has an average latency of 50 ms per request. If a single thread handles requests one after another:
1 request takes 50 ms
2 requests take 50 * 2 = 100 ms
20 requests take 50 * 20 = 1000 ms, i.e. 1 second
So a single thread caps throughput at 20 requests per second. To raise throughput from 20 to 200 requests per second, the application must process 10 requests concurrently (200 requests/second * 0.05 seconds = 10).
In order for that application to scale to a throughput of 2000 requests per second, it will need to process 100 requests concurrently.
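The same arithmetic can be written down directly. The sketch below (a hypothetical helper, not from the original notes) computes the required concurrency for a few target throughputs at a fixed 50 ms latency:

```java
public class ConcurrencyFromLittlesLaw {

    // concurrency (L) = throughput (λ) * latency (W)
    static long requiredConcurrency(final double requestsPerSecond, final double latencySeconds) {
        return Math.round(requestsPerSecond * latencySeconds);
    }

    public static void main(final String[] args) {
        final double latency = 0.050D; // 50 ms per request
        System.out.println(requiredConcurrency(20D, latency));    // 1   -> a single thread is enough
        System.out.println(requiredConcurrency(200D, latency));   // 10  concurrent requests
        System.out.println(requiredConcurrency(2_000D, latency)); // 100 concurrent requests
    }
}
```

Every tenfold increase in target throughput requires a tenfold increase in in-flight requests, and hence in threads under the thread-per-request model.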
If each request is handled in a thread for the request's duration then, for the application to keep up, the number of threads must grow as throughput grows.
Unfortunately, the number of available threads is limited because the JDK implements threads as wrappers around operating system (OS) threads.
OS threads are costly, so we cannot have too many of them, which makes the implementation ill-suited to the thread-per-request style.
If each request consumes a thread, and thus an OS thread, for its duration, then the number of threads often becomes the limiting factor long before other resources, such as CPU or network connections, are exhausted.
The JDK's current implementation of threads caps the application's throughput to a level well below what the hardware can support.
This happens even when threads are pooled, since pooling helps avoid the high cost of starting a new thread but does not increase the total number of threads.
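One rough way to see this ceiling is to keep starting platform threads that simply block, until the process can no longer create another OS thread. This is a hypothetical demo, not part of the original notes; it deliberately exhausts resources, so run it only in a throwaway environment (ideally with a small thread stack, e.g. -Xss256k):

```java
import java.util.concurrent.CountDownLatch;

public class PlatformThreadLimit {

    public static void main(final String[] args) {
        long created = 0L;
        final CountDownLatch blocker = new CountDownLatch(1);
        try {
            while (true) {
                // each platform thread wraps an OS thread and parks until the latch is released
                new Thread(() -> {
                    try {
                        blocker.await();
                    } catch (final InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }).start();
                created++;
            }
        } catch (final Throwable t) {
            // typically an OutOfMemoryError such as "unable to create native thread"
            System.out.printf("Created %d platform threads before failing: %s%n", created, t);
        } finally {
            blocker.countDown(); // release the parked threads so the JVM can exit
        }
    }
}
```

Depending on the operating system and its limits, the count typically tops out at a few thousand to a few tens of thousands, far below the concurrency Little's Law demands at high throughput.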