1. Latency
Latency, or lag, is the total time a data packet needs to get from one point on a network to another, and it is typically measured as the delay between a user's action and the application's response to that action. This measurement tells us in how many milliseconds (ms) our application or website loads or responds to users' requests. The structure of one such request-response HTTP transaction is shown in the diagram below.

An in-depth explanation of the protocol is beyond the scope of this tutorial, but roughly speaking, this is what happens during one HTTP transaction (a timed sketch follows the steps):
1) The client contacts the server on a designated port number and sends a document request.
2) The client sends header information to inform the server of its configuration and document preferences.
3) The server replies with a status line acknowledging that it received the request.
4) The server sends header information describing itself and the document the client requested.
5) The server sends the response data.
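To make the steps above concrete, here is a minimal sketch in Python that walks through one such transaction and times it; the hostname is a placeholder, and the measured time covers everything from opening the connection to reading the last byte of the body:

```python
import time
import http.client

# Placeholder host - replace it with the server you want to measure.
HOST = "example.com"

start = time.perf_counter()

# 1) Contact the server on a designated port (80 for plain HTTP)
conn = http.client.HTTPConnection(HOST, 80, timeout=10)

# 2) Send the request line and header information
conn.request("GET", "/", headers={"Accept": "text/html"})

# 3)-5) Receive the status line, response headers, and response body
response = conn.getresponse()
body = response.read()

elapsed_ms = (time.perf_counter() - start) * 1000
print(f"Status: {response.status} {response.reason}")
print(f"Transaction took {elapsed_ms:.1f} ms")

conn.close()
```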
Having zero latency is practically impossible due to the way networks work and communicate. However, high latency leads to a poor customer experience and, ultimately, to negative revenue implications and reviews. That's why it is of the utmost importance to minimize this lag as much as possible, and in order to do this, we need to focus on the main factors that may cause it:
- Distance – latency grows with the distance between the start and end nodes; the propagation delay alone is proportional to the physical distance a signal must travel (see the measurement sketch after this list).
- Transmission medium – the medium, i.e. the actual physical path between the communicating nodes, can also influence latency. For example, optical fiber generally offers lower latency than older copper-based networks.
- Storage – latency may increase when we access previously stored data, since time is needed to process and return the requested information.
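As a rough illustration of the distance factor, the sketch below times how long a plain TCP connection takes to two hosts; both hostnames are placeholders, so substitute a geographically close and a geographically distant server of your own:

```python
import socket
import time

# Placeholder hostnames - pick one nearby and one far-away server.
HOSTS = ["nearby-server.example", "faraway-server.example"]

def tcp_connect_ms(host: str, port: int = 80) -> float:
    """Return the time in milliseconds needed to open a TCP connection."""
    start = time.perf_counter()
    with socket.create_connection((host, port), timeout=10):
        pass  # connection established; close it immediately
    return (time.perf_counter() - start) * 1000

for host in HOSTS:
    try:
        print(f"{host}: {tcp_connect_ms(host):.1f} ms")
    except OSError as err:
        print(f"{host}: unreachable ({err})")
```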
When trying to decrease latency, we should always optimize the aforementioned components, or at least those we can control. Moreover, we should optimize images, compress files, and avoid render-blocking resources where possible. Another way to improve application performance is to use a Content Delivery Network (CDN). The main idea here is to store content closer to end users by distributing CDN servers across multiple locations, thereby reducing the content's travel time and, with it, latency.
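To illustrate the file-compression point, here is a minimal sketch that gzip-compresses a synthetic HTML payload to show how much less data ends up on the wire; in practice, web servers negotiate this automatically via the Content-Encoding header rather than compressing by hand:

```python
import gzip

# Synthetic, highly repetitive HTML payload used purely for illustration.
payload = ("<html><body>" + "Hello, latency! " * 500 + "</body></html>").encode()

compressed = gzip.compress(payload)

print(f"Original size:   {len(payload)} bytes")
print(f"Compressed size: {len(compressed)} bytes")
print(f"Savings:         {100 * (1 - len(compressed) / len(payload)):.0f}%")
```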