Request timing analysis

Category: DevOps


18.11.2020

In one of our previous blog posts, we discussed how different Python REST frameworks perform in a basic scenario. We ran tests by sending requests to endpoints with no underlying logic. Our goal was to find out which frameworks can achieve the highest requests per minute (RPM). One of the important notes there was that the absolute RPM value depends on many factors, such as the hardware or the logic that the endpoint implements.

Today we focus on single request timing analysis rather than multiple concurrent requests. For demonstration purposes, we implemented two dummy microservices using the aiohttp framework and deployed them on localhost alongside a PostgreSQL database. In this post, we will cover a single request timing breakdown and use Elastic solutions to visualize the results.

Scenario

There are two microservices called service1 and service2.

service1 exposes a single POST endpoint while service2 exposes a single GET endpoint.

Making a POST request with a JSON payload to service1 triggers:
– validation of the payload
– a request to service2

Making a GET request to service2 triggers:
– a database query

The flow is presented in the figure below.

[Figure: request flow – a POST to service1 triggers a GET to service2, which queries the PostgreSQL database]
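
A minimal sketch of the two handlers is shown below; the route names, ports, payload fields, and the query are our assumptions, not the original implementation:

import asyncpg
import requests
from aiohttp import web

async def service1_post(request):
    # Validate the incoming JSON payload (toy validation).
    payload = await request.json()
    if "name" not in payload:
        raise web.HTTPBadRequest(text="missing 'name' field")
    # The post mentions the requests package; note that this blocking call
    # inside an async handler ties up the event loop while it waits.
    downstream = requests.get("http://localhost:8081/items").json()
    return web.json_response({"service2": downstream})

async def service2_get(request):
    # One connection per request keeps the example short; a pool is more realistic.
    conn = await asyncpg.connect("postgresql://user:pass@localhost/demo")
    try:
        rows = await conn.fetch("SELECT id, name FROM items")  # placeholder query
    finally:
        await conn.close()
    return web.json_response([dict(r) for r in rows])

Each handler lives in its own aiohttp Application: service1_post is registered as a POST route on service1, and service2_get as a GET route on service2 (localhost:8081).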

Basic timing analysis

The first thing one might do to analyze timing is to open the developer tools in any browser and check the network monitor. Let’s try this and see what output we can get.

[Figure: Firefox network monitor timing breakdown for the request]

The request goes to localhost and without SSL, so all the network- and security-related parameters (e.g. DNS resolution, TLS setup) are zeroed. There is one main component, namely Waiting, which in this particular case lasts around 8 ms. This is very coarse and definitely doesn’t give us insight into what is going on behind the request.

If you are interested in finer numbers, you can use a solution from Joseph Scott and measure curl timing with greater precision.

Here is the breakdown of the same request using curl:

time_namelookup: 0.000030s
time_connect: 0.000123s
time_appconnect: 0.000000s
time_pretransfer: 0.000155s
time_redirect: 0.000000s
time_starttransfer: 0.008496s
----------
time_total: 0.008668s



As you can see, instead of round zeros, some processes like name lookup (DNS) or connect did take some time, although in this case it is negligible. This might come in handy when trying to perform cutting-edge time optimization, but we still don’t know the details of the request itself and what it consists of.
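
The same counters can also be collected from a script; below is a minimal sketch using pycurl, which is an extra dependency and not part of the original setup (the URL is a placeholder):

import pycurl
from io import BytesIO

buffer = BytesIO()
curl = pycurl.Curl()
curl.setopt(pycurl.URL, "http://localhost:8080/items")  # placeholder URL
curl.setopt(pycurl.WRITEDATA, buffer)
curl.perform()

# getinfo() exposes the same counters as curl's -w variables (values in seconds).
for label, option in [
    ("time_namelookup", pycurl.NAMELOOKUP_TIME),
    ("time_connect", pycurl.CONNECT_TIME),
    ("time_appconnect", pycurl.APPCONNECT_TIME),
    ("time_pretransfer", pycurl.PRETRANSFER_TIME),
    ("time_redirect", pycurl.REDIRECT_TIME),
    ("time_starttransfer", pycurl.STARTTRANSFER_TIME),
    ("time_total", pycurl.TOTAL_TIME),
]:
    print(f"{label}: {curl.getinfo(option):.6f}s")

curl.close()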

Timing analysis tool

This is where we reach for one of the tools that allow gathering events from distributed services. We decided to use APM from Elastic, but it is not the only tool capable of analyzing such data. You might also take a look at Zipkin if you are interested in an alternative.

Implementing APM in the aiohttp framework is as easy as importing a Python library and setting a few required parameters:

from aiohttp import web
from elasticapm.contrib.aiohttp import ElasticAPM

app = web.Application()

# The agent reads its configuration from the 'ELASTIC_APM' key of the application.
app['ELASTIC_APM'] = {
    'SERVICE_NAME': 'service1',
    'SERVER_URL': 'http://elastic-apm-ip:8200',
}
apm = ElasticAPM(app)


This short snippet requires two libraries, namely:

– aiohttp
– elastic-apm
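
For reference, here is the same setup embedded in a hypothetical, self-contained stub of service1; the handler, route, and port are our assumptions, not the original implementation:

from aiohttp import web
from elasticapm.contrib.aiohttp import ElasticAPM

async def handle_post(request):
    # Placeholder handler; the APM agent traces the transaction automatically.
    payload = await request.json()
    return web.json_response({"received": payload})

app = web.Application()
app['ELASTIC_APM'] = {
    'SERVICE_NAME': 'service1',
    'SERVER_URL': 'http://elastic-apm-ip:8200',  # address of your APM server
}
apm = ElasticAPM(app)

app.router.add_post("/items", handle_post)
web.run_app(app, port=8080)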

With such low coding effort, we get access to several dashboards and visualizations. Let’s start with the service map, which visualizes relations between microservices.

[Figure: Elastic APM service map – service1 connects to service2, which connects to the PostgreSQL database]

Our exemplary implementation isn’t complex, but this helps to clearly see that the service map is indeed correct: service1 “talks” to service2, which connects to the database. There is no timing information in this view, but we get an overview of the relations between our microservices.

Timing analysis

Let’s move on to the timing distribution of a particular request type. The following histogram shows how many requests fall into specific duration bins. This allows determining whether all requests have a similar duration or whether there are deviations.

[Figure: Kibana histogram of request duration distribution]

In our scenario, most of the requests fall into the 5-10 ms bin, but some last even up to 70 ms. This is because the tests were performed on localhost, where the load may vary depending on other tasks or OS processes. Let’s dive into a single request and see what it consists of.

[Figure: Elastic APM trace of a single POST request to service1]

This is a view of a single POST request. This particular request lasts 7.5 ms, and during that time a GET request is made to service2 (localhost:8081). Handling this GET with the Python requests package in service1 takes 6.2 ms. The actual processing time in service2 is 4.3 ms, so the missing 1.9 ms is spent by the requests package preparing and receiving the message. service2 performs two tasks, which are clearly visible in the trace: a database connect – 2.8 ms – and a query – almost 1 ms.
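
The database connect and query spans in this trace most likely come from the agent’s automatic instrumentation of the database driver. Where a call is not picked up automatically, a span can also be added by hand; here is a minimal sketch assuming asyncpg and the agent’s async_capture_span helper (span names, DSN, and query are purely illustrative):

import asyncpg
import elasticapm
from aiohttp import web

async def service2_get(request):
    # Each instrumented block below shows up as a separate span in the APM trace.
    async with elasticapm.async_capture_span("database connect"):
        conn = await asyncpg.connect("postgresql://user:pass@localhost/demo")
    try:
        async with elasticapm.async_capture_span("select items"):
            rows = await conn.fetch("SELECT id, name FROM items")
    finally:
        await conn.close()
    return web.json_response([dict(r) for r in rows])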

Using such an approach allows us to quickly grasp what is happening during different requests in a complex world of microservices and to identify weak spots. Visualizing your requests will reveal bottlenecks or suboptimal approaches in microservice communication.

If you are interested in this topic and would like to read more, let us know. We can share such a request timing analysis from our actual implementations of microservices-based applications.

Other materials

If you are not sure which framework performs best, take a look at the Python REST frameworks analysis.

If you are interested in more details about the aiohttp framework, take a look at Routing order in aiohttp library in Python.

Author

Mateusz Buczkowski

Mateusz Buczkowski received his M.Sc. degree from Poznan University of Technology in 2012. Since then he has been employed as a teaching assistant at the Chair of Telecommunication Systems and Optoelectronics in the Faculty of Electronics and Telecommunications. He is pursuing his PhD in the field of image processing. His research interests cover a wide spectrum of image and video processing; in particular, he is interested in image quality assessment, which is his PhD topic. As an R&D engineer he took part in two FP7 EU projects, namely 5GNOW and SOLDER, where he worked on solutions that can be used in 5th generation wireless networks, which are to come in 2020. In Grandmetric he is involved in wireless systems research.
