Model-based Reinforcement Learning for Service Mesh Fault Resiliency at the Web Application Level

Microservice-based architectures enable different aspects of web applications to be created and updated independently, even after deployment. Associated technologies such as service mesh provide application-level fault resilience through attribute configurations that govern the behavior of request-response services -- and the interactions among them -- in the presence of failures. While this provides tremendous flexibility, the configured values of these attributes -- and the relationships among them -- can significantly affect the performance and fault resilience of the overall application. Furthermore, it is impossible to determine the best and worst combinations of attribute values with respect to fault resiliency via testing, due to the complexities of the underlying distributed system and the many possible attribute value combinations. In this paper, we present a model-based reinforcement learning workflow towards service mesh fault resiliency. Our approach enables the prediction of the most significant fault resilience behaviors at the web application level, scaling from single-service management to aggregated multi-service management with efficient agent collaboration.


INTRODUCTION
A key trend in web application development in recent years is the advent of microservices-based architectures, in which applications are composed of small microservices that communicate with one another via distributed system mechanisms. Using open-source microservices technologies such as Kubernetes [1,34,48], developers can create and update different aspects of an application independently, even after deployment. At the same time, to ensure that faults in individual microservices -- or delays in communication among them -- do not cascade into application-level failures, microservice-based architectures increasingly include service mesh technologies such as Istio [2,13] and Linkerd [3,32]. These service meshes [12,26,36] contain associated "sidecars" [4] that monitor individual microservices for failures and delays, and perform associated actions to ensure application-level fault resilience; for example, bypassing problematic microservices upon consecutive errors, or ejecting them for a period of time [29,37,49]. The number of consecutive errors or the length of the ejection time is configured through attributes in the service mesh. The service mesh architecture is depicted in Figure 1.
As Figure 2 shows [5], multiple services are aggregated and independently responsible for request-connection handling, which is challenging to optimize when faults are injected [19]. The Istio httpbin service, supported by the service mesh, is used to exercise control over web application behaviors, providing tremendous flexibility for communications. Nevertheless, the degree to which a web application is fault resilient is heavily dependent on the configured attribute values [18,31,47] and the relationships among them. Furthermore, it is impossible to determine the best and worst combinations of attributes via testing, due to the complexities of the underlying distributed system and the many possible attribute value combinations. Emerging machine learning methods have been widely applied to many networking problems; however, no previous machine learning method has focused on exploring service mesh resiliency. In this paper, we present a model-based reinforcement learning approach towards service mesh (httpbin) fault resiliency that we call SFR2L (Service Fault Resiliency with Reinforcement Learning). Our novel contributions are as follows: 1) To the best of our knowledge, this is the first investigation of service mesh resiliency using machine learning methods, and in particular the first to optimize aggregated services using model-based reinforcement learning, which enables the prediction of the most significant application-level fault resiliency behaviors. 2) We develop a complete workflow to implement flexible control over service mesh resiliency, including data collection, service modelling, policy learning for resiliency optimization, and validation in simulation. 3) In terms of policy learning, three cases are investigated: a single agent for a single service, multiple agents for a single service, and collaborative agents for multiple services, which cover the most common optimization scenarios in web applications. 4) We disclose five structured fault injection (circuit breaking pattern) datasets, covering loading-call settings from 50 to 2000. Each dataset is well-labelled and available here. We hope this work can contribute to the fault resiliency community and help more scholars perform impactful research in this field.

RELATED WORK

Service Modelling
[50] presents a model-based reinforcement learning approach for resource allocation in microservice-based scientific workflow systems. While that work does not address service meshes or fault resilience, our overall approach is inspired by it. From a service resiliency perspective, two kinds of approaches have been proposed: systematic testing and formal modeling. [24] presents an infrastructure and approach for systematic resiliency testing. However, that work does not cover the selection of the tests to be run. Our work focuses on automating such a selection through machine learning.
In terms of formal modeling, [41] presents the use of continuous-time Markov Chains (CTMCs) and formal verification to analyze tradeoffs in service resiliency mechanisms in simple client-service interactions. Earlier work [27] also uses formal verification based on CTMCs, and analyzes multiple concurrent target services as well as steady-state availability measures including degraded service functionality.
Microservices and their self-adaptation are an active area of research, and a comprehensive survey and taxonomy of recent work is given in [20]. However, as described in [20], there is very limited work on application-level resiliency, as well as very little work on using machine learning in the context of service self-adaptation.

The Istio Service Mesh
As described in Section 1, Istio is an open-source service mesh technology for distributed and microservice architectures. Istio provides a transparent layer for building and operating applications. Istio's traffic management features enable service monitoring and application-level fault resilience.
In particular, Istio provides outlier detection and circuit breakers to realize fault resilience. Outlier detection enables the capacity of services to be limited when they are behaving anomalously, or even allows them to be ejected for a period of time. Circuit breaking [6] is a capability that prevents service failures from cascading. In particular, if a service A calls another service B that does not respond within an acceptable time period, the call can be retried or even bypassed via the circuit breaker specification. As depicted in Figure 1, the sidecar proxy for service B monitors its response, and the sidecar for service A performs the specified circuit breaking.
Istio also provides fault injection [7] and load testing [8] capabilities in order to test application fault recovery. Such testing is considered critical to perform prior to application deployment to gain confidence in the fault resilience of deployed applications.
To realize these fault resilience mechanisms, Istio enables traffic rules to be configured for application deployment; these configurations govern the specific behaviors of outlier detection and circuit breaking. Some of the attributes of these traffic rules are depicted in Table 1; they govern the number of requests and connections allowed to a service that may be behaving anomalously, the amount of time it may be ejected, at what rate, and at what detection interval, and the number of consecutive errors after which a circuit breaker will be tripped. The total threads is the number of Istio worker threads, while the total calls is the number of requests to the application used in Istio load testing. Details of these attributes, as well as other Istio traffic rule attributes, are at [9]; configurations for fault injection and load testing are at [7,8].
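As a concrete illustration of the attributes in Table 1, the sketch below shows how such traffic rules are typically expressed in an Istio DestinationRule, written here as a Python dict mirroring the YAML schema. The field names follow Istio's documented traffic-policy API; the particular values are illustrative, not the settings used in our experiments.

```python
# Illustrative Istio DestinationRule traffic policy, as a Python dict
# mirroring the YAML schema. Field names follow Istio's documentation;
# the values are examples only.
destination_rule = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "DestinationRule",
    "metadata": {"name": "httpbin"},
    "spec": {
        "host": "httpbin",
        "trafficPolicy": {
            "connectionPool": {
                # Cap on concurrent TCP connections to the service.
                "tcp": {"maxConnections": 1},
                # Caps on pending HTTP/1.1 requests and requests per connection.
                "http": {
                    "http1MaxPendingRequests": 1,
                    "maxRequestsPerConnection": 1,
                },
            },
            "outlierDetection": {
                # Trip the circuit breaker after this many consecutive errors.
                "consecutive5xxErrors": 1,
                # How often hosts are scanned for ejection.
                "interval": "1s",
                # Minimum time an ejected host stays out of the pool.
                "baseEjectionTime": "3m",
                # Upper bound on the fraction of hosts that may be ejected.
                "maxEjectionPercent": 100,
            },
        },
    },
}
print(destination_rule["kind"])
```

Serializing such a dict to YAML yields a manifest that can be applied with `kubectl apply`; the attribute ranges explored in our experiments correspond to varying these fields.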
While the configuration of these traffic rule attributes enables fine-grained control of fault resilience policies, fault injection, and load testing, the degree to which an application is fault resilient is heavily dependent on the attribute configurations and the relationships among them. Thus, a key challenge is to determine the most significant combinations of attribute values, where the "best" values from a fault resilience perspective can be used for application deployment and the "worst" values can be used to drive fault injection and load testing.
However, the determination of these most significant value combinations is highly complex due to the inter-dependencies among attributes, the failure behavior of the underlying services and the communication among them, and the complexities of the underlying distributed system. Thus, it is impossible to determine the most significant attribute values via manual or automated testing due to the sheer number of possible behaviors.

OVERVIEW OF REINFORCEMENT LEARNING
In our context, rewards correspond to the degree of fault resilience of the service mesh-based application over time. "Worst-case" rewards (penalties) can be used to determine configuration settings that are critical to test for resiliency in load testing prior to application deployment. For model-free methods to be used in our context, the agents must take real-time actions directly on the service mesh and learn from the observed behavior. This necessitates the implementation of algorithmic APIs in the Istio environment, which is likely to be very expensive and inefficient. On the other hand, model-based methods must be approached with care, as deterministic models are unlikely to yield good approximations of the transition states among different microservices.
[10] first proposed utilizing a deep neural network to generate Q-factors instead of storing a large number of reward-action pairs in a hash table. Following their work, [38] presents an algorithm based on the deterministic policy gradient that operates over continuous action spaces. For model-based reinforcement learning, [23,30,39,43] demonstrate the theoretical basis of the policy gradient for model-based interaction [15,16,46]. With regard to multi-agent reinforcement learning (MARL), [44] introduces an efficient MARL algorithm for parallel policy optimization. [14] proposes deploying multiple agents to optimize traffic control and networks, which is an important application in actual networking practice. Communication/collaboration is a common configuration in multi-agent systems [17,21,28,33,53] and is advantageous for making more stable, efficient, and better decisions [25,35,40,45,51,52] using decentralized Q-networks. Nevertheless, decentralized multi-agents perform weakly when only small datasets are available or there are few state vector features for policy learning. Regarding this, [22,27] prove the feasibility of cooperative multi-agents and illustrate that model parameters can be shared by decentralized multiple agents while each agent preserves its own private network to make decisions, which is the groundwork for our multi-service management. Our simulation model is inspired by [50], which introduces good approximation capabilities for assisting policy learning in service mesh-based architectures, obtaining greater training and network optimization efficiency than model-free approaches.

WORKFLOW IMPLEMENTATION
Our implementation for exploring service mesh-based application fault resiliency is organized as follows (Figure 4 depicts the whole pipeline): 1) Data collection from an Istio application; 2) Simulation model training and selection; 3) Model-based reinforcement learning; 4) Validation on policy learning results.

Data collection
We collect substantial traces covering the target parametric spaces of the Istio httpbin service for each dataset, varying the configuration settings of the traffic rules and the fault injection and load testing settings from Table 1. All collected data points are based on the actual Istio application and are well-labelled after data cleaning.

Simulation Model
In general, we apply multi-layer perceptrons (MLPs) to simulate the aggregated behaviors, enabling agents to interact with the environment and learn the best loading space given the traffic rule attributes. Figure 3 depicts the input-output relations for modelling application-level fault resiliency. The 9-dimensional input vector contains the 7 deterministic traffic rules and 2 loading settings given in Table 1, and the output corresponds to the 2 learned response attributes. We train the simulation MLP model weights W_SM, where W_SM(X) = Φ_n(...(Φ_1 X + b_1)...) + b_n is the well-trained n-layer MLP (Φ_i is the weight matrix of the i-th layer and b_i is the corresponding bias). When the inference error reaches its minimum, we save the model weights for the subsequent agent-environment interactions.
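For concreteness, the forward pass of such a simulation model can be sketched in plain NumPy, as a stand-in for the trained network. The ReLU activation and the random weights are assumptions for illustration; the layer sizes match the architecture used in our experiments (9 inputs, three 512-neuron hidden layers, 2 outputs).

```python
import numpy as np

def mlp_forward(x, layers):
    """Forward pass Y = Phi_n(...(Phi_1 X + b_1)...) + b_n with ReLU hidden units."""
    for i, (W, b) in enumerate(layers):
        x = W @ x + b
        if i < len(layers) - 1:          # ReLU on hidden layers only
            x = np.maximum(x, 0.0)
    return x

rng = np.random.default_rng(0)
dims = [9, 512, 512, 512, 2]             # 9-dim input -> 2-dim response
layers = [(rng.standard_normal((dims[i + 1], dims[i])) * 0.01,
           np.zeros(dims[i + 1])) for i in range(len(dims) - 1)]

x = rng.standard_normal(9)               # 7 traffic rules + 2 loading settings
y = mlp_forward(x, layers)               # [predicted QPS, predicted failure rate]
print(y.shape)                           # (2,)
```

In the actual workflow the weights come from supervised training on the collected traces rather than random initialization.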
Figure 3: Simulation Model of web application-level fault resiliency.X is a 9-dimensional vector with 7 deterministic traffic rules and 2 loading settings to be decided by reinforcement learning agent(s), Y is the web application (APP) response vector with 2 features: QPS and failure rate.

Model-based Reinforcement Learning
We then use our simulation model as the basis for model-based reinforcement learning (RL). In our context, the environment is our simulation model (the well-trained MLP), states correspond to traffic rule settings, and actions determine the loading settings (the number of threads and loading calls). We explore both single and multiple agents, the latter working either independently or dependently. As depicted in Figure 4, our agents learn from the responses of our simulation model of the Istio httpbin service to the performed actions.
In real-world web applications, communications between servers and clients normally aggregate multiple services (microservices), and holistic resiliency across all services is essential. Take a simple shopping web application as an example: it may contain check-out, product, sign-up/in/out, shopping cart, and review services. These services experience very similar visiting volumes because visitors go through all of these steps to finish their shopping. All services play individual roles but also have interdependencies that ensure the function of the whole application. We therefore extend our multi-agent collaboration to the management of multi-service resiliency. Section 5 describes our model-based reinforcement learning algorithm in much more detail.

Validation
Finally, we train the reinforcement learning algorithms to perform policy learning and obtain optimized loading decisions given different traffic rule and loading setting combinations. All decisions with the highest reward are validated in actual Istio to evaluate the policy learning.

REINFORCEMENT LEARNING PARADIGM
We first present some preliminaries on RL and how states, actions, and rewards are defined in the context of the simulation model. Based on this basic setting, we illustrate how to apply a single agent to a single service. We then discuss the deployment of multi-agent reinforcement learning to address the complex parametric space optimization according to the agents' collaborative relationships. Finally, multi-service holistic resiliency optimization is illustrated using a combination of decentralized and centralized learning paradigms.
We define t as the index of the timestep at which the RL agent interacts with the simulation model (SM), s(t) as the state vector for the RL agent(s) at t, r(t) as the reward for the t-th action a(t), i(t) as the input vector for the SM, and o(t) as the application responses emulated by the SM. W_SM denotes the well-trained simulation model weights. a(t) and s(t) are concatenated into the SM input vector i(t) to trigger o(t):

i(t) = {s(t); a(t)}, (1)

where ; denotes vertical concatenation of vectors. For o(t),

o(t) = W_SM(i(t)). (2)

We would like to explore the borderline of normal operation of services with regard to fault injection, which means a higher failure rate is sought during policy learning (see the reward definition given with Table 2).

The long-term objective is

J(θ) = E[Σ_{t=0}^{T-1} γ^t r(t) | π(θ)], (3)

where γ is the discount coefficient. In the implementation, the agent searches through all actions for a given state and selects the state-action pair with the highest corresponding Q-factor via the Bellman equation [42]. The update of the Q-network depends on the squared error between the Q-factor estimated by the agent and the actual reward received from the SM. The policy gradient for the long-term objective is

∇J(θ) = E_τ[R(τ) ∇_θ log π_θ(τ)], (4)

where τ is the sequential action-state pair trace in time order {s(0), a(0), s(1), a(1), ..., s(T-1), a(T-1)}, R(τ) is the reward function across the trace, and ∇J(θ) is the gradient used for the network update. The single agent for a single service is summarized in Algorithm 1.
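The greedy action selection and Bellman-style update can be sketched in a deliberately tiny form. This is a tabular toy with a single fixed traffic-rule state and a hypothetical stand-in reward (the actual agent uses a Q-network and the trained simulation model); it is intended only to show epsilon-greedy selection and the update toward r + γ max Q.

```python
import random

random.seed(0)

# Toy stand-in for the simulation model: discrete "call" actions with a
# hypothetical reward peaking at index 3. In the actual workflow the reward
# would come from the trained MLP's QPS / failure-rate response.
ACTIONS = [50, 100, 500, 1000, 2000]
REWARDS = [0.2, 0.5, 0.9, 1.4, 0.7]

GAMMA, ALPHA, EPS = 0.9, 0.1, 0.2   # discount, learning rate, exploration rate
Q = [0.0] * len(ACTIONS)            # single fixed state -> one row of Q-factors

for episode in range(2000):
    # epsilon-greedy action selection over the discrete action space
    if random.random() < EPS:
        a = random.randrange(len(ACTIONS))
    else:
        a = max(range(len(ACTIONS)), key=lambda i: Q[i])
    r = REWARDS[a]
    # Bellman update toward r + gamma * max_a' Q(a')  (single-state case)
    Q[a] += ALPHA * (r + GAMMA * max(Q) - Q[a])

best = max(range(len(ACTIONS)), key=lambda i: Q[i])
print(ACTIONS[best])   # the learned best call setting: 1000
```

The deep variant replaces the table Q with a network Q_θ and the update with a squared-error gradient step, exactly as described above.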

Multi-agents for Single Service
Section 5.1 illustrates the basic case of policy learning. However, there are two loading settings to be decided, which tune the resiliency together. Following the previous setting, we extend the scenario to multi-agent interaction and define two kinds of collaborative relationships between two agents, which are responsible for the two loading decisions, respectively. Denote a1(t) as the action taken by agent A1(t) and a2(t) as the action taken by agent A2(t). s1(t) and s2(t) are the respective state vectors; i(t), o(t), and r(t) keep the same definitions as before. The two agents share the same reward for ∇J(θ1) and ∇J(θ2), where θ1 and θ2 are the RL Q-network parameters.
Independent Collaboration.
In independent collaboration, both agents take their actions from the same traffic-rule state. After both actions are made, i(t) is generated as

i(t) = {s(t); a1(t); a2(t)}. (5)

After obtaining i(t), the subsequent interaction with the SM aligns with Section 5.1.

Dependent Collaboration.
The definition of "Dependent" (Det.) also derives from the state vectors of the two agents. The two agents are executed in order, and the state of the latter agent combines the former agent's action with the former agent's state in order to take the second action. Assume A1(t) is the first executed agent and A2(t) is the second; the state vector s2(t) relies on a1(t) and s1(t):

s2(t) = {s1(t); a1(t)}. (6)

Similarly, the application response is

o(t) = W_SM({s2(t); a2(t)}). (7)

We summarize the multi-agent algorithm in Algorithm 2, where T(t) denotes the traffic rule vector at timestamp t. All types of agent interdependencies are listed in Table 3.
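A minimal sketch of how the two collaboration modes build the SM input, with illustrative values for the traffic rules and actions (the real values come from Table 1 and the agents' action spaces):

```python
def concat(*vecs):
    """Vertical concatenation {u; v; ...} realized as list joining."""
    out = []
    for v in vecs:
        out.extend(v)
    return out

traffic_rules = [1, 1, 1, 1, 1, 3, 100]   # 7 traffic-rule settings (illustrative)

# Independent collaboration: both agents act on the same 7-dim state.
s1 = traffic_rules
a1, a2 = [10], [500]                      # thread action, call action (illustrative)
i_ind = concat(traffic_rules, a1, a2)     # 9-dim SM input

# Dependent collaboration: agent 2's state appends agent 1's action (eq. 6).
s2_dep = concat(s1, a1)                   # 8-dim state for the second agent
i_dep = concat(s2_dep, a2)                # same 9-dim SM input

print(len(s2_dep), len(i_ind), len(i_dep))   # 8 9 9
```

Note that both modes produce the same 9-dimensional SM input; they differ only in what the second agent observes before acting.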

Communicative Multi-agents for Multi-services
Since real-world web applications rely on the collaboration of all aggregated services to fulfill their functions, communication is necessary for each agent to know how the other services in the same stream operate. In this part, we extend our exploration so that multiple services are optimized by multiple agents. However, a purely decentralized Q-network mandates iterative decision execution for information sharing, which leads to serious communication latency under high request volumes. Consequently, we enable the individual RL agents to communicate efficiently in the beginning phase so that all policy learning can obtain better summed rewards with low time complexity. [10,22] propose the theoretical foundation of homogeneous agent cooperation through sharing model parameters, allowing the model to be trained with the experiences of all agents, which makes the algorithm data efficient. Inspired by their work, we make part of the learning centralized while the remainder is decentralized. The agents take different actions by searching through their own action spaces, receive rewards, and update two split networks. We define the sharable network, SNet, as the input layer with weights θ_s, and the decentralized (or private) networks, PNet, as the hidden and output layers with weights θ_p1, ..., θ_pN, where n is the index of the service. As Figure 5 shows, all states from all agents first pass through the same SNet and then through the individual PNets. For the purpose of optimizing the holistic "worst-case" resiliency, each agent's reward combines its own reward with the summed rewards of the other services, weighted by a coefficient β. After rewards are generated for each agent, the respective PNet is updated by the corresponding reward and Q-factor pair, while SNet is updated by all pairs from all service agents. Consequently, the policy of each agent depends on both θ_s and θ_pn: π_n(θ) = π(θ_s; θ_pn).

The long-term policy gradient for PNet is ∇J(θ_pn), computed as in (4) but with respect to the private parameters θ_pn, where ms_n(t) is the output vector of SNet and the input of PNet. The prediction function f_θs of SNet is represented by ms_n(t) = f_θs(s_n(t)). Updating θ_s amounts to finding the MSE minimizer between the predicted ŝ(t+1) and the actual s(t+1), where the training data s_n(t), ms_n(t), s_n(t+1) ∈ D come from all agents. As shown in Figure 5, the agents can first communicate efficiently based on shared experience and then optimize their own decisions, combining their individual cases with the holistic circumstances.
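The shared/private split can be sketched as follows in NumPy, with random untrained weights. The action-space size and the single shared layer are assumptions for the sketch; the structure follows the SNet-as-input-layer, PNet-as-hidden-and-output-layers design described above.

```python
import numpy as np

rng = np.random.default_rng(1)
N_SERVICES, STATE_DIM, SHARED_DIM, N_ACTIONS = 5, 7, 512, 10

# SNet: one shared input layer, updated with experience from ALL agents.
W_shared = rng.standard_normal((SHARED_DIM, STATE_DIM)) * 0.01
# PNet: one private head per service agent, updated only by its own rewards.
W_private = [rng.standard_normal((N_ACTIONS, SHARED_DIM)) * 0.01
             for _ in range(N_SERVICES)]

def q_values(n, s):
    """Q-factors for service n: state -> shared embedding ms_n -> private head."""
    ms_n = np.maximum(W_shared @ s, 0.0)   # SNet output, the input to PNet
    return W_private[n] @ ms_n

states = [rng.standard_normal(STATE_DIM) for _ in range(N_SERVICES)]
actions = [int(np.argmax(q_values(n, s))) for n, s in enumerate(states)]
print(len(actions))   # one greedy action per service agent: 5
```

During training, a gradient step on W_private[n] would use only service n's reward, while a step on W_shared would aggregate the experience of all five agents, which is what makes the scheme data efficient.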
If both kinds of actions are decided by agents, only agents of the same type communicate across all services (call agents only communicate with other call agents, thread agents only communicate with other thread agents, etc.; the collaborative relationship between the agents within one service accords with Section 5.2). The learning paradigm for multiple services is summarized in Algorithm 3 and visualized in Figure 6. The action space searching and PNet refreshing can be executed in parallel; only the SNet updates require iterative steps. [11] summarizes the most common ways of modelling networking communications, among which Logistic Regression, Linear Ridge Regression, and Support Vector Regression are highlighted due to their strong simulation abilities. We select these three methods as baseline simulation models to compete against the 5-layer MLP and determine which emulates the application response (in our case study, the Istio httpbin service) best.

EXPERIMENTS

Simulation Model Evaluation
We collect 5 groups of structured datasets and split each into training and testing sets at an 8:2 ratio. For the MLP, the input layer has 9 neurons, the 3 hidden layers have 512 neurons each, and the output layer has 2 neurons. The learning rate varies from 1 × 10^-6 to 1 × 10^-5. The ranges of the traffic rule, thread, and call settings used in our experiments are given in Table 4.
Four types of models are trained and MLP has the best performance according to Table 4, hence we use it in our approach.

Policy Evaluation
We explore the "worst-case" rewards, as defined in Section 5. "Worst-case" rewards (penalties) provide insight into configuration settings that are critical to use in load testing prior to application deployment. The learning ability is characterized by how much the RL algorithm outperforms the baseline under the same context; its metric is the maximum rolling-mean ratio of the cumulative reward obtained by RL to that of the baseline over the last 25 epochs.
In the simulation session, we let the simulation model interact with the RL agent(s) and record activities at the epoch with the highest reward ratio. The data points at the best epoch then serve as the loading and traffic parameters that trigger the actual service response and yield the validated reward ratio. For all experiments, we analyze single-agent and multiple/collaborative-agent model-based reinforcement learning using our algorithm and implementation from Section 4. Every experiment is performed 3 times, and we present the mean values of the 3 results. We compare the configuration setting values selected by our algorithm to the baselines in Table 5. In the RL experiments, we use our deep model-based reinforcement learning algorithm to identify the configuration settings of threads and calls over time, over variations of the other traffic rule settings as per Table 1. The collaborative agent experiments include either call or thread selection, and both thread and call selection, with 5 services aggregated in the application. For the baseline experiments, since there are no directly relevant machine learning methods addressing fault resiliency, we pick actions randomly and uniformly and compare the rewards gained by the non-monitoring (random) and monitoring (RL) groups.
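The evaluation metric, the maximum rolling-mean ratio of RL to baseline cumulative reward over a 25-epoch window, can be sketched as below. The reward series here are synthetic placeholders; the real series come from the experiments.

```python
import numpy as np

def max_rolling_ratio(rl_rewards, baseline_rewards, window=25):
    """Max over epochs of the rolling-mean ratio RL/baseline for a trailing window."""
    rl = np.asarray(rl_rewards, dtype=float)
    bl = np.asarray(baseline_rewards, dtype=float)
    ratios = []
    for t in range(window, len(rl) + 1):
        ratios.append(rl[t - window:t].mean() / bl[t - window:t].mean())
    return max(ratios)

# Synthetic example: RL rewards trend upward, baseline stays flat.
epochs = 500
baseline = np.ones(epochs)
rl = np.linspace(0.5, 2.0, epochs)
print(max_rolling_ratio(rl, baseline))   # peak ratio, reached in the final window here
```

A ratio above 1 means the RL agent accumulated more reward than the random baseline over that window.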

Result Analysis
From Table 5, it is clear that the model-based RL algorithm outperforms the baselines in most cases, and the learning effects are validated against the actual Istio API. Although there are errors between the simulations and the validations, the ratio trends are mostly aligned. The thread/call factors have different degrees of influence on the policy learning: thread is more significant for S1, and call is more significant for S2-S5. Moreover, the learning abilities of multi-agents are close to those of a single agent when either factor is trivial to fault resiliency, such as thread in S2 and S5 and call in S1. The difference derives from the parameter range setting in simulation model training, and our approach is also helpful for clarifying the significant factor for a service with a specific traffic volume.
For the single-service cases, most multi-agent configurations work better than single-agent decisions, which shows that optimizing complex parameter interdependencies can fulfill the potential of policy learning. For instance, in S2 the Thread&Call agents gain 27% higher rewards than the Call-only agent (3.45 vs. 2.71) in simulation and 58% higher rewards (2.80 vs. 1.77) in validation. More importantly, multi-agents usually have higher validation accuracy than a single agent. Take S1 as an example: the Thread-only agent has a 2.21 reward ratio in simulation but 1.63 in validation (36% higher in simulation), whereas the Thread&Call agents have a close 2.26 reward ratio in simulation and a more accurate 2.15 validated ratio (only 5% higher). Similarly, in S2 the Call-only agent has 2.71 in simulation and 1.77 in validation (53% higher), while Thread&Call has 3.45 in simulation and 2.80 in validation (23% higher). Regarding the difference between single-service and multi-service cases, the reward ratios obtained with all 5 services are higher than in the corresponding single-service case, and the learning trends are more stable, which strongly supports the importance of collaborative decision making for aggregated service resiliency, especially since the services experience similar loading volumes in the real world. For S1 and S2, multi-services outperform single-service in the corresponding cases in both simulation and validation. For S3, S4, and S5, even if the aggregated services behave similarly to the corresponding single service in simulation, the validated reward ratios are much higher. For example, even though the reward ratios of the Thread-only and Call-only agents in the single- and multi-service cases are very close (1.13 vs. 1.18 and 1.75 vs. 1.77) in S3 simulation, the S3 validation results reveal that collaborative services obtain better results (1.06 vs. 0.98 and 1.43 vs. 1.31).
In terms of MLP approximation accuracy, Table 4 indicates that the MLP has varying capabilities for emulating the behavior of service resiliency, with S1 and S2 having the lowest MSE values (0.13, 0.17). However, the difference between simulation and validation results is not strongly related to the simulation MSE value: model-based RL still provides optimized loading-setting decisions for S3 and S4, although their MLP simulation MSE values (0.52, 0.63) are relatively high among all groups. For model-based learning, the simulation model is not required to describe the behaviors very precisely; in other words, we only need the estimated reactions of the services in a specific context to enable continuous agent-environment interaction and policy learning. We believe these experiments are also enlightening for applying model-based RL to many other fields where real-time online interaction is expensive or unrealistic.

CONCLUSION
Our model-based reinforcement learning algorithm helps predict which values of the traffic rule settings, threads, and calls yield rewards with respect to the fault resiliency of the Istio httpbin service. We comprehensively investigate how model-based RL can help optimize the parametric spaces of single services, what kinds of relationships between different variable agents are suitable given specific settings, and how to employ a service cluster using collaborative agents that communicate with each other efficiently.
In particular, the configuration settings that yield the "worst-case" rewards give insight into which combinations of configurations should be tested rigorously during load testing to ensure robust fault recovery, as these may significantly compromise application-level fault resiliency. Given the dynamic operating status of microservice networking, the validation of SFR2L is crucial to confirm the performance of our designs. Our research is promising for the maintenance of service mesh-based architectures in industry and constructs an effective workflow to determine the optimal state for a service or cluster of services.

Training Details
We conduct our experiments for 500 epochs, with each epoch containing 1000 simulation interactions. The network of a single agent in Section 5.1 and Section 5.2 contains an input layer, an output layer, and 2 hidden layers with 512 neurons each. The learning rate is 5 × 10^-5; the dimension of the single-agent state vectors is uniformly set to 8 (7 rules + 1 loading setting), the dependent multi-agent state vectors are set to 7 and 8 dimensions, and the independent multi-agent state vectors are set to 7 dimensions. Following Section 5.3, SNet is the shared 512-neuron input layer, while PNet comprises the remaining 512-neuron hidden and output layers. The learning rate is set to 1 × 10^-5. All experiments use the Adam optimizer from the PyTorch library for gradient descent.
The action spaces are constrained by the datasets used for training the simulation model. The available actions are listed in Table 6.

LIMITATION
Communication latency and efficiency of services within web applications are subject to many factors, such as the stability of the internet connection, the volume of concurrent visits, and internal properties of the network (bandwidth, etc.). Thus the validation accuracy for the simulation relies on the real-time network transmission circumstances. Meanwhile, the data collection process cannot precisely reflect real-time network operation, but the substantial traces that we collect are capable of emulating the general responses to fault injection.

Figure 2 :
Figure 2: Communication (request & response) between web document clients and servers; the figure shows only the request process.

Figure 4 :
Figure 4: SFR2L Pipeline. Data collection and validation are based on the actual Istio API. The RL agent(s) fully interact with the simulation model and validate their decisions against the actual Istio API.

Figure 5 :
Figure 5: The configuration of the Q-networks of collaborative service agents: all states from all services first pass through the same SNet, then go through the individual PNets and search through their own action spaces to receive rewards.

Figure 6 :
Figure 6: The interactions of multi-services using collaborative multi-agents: all agents of the same type communicate with each other in the beginning, then take individual actions and finally interact with the corresponding simulation models.

Figure 7 :
Figure 7: Cumulative reward (per epoch) ratio. The upper plots are all single-service cases; the bottom plots are all aggregated multi-service cases. Learning is fast at first and then remains volatile in all cases, but multi-services with communication show more stable trends than single services. The best epoch is reached before the 150th epoch.

Table 1 :
Representative Fault Resiliency Attributes

Table 2 :
Reward Factors at the Application Level

Reward Factor | Definition
QPS (Queries Per Second) | Request processing speed
200 Response Rate | Successfully connected request rate
503 Response Rate | Failed connected request rate

As Table 2 shows, a higher 503 rate p_503(t), but not the full failure rate, is preferred. At the same time, a higher QPS q(t) is beneficial for improving networking efficiency. So the 503 reward is defined as r_503(t) = q(t) · p_503(t).
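A minimal sketch of this reward, with made-up QPS and 503-rate values for illustration:

```python
def reward_503(qps, p503):
    """Reward favoring high throughput together with a high (borderline)
    503 rate: r_503(t) = q(t) * p_503(t)."""
    assert 0.0 <= p503 <= 1.0, "503 response rate must be a fraction"
    return qps * p503

# Two illustrative operating points: same QPS, different 503 rates.
print(reward_503(qps=120.0, p503=0.25))   # 30.0
print(reward_503(qps=120.0, p503=0.60))   # 72.0
```

Because the reward multiplies throughput by the 503 rate, configurations that push the service toward its failure borderline while still processing many requests score highest, which is exactly the "worst-case" regime we want to surface for load testing.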
Firstly, we demonstrate the simplest case, in which only one agent and one simulation model interact with each other. In this case, only one kind of action (the number of threads or calls) is decided by A(t). Given a preset traffic rule s(t), agent A(t) takes it as the input state and makes action a(t). After that, i(t) integrates both the input (traffic rules + one load-testing setting) and the output (optimized load-testing setting) of A(t) and triggers the application responses (QPS + failure rate) o(t).

Table 3 :
Agents and Interdependencies. The traffic rule settings are the first 7 listed in Table 1. Thread&Call represents two agents taking actions independently; Thread-Call or Call-Thread represents actions decided in order: the former agent takes its action first and passes it to the latter.

Table 4 :
Ranges of Traffic Rule, Thread, and Call Settings; Model evaluation metrics: MSE (Mean Square Error).

Table 5 :
Policy Evaluations: Sim. is the maximum rolling reward ratio in simulation, Val. is the maximum rolling reward ratio in validation, and *5 means 5 services are aggregated and communicative.

Table 6 :
Action Spaces

The parameter range setting for our data collection follows these criteria: 1) The failure rate and success rate distributions are relatively balanced over the whole dataset. 2) Some parameters are fixed to allow better optimization of the other parameters; Interval Time and Max Ejection Rate are constant in our experiments since there is no complex topology within the single httpbin service. 3) The settings are close to cases in industrial practice; the number of calls ranges from 50 to 2000, which covers a broad range of load testing.