Telemetry to solve dynamic analysis of a distributed system


Oleh V. Talaver
https://orcid.org/0000-0002-6752-2175
Tetiana A. Vakaliuk
https://orcid.org/0000-0001-6825-4697

Abstract

In modern software development, distributed solutions have become common because of the flexibility they offer large companies. The downside is that when such systems are developed, especially by many teams, global design problems may go unnoticed and slow down development, make errors harder to locate, or degrade overall system performance. In addition, the distributed nature of the architecture complicates a timely reaction to system degradation: manually configuring rules for reporting problematic situations is time-consuming and may still be incomplete, whereas automatic detection of possible system anomalies lets engineers (especially Software Reliability Engineers) focus on the actual problems. For this reason, applications that can dynamically analyse the system for problems have great potential. The use of telemetry for system analysis is actively studied and gaining traction, so further research is valuable. This work aims to demonstrate, both theoretically and practically, that telemetry can be used to analyse a distributed information system and to detect harmful architectural practices and anomalous events. To this end, the paper first gives a detailed overview of the problems related to the topic and of the feasibility of using telemetry; the next section briefly describes the history of monitoring systems and the key points of the latest OpenTelemetry standard, reviews popular application performance monitoring systems, and defines innovative features to be researched further.
The main part explains the approach used to collect and process telemetry, the reasoning behind the choice of Neo4j as the data store, a practical overview of graph theory algorithms that help analyse the collected data, and a description of how the PCA algorithm is employed to detect unusual situations in the system as a whole rather than in individual metrics. The results provide an example of using the presented software together with Neo4j Bloom to visualise and analyse data collected over several hours from the OpenTelemetry Demo test system. The last section contains additional remarks on the results of the study.
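
To illustrate what detecting anomalies in the system as a whole, rather than in individual metrics, can look like, the following is a minimal sketch that assumes PCA reconstruction error is used as the anomaly score. It is not the authors' implementation: the four synthetic metrics, the single latent "load" factor, the helper names and the 99th-percentile threshold are all assumptions made for the example.

```python
import numpy as np


def fit_pca(X, n_components):
    """Fit PCA on X (samples x metrics); return mean, scale and components."""
    mean = X.mean(axis=0)
    scale = X.std(axis=0)
    scale[scale == 0] = 1.0  # avoid division by zero for constant metrics
    Z = (X - mean) / scale
    # Rows of Vt are the principal directions of the standardised data.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    return mean, scale, Vt[:n_components]


def reconstruction_error(X, mean, scale, components):
    """Squared distance between each sample and its projection onto the PCA subspace."""
    Z = (X - mean) / scale
    projected = Z @ components.T @ components
    return np.sum((Z - projected) ** 2, axis=1)


rng = np.random.default_rng(0)

# Hypothetical "normal" telemetry: four metrics that all follow one latent load factor.
load = rng.normal(size=(500, 1))
weights = np.array([[1.0, 0.8, 0.6, 0.9]])
normal = load @ weights + 0.05 * rng.normal(size=(500, 4))

mean, scale, components = fit_pca(normal, n_components=1)

# Assumed threshold: the 99th percentile of the error seen on normal data.
threshold = np.quantile(reconstruction_error(normal, mean, scale, components), 0.99)

# Both samples keep every metric within its usual range,
# but the second one breaks the usual correlation between metrics.
typical = np.array([[1.0, 0.8, 0.6, 0.9]])
broken = np.array([[1.0, -0.8, 0.6, -0.9]])
errors = reconstruction_error(np.vstack([typical, broken]), mean, scale, components)
is_anomaly = errors > threshold
```

In this sketch a per-metric threshold would not flag the second sample, because each value on its own looks ordinary; the reconstruction error flags it because the combination of values does not match the correlation structure learned from normal data.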

How to Cite
Talaver, O.V. and Vakaliuk, T.A., 2024. Telemetry to solve dynamic analysis of a distributed system. Journal of Edge Computing [Online], 3(1), pp.87–109. Available from: https://doi.org/10.55056/jec.728 [Accessed 9 February 2025].

Received 2024-04-21
Accepted 2024-05-10
Published 2024-05-21
