Investigating the effect of virtual machine migration accounting on reliability using a cluster model
Abstract
The purpose of the article is to develop and verify, by means of mathematical modeling, a software method for deploying a fault-tolerant computing cluster that runs a virtual machine. The cluster consists of two physical servers (main and backup), on which a distributed data storage system with synchronous data replication from the source server to the backup server is deployed. To this end, a computational experiment is conducted in Mathcad on a model of the fault-tolerant cluster that neglects the cost of virtual machine migration during recovery. Combining computing resources into clusters is a way to ensure high reliability, fault tolerance, and continuity of the computing process in computer systems. This is achieved through virtualization, which makes it possible to move virtual resources, services, or applications between physical servers while keeping the computing process running. The study focuses on a failover cluster composed of two physical servers (primary and backup) connected through a switch, each with a local hard disk. A distributed storage system with synchronous data replication from the source server to the backup server is deployed on the servers' local disks, and a virtual machine runs on the cluster. The mathematical apparatus of the failover cluster model comprises Markov processes, event flows, and Kolmogorov systems of differential equations. To ensure the continuity of the computing process if the main server fails, a shadow copy of the virtual machine is launched on the backup server. The reliability of the failover cluster is measured by the non-stationary availability coefficient.
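For orientation, the Kolmogorov equations referred to above can be written in a generic form for a continuous-time Markov chain. This is a standard textbook statement, not the article's own state graph: the state probabilities p_i(t), the transition rates q_ij, the number of states n, and the set U of operable states are notation assumed here for illustration only.

```latex
\begin{aligned}
\frac{dp_i(t)}{dt} &= -\,p_i(t)\sum_{j \ne i} q_{ij} \;+\; \sum_{j \ne i} q_{ji}\,p_j(t),
  \qquad i = 1,\dots,n,\\[4pt]
\sum_{i=1}^{n} p_i(t) &= 1,
  \qquad
K(t) = \sum_{i \in U} p_i(t).
\end{aligned}
```

Under this notation, the non-stationary availability coefficient K(t) is the probability that the system is in any operable state at time t, given the initial distribution p_i(0).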
A Markov model is proposed to assess the reliability of the failover cluster, taking into account the cost of migrating virtual machines and the mechanisms that keep the computing process running when one physical server fails. The memory migration process maintains two copies of the virtual machine on different physical servers, so that if one server fails, the virtual machine continues running on the other. A simplified model of the failover cluster neglects the cost of migrating virtual machines and provides an upper estimate of reliability. The study shows that the reliability of a failover cluster, as measured by the non-stationary availability coefficient, is significantly affected by the virtual machine migration process. The findings can inform the choice of technology for ensuring fault tolerance and continuity of the computing process in computer systems with a cluster architecture. The calculations lead to the conclusion that accounting for virtual machine migration has a significant impact on reliability. The calculation was performed for the following failure rates of the server, disk, and switch: λ0 = 1.115×10⁻⁵ 1/h, λ1 = 3.425×10⁻⁶ 1/h, λ2 = 2.3×10⁻⁶ 1/h, with the corresponding recovery rates μ0 = 0.33 1/h, μ1 = 0.17 1/h, μ2 = 0.33 1/h. The synchronization rates of the distributed storage system are μ3 = 1 1/h and μ4 = 2 1/h. The difference between the non-stationary availability coefficients of the cluster is d = K2(t) – K1(t) = 2.7×10⁻¹⁰.
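To make the notion of a non-stationary availability coefficient concrete, the sketch below integrates the Kolmogorov equations for a deliberately simplified case: a single repairable server with two states (up, down), using only the server failure and recovery rates quoted in the abstract. This is not the article's cluster model, whose state graph is not given here. The function names, the fourth-order Runge-Kutta integrator, and the step size are assumptions chosen for illustration.

```python
import math

# Server failure and recovery rates from the abstract, in 1/h.
LAMBDA0 = 1.115e-5
MU0 = 0.33

# Hypothetical two-state chain: state 0 = server up, state 1 = server down.
# RATES[i][j] is the transition rate from state i to state j.
RATES = [[0.0, LAMBDA0],
         [MU0, 0.0]]


def kolmogorov_rhs(p, rates):
    """Right-hand side dp/dt of the Kolmogorov forward equations."""
    dp = [0.0] * len(p)
    for i, row in enumerate(rates):
        for j, rate in enumerate(row):
            if i != j:
                flow = rate * p[i]  # probability flow from state i to j
                dp[i] -= flow
                dp[j] += flow
    return dp


def integrate(p0, rates, t_end, dt=0.01):
    """Integrate the state probabilities from 0 to t_end with classical RK4."""
    p = list(p0)
    for _ in range(int(round(t_end / dt))):
        k1 = kolmogorov_rhs(p, rates)
        k2 = kolmogorov_rhs([x + 0.5 * dt * k for x, k in zip(p, k1)], rates)
        k3 = kolmogorov_rhs([x + 0.5 * dt * k for x, k in zip(p, k2)], rates)
        k4 = kolmogorov_rhs([x + dt * k for x, k in zip(p, k3)], rates)
        p = [x + dt / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


def availability_numeric(t, dt=0.01):
    """K(t): probability of being in the 'up' state, starting from 'up'."""
    return integrate([1.0, 0.0], RATES, t, dt)[0]


def availability_closed_form(t):
    """Known analytic solution of the two-state model, used as a check."""
    s = LAMBDA0 + MU0
    return MU0 / s + LAMBDA0 / s * math.exp(-s * t)


if __name__ == "__main__":
    for t in (0.0, 1.0, 10.0, 50.0):
        print(t, availability_numeric(t), availability_closed_form(t))
```

The same `integrate` routine accepts any square rate matrix, so a larger state graph (for example, one that adds disk, switch, and synchronization states) could be evaluated by replacing `RATES` and summing the probabilities of the operable states.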
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License.
Accepted 2023-05-01
Published 2023-05-06