Distributed System Computing Resource Scheduling Algorithm Based on Deep Reinforcement Learning
Authors: Yitao Lei, Xingxiang Zhai, Burra Venkata Durga Kumar
Abstract:
As the quantity and complexity of computation in large-scale software systems increase, distributed computing becomes increasingly important. A distributed system achieves high-performance computing through collaboration between different computing resources. Without efficient resource scheduling, the misuse of distributed computing may cause resource waste and high costs. However, resource scheduling is usually an NP-hard problem, so no general, efficient exact solution is known. Optimization algorithms such as the genetic algorithm and ant colony optimization exist, but the large scale of distributed systems makes these traditional optimization algorithms challenging to apply. Heuristic and machine learning algorithms are usually applied in this situation to ease the computing load. We therefore review traditional resource scheduling optimization algorithms and introduce a deep reinforcement learning method that combines the perceptual ability of neural networks with the decision-making ability of reinforcement learning. Using this machine learning method, we try to identify the important factors that influence the performance of distributed system computing and to help the distributed system schedule computing resources efficiently. This paper surveys the application of deep reinforcement learning to distributed system computing resource scheduling and proposes a deep reinforcement learning method that uses a recurrent neural network to optimize resource scheduling. The paper concludes with the challenges and improvement directions for deep reinforcement learning-based resource scheduling algorithms.
Keywords: Resource scheduling, deep reinforcement learning, distributed system, artificial intelligence.