Safe and Efficient Deep Reinforcement Learning Control Model: A Hydroponics Case Study
Authors: Almutasim Billa A. Alanazi, Hal S. Tharp
Abstract:
Safe operation and efficient energy consumption are essential factors in the design of a control system. This paper presents a reinforcement learning (RL) model that can be applied to control applications to improve safety and reduce energy consumption. Because hardware constraints and environmental disturbances are imprecise and unpredictable, conventional control methods are not always effective at optimizing a control design. RL, however, has proven valuable in several artificial intelligence (AI) applications, particularly in the field of control systems. The proposed model monitors a system's success by observing the rewards received from the environment, counting a positive reward as a success when the controlled reference lies within the desired operating zone. The model can therefore determine whether the system is safe to continue operating according to designer/user specifications, which can be adjusted as needed. In addition, the controller tracks energy consumption and improves energy efficiency by entering an idle mode whenever the controlled reference is within the desired operating zone, reducing the system's energy consumption during operation. Water-temperature control for a hydroponic system serves as a case study, with the variance of the disturbances adjusted to demonstrate the model's robustness and efficiency. On average, the model improved safety by up to 15% and energy efficiency by 35%-40% compared with a traditional RL model.
Keywords: Control system, hydroponics, machine learning, reinforcement learning.
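The two mechanisms the abstract describes (counting in-zone rewards as successes to judge whether the system is safe to continue, and switching to an idle mode while the controlled reference is in-zone to save energy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the WaterTempEnv class, the zone bounds, the power figures, and the SAFETY_THRESHOLD are hypothetical, and a simple proportional policy stands in for the trained RL agent.

```python
import random

# Illustrative constants -- assumed values, not taken from the paper.
ZONE_LOW, ZONE_HIGH = 18.0, 22.0      # desired water-temperature zone (deg C)
SAFETY_THRESHOLD = 0.85               # designer/user-specified minimum success rate
IDLE_POWER, ACTIVE_POWER = 0.1, 1.0   # relative energy cost per control step

class WaterTempEnv:
    """Toy hydroponic water-temperature environment (hypothetical)."""
    def __init__(self, start_temp=20.0, disturbance_std=0.5):
        self.temp = start_temp
        self.disturbance_std = disturbance_std  # variance knob for robustness tests

    def step(self, heater_delta):
        # Apply the control action plus a random environmental disturbance.
        self.temp += heater_delta + random.gauss(0.0, self.disturbance_std)
        in_zone = ZONE_LOW <= self.temp <= ZONE_HIGH
        reward = 1.0 if in_zone else -1.0  # a positive reward counts as a success
        return self.temp, reward, in_zone

def run_episode(env, policy, steps=200):
    successes, energy = 0, 0.0
    for _ in range(steps):
        if ZONE_LOW <= env.temp <= ZONE_HIGH:
            action = 0.0              # idle mode: no actuation while in-zone
            energy += IDLE_POWER
        else:
            action = policy(env.temp)
            energy += ACTIVE_POWER
        _, reward, _ = env.step(action)
        if reward > 0:
            successes += 1
    success_rate = successes / steps
    safe = success_rate >= SAFETY_THRESHOLD  # safe-to-continue check
    return success_rate, energy, safe

# Proportional fallback policy standing in for a trained RL agent.
policy = lambda temp: 0.5 if temp < ZONE_LOW else -0.5
rate, energy, safe = run_episode(WaterTempEnv(), policy)
print(f"success rate={rate:.2f}, energy={energy:.1f}, safe to continue={safe}")
```

Raising disturbance_std in this sketch mimics the paper's robustness experiment: a larger disturbance variance pushes the temperature out of the zone more often, lowering the success rate and increasing the time spent in the higher-power active mode.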