Agent-based Decision Making and Control of Manufacturing System Considering the Joint Production, Maintenance, and Quality by Reinforcement Learning

Mohammad Reza Nazabadi; Seyed Esmaeil Najafi; Ali Mohaghar; Farzad Movahedi Sobhani

doi:10.31181/dmame712024885

Authors

Mohammad Reza Nazabadi Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran https://orcid.org/0000-0003-2127-1507
Seyed Esmaeil Najafi Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran https://orcid.org/0000-0002-8734-5436
Ali Mohaghar Faculty of Management, University of Tehran, Tehran, Iran https://orcid.org/0000-0002-9844-1714
Farzad Movahedi Sobhani Department of Industrial Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran https://orcid.org/0000-0002-4602-2710

DOI:

https://doi.org/10.31181/dmame712024885

Keywords:

Reinforcement learning, Agent-based modeling, Production Planning, Maintenance, Quality control, Real-time decision making

Abstract

An integrated approach to production, maintenance, and quality control in manufacturing systems is crucial because these functions are strongly interdependent; investigating them in isolation may lead to infeasible solutions. This research focuses on real-time, autonomous decision-making for the joint production planning, maintenance, and quality problem in a stochastically deteriorating production system with limited maintenance activities. The problem is formulated as a continuous-time semi-Markov decision process, which captures the complexity of a real production system and the occurrence of events at irregular intervals in continuous time. Dynamic programming is a common tool for joint optimization problems, but it has limitations such as the curse of dimensionality. In this study, the policy of the decision-making agent is obtained with a goal-directed machine learning method called R-SMART, combined with agent-based modeling. To the authors' knowledge, the proposed approach is novel, and there is little research on such an implementation of the joint optimization problem. The learned policy is evaluated against heuristic and simulation-optimization methods in various scenarios. The results demonstrate that the proposed RL-based method outperforms the others in most scenarios, achieving a stable, integrated policy.
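As a rough illustration of the kind of learning rule the abstract refers to, the sketch below runs an average-reward Q-learning update for a semi-Markov decision process, in the style of relaxed-SMART, on a toy deteriorating machine. It is not the authors' model or implementation. The four wear levels, the two actions, all rewards and sojourn times, the step-size schedules, and the contraction factor `ETA` are assumptions chosen only to keep the example small.

```python
import random

# Hypothetical toy machine: wear levels 0..3 (3 = failed).
# Actions: 0 = produce, 1 = preventive maintenance.
N_STATES, N_ACTIONS = 4, 2
ETA = 0.99  # assumed contraction factor that keeps Q-values bounded


def step(state, action, rng):
    """Return (next_state, reward, sojourn_time) for the toy machine."""
    if state == N_STATES - 1:
        # Failed machine: a costly, slow corrective repair is forced.
        return 0, -20.0, rng.expovariate(1 / 5.0)
    if action == 1:
        # Preventive maintenance: cheaper and faster, restores wear level 0.
        return 0, -5.0, rng.expovariate(1 / 2.0)
    # Produce: revenue falls with wear, and the machine may degrade.
    nxt = state + 1 if rng.random() < 0.3 else state
    return nxt, 10.0 - 3.0 * state, rng.expovariate(1.0)


def train(n_steps=50_000, seed=0, eps=0.1):
    """Learn Q-values and an average-reward estimate rho by simulation."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]
    rho, total_r, total_t = 0.0, 0.0, 0.0
    state = 0
    for k in range(1, n_steps + 1):
        alpha = 150.0 / (300.0 + k)  # assumed step size for Q
        beta = 10.0 / (100.0 + k)    # assumed step size for rho
        greedy = max(range(N_ACTIONS), key=lambda a: q[state][a])
        action = rng.randrange(N_ACTIONS) if rng.random() < eps else greedy
        nxt, r, t = step(state, action, rng)
        # SMDP target: reward minus the opportunity cost of elapsed time.
        target = r - rho * t + ETA * max(q[nxt])
        q[state][action] += alpha * (target - q[state][action])
        if action == greedy:
            # rho tracks cumulative reward per unit time on greedy moves only.
            total_r += r
            total_t += t
            if total_t > 0:
                rho = (1 - beta) * rho + beta * total_r / total_t
        state = nxt
    policy = [max(range(N_ACTIONS), key=lambda a: q[s][a])
              for s in range(N_STATES)]
    return q, rho, policy


if __name__ == "__main__":
    _, rho, policy = train()
    print("estimated average reward per unit time:", round(rho, 2))
    print("greedy action per wear level:", policy)
```

The term `rho * t` is what distinguishes the semi-Markov case from an ordinary Markov decision process: each transition is charged for the time it consumes, so events that occur at irregular intervals are compared on a reward-per-unit-time basis.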
