Improved Q-learning Algorithm to Solve the Permutation Flow Shop Scheduling Problem


  • Zifeng Xuan
  • Yunfei Liu
  • Xinxin Peng



Keywords: Q-learning Algorithm, Boltzmann Action Exploration Strategy, Permutation Flow Shop


A modified Q-learning algorithm is proposed for the permutation flow shop scheduling problem. The algorithm initializes the environment with the job sequence and treats each processable job as an executable action. The reward function is defined as the reciprocal of the completion time. The completion time is calculated using the principle of diagonalization of a two-dimensional matrix, which significantly improves computational efficiency. A Boltzmann action exploration strategy is designed: as the temperature coefficient T decreases, the probability of selecting an action at random decreases, so that actions with larger Q values are increasingly favored. Finally, the performance of the proposed algorithm is validated on permutation flow shop scheduling instances of different scales. Comparison of the results with standard instances and with other algorithms demonstrates the accuracy of the algorithm.
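The three ingredients named in the abstract (completion time, reciprocal reward, Boltzmann exploration) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the completion time here uses the standard job-by-job recurrence rather than the diagonal traversal the abstract mentions, and the function names, the `proc_times[job][machine]` layout, and the example temperatures are all assumptions made for illustration.

```python
import math
import random


def makespan(proc_times, sequence):
    """Completion time of the last job on the last machine.

    Assumed layout: proc_times[job][machine]. Uses the textbook
    recurrence C(i, m) = max(C(i-1, m), C(i, m-1)) + p(i, m),
    not the diagonal traversal described in the abstract.
    """
    n_machines = len(proc_times[0])
    finish = [0] * n_machines  # finish[m]: when machine m becomes free
    for job in sequence:
        prev = 0  # completion time of this job on the previous machine
        for m in range(n_machines):
            start = max(finish[m], prev)
            finish[m] = start + proc_times[job][m]
            prev = finish[m]
    return finish[-1]


def reward(proc_times, sequence):
    """Reward as the reciprocal of the completion time (as in the abstract)."""
    return 1.0 / makespan(proc_times, sequence)


def boltzmann_probs(q_values, temperature):
    """Softmax over Q values; a lower temperature favors larger Q values."""
    q_max = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - q_max) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, rng=random):
    """Sample an action index according to the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


# Hypothetical 3-job, 2-machine instance (values invented for illustration).
example_times = [[3, 2], [1, 4], [2, 1]]
hot = boltzmann_probs([1.0, 2.0], temperature=100.0)   # close to uniform
cold = boltzmann_probs([1.0, 2.0], temperature=0.1)    # close to greedy
```

With a high temperature the two actions are chosen almost uniformly, while with a low temperature nearly all probability mass falls on the action with the larger Q value, which is the behavior the abstract attributes to decreasing T.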


How to Cite

Xuan, Z., Liu, Y., & Peng, X. (2024). Improved Q-learning Algorithm to Solve the Permutation Flow Shop Scheduling Problem. International Journal of Mechanical and Electrical Engineering, 2(3), 63-68.

