This project trains a reinforcement learning agent to capture Pac-Man, modeling the task as a Markov Decision Process (MDP) in a custom Gym (Gymnasium) environment. As a group, we implemented five reinforcement learning algorithms: Q-Learning, Deep Q-Learning, Actor-Critic, SARSA, and Dyna-Q. Each algorithm was trained and tested in the same pursuit-evasion scenario so that their performance could be compared directly. The system supports training, testing, and performance analysis for every model. Deep Q-Learning performed best, with a 99.77% win rate.
The goal of this project was to design and implement a reinforcement learning system that trains a ghost agent to efficiently pursue and capture Pac-Man in a simulated environment.
Built with Python, Gymnasium, PyGame, PyTorch, and NumPy.
- Custom Gym (Gymnasium) environment for Pac-Man pursuit-evasion simulation
- State and reward modeling based on a Markov Decision Process (MDP)
- Implementation of multiple RL algorithms (Q-Learning, SARSA, Dyna-Q, Deep Q-Learning, Actor-Critic)
- Training pipeline for comparing agent performance across models
- Real-time simulation visualized with PyGame
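
To illustrate how the pieces listed above fit together, here is a minimal sketch of a pursuit-evasion MDP with tabular Q-Learning and a win-rate measurement. It is not the project's code: `ChaseEnv`, `train_q_learning`, and `win_rate` are hypothetical names, and the 5x5 grid, the reward values, and the randomly moving Pac-Man are assumptions made for illustration. The class only mirrors Gymnasium's `reset`/`step` return convention and does not import the library.

```python
import random


class ChaseEnv:
    """Toy pursuit-evasion grid that mirrors Gymnasium's reset/step return values."""

    MOVES = [(0, -1), (0, 1), (-1, 0), (1, 0)]  # up, down, left, right

    def __init__(self, size=5, max_steps=30, seed=0):
        self.size, self.max_steps = size, max_steps
        self.rng = random.Random(seed)

    def _move(self, pos, action):
        dx, dy = self.MOVES[action]
        clip = lambda v: max(0, min(self.size - 1, v))  # stay inside the grid
        return (clip(pos[0] + dx), clip(pos[1] + dy))

    def reset(self):
        self.ghost, self.pacman, self.t = (0, 0), (self.size - 1, self.size - 1), 0
        return (self.ghost, self.pacman), {}

    def step(self, action):
        self.t += 1
        self.ghost = self._move(self.ghost, action)
        if self.ghost != self.pacman:
            # Assumed evader policy: Pac-Man takes a uniformly random move.
            self.pacman = self._move(self.pacman, self.rng.randrange(4))
        terminated = self.ghost == self.pacman  # capture ends the episode
        truncated = not terminated and self.t >= self.max_steps
        reward = 10.0 if terminated else -0.1  # assumed reward shaping
        return (self.ghost, self.pacman), reward, terminated, truncated, {}


def train_q_learning(env, episodes=3000, alpha=0.2, gamma=0.95, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-Learning; returns a dict of state -> action values."""
    rng = random.Random(seed)
    q = {}
    for _episode in range(episodes):
        state, _info = env.reset()
        done = False
        while not done:
            values = q.setdefault(state, [0.0] * 4)
            if rng.random() < epsilon:
                action = rng.randrange(4)
            else:
                action = max(range(4), key=values.__getitem__)
            next_state, reward, terminated, truncated, _info = env.step(action)
            next_values = q.setdefault(next_state, [0.0] * 4)
            # Bootstrap from the best next action unless the episode ended in a capture.
            target = reward + (0.0 if terminated else gamma * max(next_values))
            values[action] += alpha * (target - values[action])
            state, done = next_state, terminated or truncated
    return q


def win_rate(env, q, episodes=200):
    """Fraction of greedy evaluation episodes that end with Pac-Man captured."""
    wins = 0
    for _episode in range(episodes):
        state, _info = env.reset()
        done = False
        while not done:
            values = q.get(state, [0.0] * 4)
            action = max(range(4), key=values.__getitem__)
            state, _reward, terminated, truncated, _info = env.step(action)
            done = terminated or truncated
        wins += int(terminated)
    return wins / episodes
```

For example, `win_rate(ChaseEnv(seed=1), train_q_learning(ChaseEnv()))` returns the share of evaluation episodes that end in a capture. The same loop structure could be reused to compare other agents, which is roughly what a cross-model training pipeline needs.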
Demo.mp4
- Designed and implemented the Actor-Critic reinforcement learning model along with its training pipeline
- Tuned hyperparameters, tested performance, and analyzed training results for the Actor-Critic experiments
- Contributed to reward structure design to improve learning stability and agent performance
- Assisted in developing and refining game visuals, and supported data collection and analysis for evaluation
- Contributed to system integration, debugging, and final validation across all reinforcement learning models
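
As a rough illustration of the Actor-Critic idea mentioned in the list above, the sketch below applies a single one-step update with a softmax policy over action preferences (the actor) and a table of state values (the critic). It is a tabular simplification written for this summary, not the project's model: the function names, learning rates, and dictionary-based storage are all assumptions.

```python
import math


def softmax(prefs):
    """Convert action preferences into probabilities (numerically stable)."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_update(prefs, values, state, action, reward, next_state, done,
                        actor_lr=0.1, critic_lr=0.2, gamma=0.95):
    """Apply one one-step actor-critic update in place and return the TD error.

    prefs:  dict mapping state -> list of action preferences (actor, hypothetical layout)
    values: dict mapping state -> estimated state value (critic, hypothetical layout)
    """
    v_s = values.get(state, 0.0)
    v_next = 0.0 if done else values.get(next_state, 0.0)
    td_error = reward + gamma * v_next - v_s

    # Critic: move the state value toward the one-step TD target.
    values[state] = v_s + critic_lr * td_error

    # Actor: the gradient of log softmax is (1 - p) for the taken action
    # and (-p) for every other action, scaled here by the TD error.
    probs = softmax(prefs[state])
    for a, p in enumerate(probs):
        grad = (1.0 if a == action else 0.0) - p
        prefs[state][a] += actor_lr * td_error * grad
    return td_error
```

A positive TD error raises the probability of the action that was taken and lowers the others; a negative one does the opposite. A neural version would replace the two dictionaries with networks and apply the same TD error as the learning signal.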
This project was developed as part of a group-based academic assignment and is kept private in accordance with academic integrity policies. A demo showcasing the original system functionality is included.