
RL Modules

Modules where we apply Reinforcement Learning to solve various real-world problems

📄️ Swarm Drones

Overview: Two application tracks are covered: (a) Defense/Search-and-Rescue – drones coordinate for formation flying, area surveillance, and threat neutralization; (b) Logistics/Delivery – drones cooperatively deliver packages while managing battery and routing. Using Microsoft AirSim (multi-drone support) and OpenAI Gym/PettingZoo environments, you will explore decentralized decision-making and advanced multi-agent RL algorithms such as QMIX and MADDPG. The program covers multi-agent coordination, cooperative vs. adversarial training, and robust engineering (per-agent monitoring, reproducible multi-agent training). By the end, you’ll demonstrate a swarm in simulation handling both defense and delivery tasks with learned policies.
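As a rough illustration of the multi-agent environment loop used throughout this module, the sketch below steps a PettingZoo parallel environment with random actions. The cooperative `simple_spread_v3` task is only a stand-in for an AirSim drone swarm, and the random policies are placeholders for QMIX/MADDPG agents.

```python
# Minimal sketch of the PettingZoo parallel API for multi-agent training.
# simple_spread_v3 is a stand-in cooperative task, not an AirSim drone environment.
from pettingzoo.mpe import simple_spread_v3

env = simple_spread_v3.parallel_env(N=3, max_cycles=50)
observations, infos = env.reset(seed=42)

while env.agents:
    # Random placeholder policies; QMIX/MADDPG would map each observation to an action.
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)

env.close()
```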

📄️ Self Driving Cars

Overview: This 180-day curriculum uses the CARLA simulator as the primary platform for autonomous driving RL. Trainees will start with imitation learning (behavioral cloning) and progress through deep RL algorithms (DDPG, SAC, PPO) to master lane following, obstacle avoidance, and intersection handling. Emphasis is on vision-based navigation, planning, and control integration, with engineering best practices (logging, CI/CD, reproducibility). By the end, you’ll deliver a real-time agent driving safely in a virtual city under varied weather and traffic conditions.
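Before any learning, trainees connect to the simulator through the CARLA Python client. The sketch below is a minimal connection-and-control loop under the assumption that a CARLA server is running on localhost:2000; the fixed throttle/steer values are placeholders for what a trained policy would output from camera observations.

```python
# Minimal sketch: connect to a running CARLA server, spawn a vehicle, apply one control.
import carla

client = carla.Client("localhost", 2000)  # assumes a local CARLA server on port 2000
client.set_timeout(10.0)
world = client.get_world()

# Spawn a vehicle at the first available spawn point.
blueprint = world.get_blueprint_library().filter("vehicle.*")[0]
spawn_point = world.get_map().get_spawn_points()[0]
vehicle = world.spawn_actor(blueprint, spawn_point)

try:
    # An RL policy would compute throttle/steer from sensor observations each tick.
    vehicle.apply_control(carla.VehicleControl(throttle=0.5, steer=0.0))
finally:
    vehicle.destroy()
```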

📄️ Chess Engine

Overview: This curriculum builds two chess engines: one learned purely via reinforcement learning self-play (inspired by AlphaZero), and one that mimics human play style (trained on human game datasets, akin to Maia Chess). The program starts with constructing a basic chess environment and a policy-gradient agent, then progresses to implementing Monte Carlo Tree Search (MCTS) and self-play training to create a strong engine. In parallel, it covers supervised learning on millions of human games to create a human-like engine. We emphasize training pipelines (self-play game generation, distributed training), evaluation (Elo, playing vs. Stockfish), and adjustable difficulty settings. The final deliverables are an RL-based chess engine that can play at various skill levels and a “mimic” engine that plays like a specific human or rating level.
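To make the self-play game-generation step concrete, here is a minimal sketch using the python-chess library. A random move picker stands in for the policy network and MCTS search; the real pipeline would record states and visit counts for training.

```python
# Minimal self-play game-generation sketch using python-chess.
# random.choice stands in for the policy/MCTS move selection described above.
import random
import chess

def play_one_game(max_moves=200):
    board = chess.Board()
    moves = []
    while not board.is_game_over() and len(moves) < max_moves:
        move = random.choice(list(board.legal_moves))  # policy network + MCTS would go here
        board.push(move)
        moves.append(move.uci())
    return moves, board.result(claim_draw=True)

moves, result = play_one_game()
print(f"{len(moves)} moves, result: {result}")
```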

📄️ Mahjong Engine

Overview: A 180-day curriculum to develop a Mahjong AI using reinforcement learning. The program will create a Mahjong environment (Japanese Riichi or a Chinese variant) and train a bot through self-play with policy/value networks, carefully shaped rewards, and rule-based guidance due to the game’s complexity (imperfect information, large action space, sparse rewards). Additionally, a mimicry system will be developed to learn a specific player’s style from game logs (e.g., how a particular player discards tiles). We will utilize frameworks like RLCard (which supports Mahjong) and possibly custom simulators to handle game logic. The final output is a Mahjong AI that can play at a reasonable level, plus an imitation model that replicates a human player’s strategy for discards and hand decisions.
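As a starting point, the sketch below runs one Mahjong hand in RLCard with random agents. Attribute names (`num_actions`, `num_players`) assume a recent RLCard release; trained policy/value agents would replace `RandomAgent` once self-play training is in place.

```python
# Minimal RLCard sketch: play one Mahjong hand with random agents.
import rlcard
from rlcard.agents import RandomAgent

env = rlcard.make("mahjong")
env.set_agents([RandomAgent(num_actions=env.num_actions) for _ in range(env.num_players)])

# trajectories holds per-player (state, action, reward) sequences usable for training;
# payoffs is the final score for each seat.
trajectories, payoffs = env.run(is_training=False)
print(payoffs)
```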