The remarkable recent advances in Large Language Models (LLMs) and autonomous AI systems are closely tied to Reinforcement Learning (RL), a powerful training framework in which AI systems learn optimal strategies through reward-driven interactions. The approach combines mathematical elegance with practical utility. In this project, students will explore foundational concepts such as value-based and policy-based optimization, as well as deep reinforcement learning, gaining both the theoretical insight and the practical skills essential for building intelligent, adaptive systems.
Strong knowledge of Python. Familiarity with basic probability theory.