TF 2.0 for Reinforcement Learning

Lessons

Introduction
Setting up your Reinforcement Learning Environment
Markov Decision Processes
Introduction to the OpenAI Gym Interface
$Q$-learning
Gym Wrappers
Function Approximation and TensorFlow
$Q$-learning with TensorFlow
Deep $Q$-learning
Rainbow - Improvements to Deep $Q$-learning
Policy Gradients
Advantage Actor-Critic (A2C)
Generalized Advantage Estimation (GAE)
Trust Region Policy Optimization (TRPO)
Proximal Policy Optimization (PPO)
Entropy
KL-Divergence
List of Important Papers
Neural Network Design