Reinforcement Learning Tic Tac Toe with Value Function
A reinforcement learning algorithm for agents to learn tic-tac-toe using a value function
machine-learning reinforcement-learning javascript tutorial article code

At any state of play other than a terminal state (where a win, loss, or draw is recorded), the agent takes an action that leads to the next state. That action may not yield an immediate reward, but it moves the agent one step closer to receiving one.
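The state, action, and reward relationship described above can be sketched in JavaScript. This is a minimal illustration, not the original project's code: the board encoding (an array of nine cells), the helper names `outcome`, `reward`, and `applyMove`, and the reward values (1 for a win, 0.5 for a draw, 0 otherwise) are all assumptions made for the example.

```javascript
// Hypothetical board encoding: an array of 9 cells, each 'X', 'O', or null.
const WIN_LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8], // rows
  [0, 3, 6], [1, 4, 7], [2, 5, 8], // columns
  [0, 4, 8], [2, 4, 6],            // diagonals
];

// Returns 'X' or 'O' if that player has three in a row, 'draw' if the
// board is full with no winner, or null while the game is in progress.
function outcome(board) {
  for (const [a, b, c] of WIN_LINES) {
    if (board[a] !== null && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return board.every((cell) => cell !== null) ? 'draw' : null;
}

// Reward from `player`'s point of view. Non-terminal states give no
// reward; the terminal values here are assumed for illustration.
function reward(board, player) {
  const result = outcome(board);
  if (result === null) return 0;
  if (result === 'draw') return 0.5;
  return result === player ? 1 : 0;
}

// Taking an action produces the next state without mutating the current one.
function applyMove(board, index, player) {
  const next = board.slice();
  next[index] = player;
  return next;
}
```

For instance, `applyMove(Array(9).fill(null), 4, 'X')` gives a non-terminal state whose `reward` is 0, which is the "no reward yet, but one move closer" situation the paragraph describes.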

The value function estimates the value of being in a given state: the probability of receiving a future reward from that state.
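One simple way to hold such a value function is a lookup table from board states to estimated values. The sketch below is illustrative only: the `ValueTable` class, its string key format, and the initial values (1 for a won state, 0 for a lost state, 0.5 for draws and for states not yet seen) are assumptions, not details taken from the original project.

```javascript
const LINES = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

// Returns 'X' or 'O' if a player has three in a row, otherwise null.
function winner(board) {
  for (const [a, b, c] of LINES) {
    if (board[a] !== null && board[a] === board[b] && board[a] === board[c]) {
      return board[a];
    }
  }
  return null;
}

// Hypothetical tabular value function for one player ('X' or 'O').
class ValueTable {
  constructor(player) {
    this.player = player;
    this.values = new Map(); // state key -> estimated value
  }

  // Serialize a board (array of 9 cells: 'X', 'O', or null) into a string key.
  key(board) {
    return board.map((cell) => (cell === null ? '-' : cell)).join('');
  }

  // Assumed starting estimates: certain win 1, certain loss 0, else 0.5.
  initialValue(board) {
    const w = winner(board);
    if (w === this.player) return 1;
    if (w !== null) return 0;
    return 0.5;
  }

  // Look up a state's value, creating the initial estimate on first visit.
  get(board) {
    const k = this.key(board);
    if (!this.values.has(k)) {
      this.values.set(k, this.initialValue(board));
    }
    return this.values.get(k);
  }

  set(board, value) {
    this.values.set(this.key(board), value);
  }
}
```

A table works here because tic-tac-toe has only a few thousand reachable positions; larger games would typically need a function approximator instead.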

The value of each state is updated in reverse chronological order through the state history of a game. With enough training, using both exploration and exploitation strategies, the agent's estimates approach the true value of each state in the game.
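The backward update and the explore/exploit choice can be sketched as follows. This assumes a common temporal-difference style rule, V(s) ← V(s) + α (V(s') − V(s)), and an epsilon-greedy policy; the original project's exact rule may differ. The function names, the `alpha`, `epsilon`, and `defaultValue` parameters, and the `{ move, nextState }` candidate shape are hypothetical.

```javascript
// Walk the visited states from last to first. The last state is pulled
// toward the final reward; each earlier state is pulled toward the
// freshly updated value of the state that followed it.
// `values` is a Map from state key to value; `history` is an array of keys.
function backupValues(values, history, finalReward, alpha = 0.1, defaultValue = 0.5) {
  let target = finalReward;
  for (let i = history.length - 1; i >= 0; i--) {
    const state = history[i];
    const current = values.has(state) ? values.get(state) : defaultValue;
    const updated = current + alpha * (target - current);
    values.set(state, updated);
    target = updated;
  }
  return values;
}

// Epsilon-greedy selection over candidate moves.
// With probability `epsilon` pick a random candidate (explore);
// otherwise pick the one whose next state has the highest value (exploit).
// `random` is injectable so the behaviour can be made deterministic.
function chooseMove(candidates, values, epsilon = 0.1, random = Math.random, defaultValue = 0.5) {
  if (random() < epsilon) {
    return candidates[Math.floor(random() * candidates.length)];
  }
  let best = candidates[0];
  let bestValue = -Infinity;
  for (const candidate of candidates) {
    const v = values.has(candidate.nextState)
      ? values.get(candidate.nextState)
      : defaultValue;
    if (v > bestValue) {
      best = candidate;
      bestValue = v;
    }
  }
  return best;
}
```

With `alpha = 0.5`, a three-state history that ends in a win (reward 1) moves the values from 0.5 to 0.75 for the last state, 0.625 for the middle one, and 0.5625 for the first, so credit for the win fades as it propagates back toward the opening moves.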

Author: @jinglescode, PhD student, applying machine learning in healthcare