Hybrid AI example with Java, TicTacToe Reinforcement-Learning and NN
Training an AI player for the Tic Tac Toe game.
A simple example combining reinforcement learning with a neural network.
I want to train the AI player so that it can beat an opponent that plays randomly.
The training method is simple.
During training the AI also plays randomly. After the game finishes, a positive reward is given if it wins, or a negative reward if it loses.
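The training loop described above can be sketched roughly as follows. This is only a minimal illustration, not the project's actual code: the class name RandomPlayout, the reward values +1, -1 and 0 (for a draw), and the choice that the AI moves first are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch: one game in which both sides pick random free squares.
final class RandomPlayout {
    static final int EMPTY = 0, HERO = 1, VILLAIN = 2;

    // The eight winning lines of a 3x3 board, as board indices 0..8.
    static final int[][] LINES = {
        {0, 1, 2}, {3, 4, 5}, {6, 7, 8},
        {0, 3, 6}, {1, 4, 7}, {2, 5, 8},
        {0, 4, 8}, {2, 4, 6}
    };

    // Returns HERO or VILLAIN if that player owns a full line, else EMPTY.
    static int winner(int[] board) {
        for (int[] line : LINES) {
            int p = board[line[0]];
            if (p != EMPTY && p == board[line[1]] && p == board[line[2]]) {
                return p;
            }
        }
        return EMPTY;
    }

    // Plays one random game and returns the winner (EMPTY means a draw).
    // Assumption: the hero (AI) moves first.
    static int playRandomGame(Random rng) {
        int[] board = new int[9];
        int player = HERO;
        for (int move = 0; move < 9; move++) {
            List<Integer> free = new ArrayList<>();
            for (int i = 0; i < 9; i++) {
                if (board[i] == EMPTY) free.add(i);
            }
            board[free.get(rng.nextInt(free.size()))] = player;
            int w = winner(board);
            if (w != EMPTY) return w;
            player = (player == HERO) ? VILLAIN : HERO;
        }
        return EMPTY;
    }

    // Assumed reward values: +1 for a win, -1 for a loss, 0 for a draw.
    static double reward(int gameWinner) {
        if (gameWinner == HERO) return 1.0;
        if (gameWinner == VILLAIN) return -1.0;
        return 0.0;
    }
}
```

A training run would call playRandomGame repeatedly and pass reward(...) of the result to the learning step.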
The input size for the NN is 18, which represents the state of the board: 9 inputs for the AI's pieces and 9 for the opponent's pieces. A value of 0 means no piece and 1 means a piece at that index. The hidden layer also has 18 neurons.
The output size is 9, one value per board position. The output with the maximum value is the preferred action. Back propagation is applied to only one output neuron, the one for the chosen action.
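As an illustration of the encoding and action selection, here is a small sketch. The names BoardEncoding, encode, preferredAction and outputError are hypothetical. Skipping occupied squares when picking the maximum is an assumption added for the example; the text above only says that the maximum output is the preferred action.

```java
// Hypothetical sketch of the 18-value input and the 9-value output handling.
final class BoardEncoding {
    static final int EMPTY = 0, HERO = 1, VILLAIN = 2;

    // Inputs 0..8 mark the AI's pieces, inputs 9..17 the opponent's pieces.
    static double[] encode(int[] board) {
        double[] input = new double[18];
        for (int i = 0; i < 9; i++) {
            if (board[i] == HERO) {
                input[i] = 1.0;
            } else if (board[i] == VILLAIN) {
                input[9 + i] = 1.0;
            }
        }
        return input;
    }

    // Index of the largest output among free squares, or -1 if the board is full.
    // Assumption: occupied squares are skipped so the move is always legal.
    static int preferredAction(double[] output, int[] board) {
        int best = -1;
        for (int i = 0; i < 9; i++) {
            if (board[i] != EMPTY) continue;
            if (best < 0 || output[i] > output[best]) best = i;
        }
        return best;
    }

    // Error vector for back propagation: non-zero only at the chosen action,
    // so the other eight output neurons receive no error signal.
    static double[] outputError(double[] output, int action, double target) {
        double[] error = new double[output.length];
        error[action] = target - output[action];
        return error;
    }
}
```

The error vector is what a back propagation routine would receive for the output layer; how the weights are then updated depends on the network implementation.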
Because I don’t know the next state at the moment the AI makes its move, the states are ‘delayed’ in the Q-learning algorithm.
It uses the previous state and the current state in place of the current state and the next state.
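One way to picture the delayed update is the sketch below. It assumes the standard Q-learning target (reward plus the discounted maximum Q-value of the following state) and simply shifts it back by one step: the AI remembers its previous state and action, and only when it sees the current state does it compute the target for that previous move. The class name DelayedQUpdate, the discount factor gamma and the zero reward for non-final moves are assumptions for the example, not details taken from the project.

```java
// Hypothetical sketch of a one-step-delayed Q-learning target.
final class DelayedQUpdate {
    private double[] prevState;   // null until the AI has made a move
    private int prevAction = -1;

    // Called on each AI turn with the network outputs for the current state.
    // Returns the target for (prevState, prevAction), or NaN on the first turn.
    double onHeroTurn(double[] currentState, double[] currentQ,
                      int chosenAction, double gamma) {
        double target = Double.NaN;
        if (prevState != null) {
            // Assumed: non-final moves earn no reward, so only the
            // discounted best value of the current state remains.
            target = gamma * max(currentQ);
        }
        prevState = currentState;
        prevAction = chosenAction;
        return target;
    }

    // Called when the game is over: the last move's target is the final reward.
    double onGameEnd(double reward) {
        prevState = null;
        prevAction = -1;
        return reward;
    }

    double[] previousState() { return prevState; }

    int previousAction() { return prevAction; }

    private static double max(double[] values) {
        double m = values[0];
        for (double v : values) m = Math.max(m, v);
        return m;
    }
}
```

The caller would read previousState() and previousAction() before each call, then train the single output neuron for that action towards the returned target.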
Winning results after 70,000 training games:
1000 games played.
Hero (ai) = 616
Villain = 304
Draw = 80
Hero (ai) = 475
Villain = 449
Draw = 76
Hero (ai) = 807
Villain = 109
Draw = 84
Eclipse project download URL options
Local download (rename the doc file to a zip file after downloading)