Tuesday, March 31, 2020

NN --> BT


I have started sketching more seriously on ideas for converting a neural network output to a solid behavior tree (BT). For reference, here is the GitHub repo I'm working in https://github.com/marcus1337/BTMaker. The repo is a bad brew of different GitHub repos + the new code that I am making (in the /src folder). So, it might be hard to try it for yourself. In short it uses "BehaviorTree.CPP", my NEAT project and some non-official boost plugin "Levenshtein-distance".

Anyways the number of output nodes from the NN can be adjusted before the code starts. My idea so far is to use 2 output nodes per BT node. The first of 2 output nodes classify which type of node it is, I.e. None, Action, Interior or Condition. The 2nd output node then maps to an integer between [0, #nodes] telling which node implementation from an array should be used. By assuming that all nodes’ positions in the BT go from left to right, I will just define the outputs position from left to right iteratively. I will let the root node always be a sequence (interior) node to make things easier. Next I will iterate all output nodes one at a time, if it is an interior node I "step into it" and continue from there. If not, I just add the node to the rightmost position possible given the currently selected interior node. However, a special case is if the node is a condition, then it will be placed 1 step to the left of whatever the current position is, and if that means breaking out of the currently selected interior node, then so be it. Finally, I let an evaluation function check if it is a valid BT, if not I will probably just trash it completely and give the NN a bad reward value. Each time a node has the "none" value, that means to pop out of the current interior to whatever the parent is, one step to the right. This means that if the last output node is an interior node, then the produced tree won’t be valid. I plan to call a tree finished once "none" is called while the root node is selected. Ideally one would not want to have too many output nodes in a NN, or too few nodes when classifying something. I feel like this plan to create a BT from a NN makes decent compromises between having a rational amount of output nodes with the fact that I will need to classify the output. So instead of using 4 output nodes per classification I will try with only 1.

For initial tests I wish to just train a network with a bias to output any valid BT structure and see if it can do so consistently. Naturally I will use my NEAT implementation to give rewards to networks that produce valid BTs and eliminate bad networks.

Tuesday, March 10, 2020

Niching

While that I felt that my C++ NEAT implementation technically works in most areas, I still had to fix the problem handling complexifying. Complexifying is the process how keeping innovations when developing the neural networks. So far, I have only tried to blindly implement the solution described by this online paper https://www.cs.cmu.edu/afs/cs/project/jair/pub/volume21/stanley04a-html/node3.html#SECTION00033000000000000000.

The papers suggestion is making my code way slower though, and also, the paper is from 1999. This makes me wonder if there exists better genetic algorithms today. I will probably look into other ways to save innovation. But if the extra slowness is disregarded I guess then main remaining problems include deciding on specific parameters used in the code as well as adding more ways to create new generations. Examples being different elitism methods or roulette-selection. Finally when all these sub-issues are resolved good enough I must get better acquainted with behavor trees, which will play a large role in my thesis work.

Thursday, March 5, 2020

Starting Out / Idea

I have a thesis idea which both an examiner as well as a supervisor at KTH have agreed to oversee. Officially, the thesis work has not started. The starting phase will be in "period 4", which earliest is on March 16. Most likely, my idea will be okay so I may as well start with the work now, rather than waiting for an official notice. The basis for the thesis idea will likely remain unchanged, altough the direction may go towards some specific area.

Title: Adapting Behavior Trees using NEAT

Description:

Behavior trees are used a lot in the video games industry for creating AIs. A BT consist of nodes, also known as leaves. The leaves contain code that execute arbitrary actions. In a BT all leaves are executed orderly, unless certain conditions make some be skipped. It can go fast for developers to create trivial BTs manually. Only utilizing manual tweaking sets limitations to AIs. The reason is that it’s impossible to hard-code actions to handle all possible events that can occur in most games. Recently, attempts have been made to mix the process of hard-coding BTs with machine learning. This is done as a compromise between hard-coding and handling as many different events as possible. In this thesis I want to use a genetic algorithm called NEAT with pre-defined leaves to automatically generate advanced BTs. How well the final solution solves a set of problems will be measured, along with other performance tests. By gathering data, it becomes possible to compare the prototype solution made for my thesis with other AIs and find out whether there are pragmatic reasons for favoring the use of NEAT when making BTs.

Connection to existing research/development:

The research is interesting to people that create AIs for video games. In games you do not always want the AI to perform so well that no human can win. On the other side you do not want an AI that is too bad either. I believe that answering the research question may help provide a tool that can make it easier to quickly prototype AIs for games. Other people, as seen in some research papers, have for the most part created highly specialized solutions when combining BTs with genetic algorithms or machine learning. They leave some questions unanswered in favor of potential future work. The full capabilities of using a NEAT therefore remains unknown. Existing research has shown that genetic algorithms can be used for creating an AI for certain types of games, such as RTS games. However, in the RTS game mentioned the agent was given complete information about the world. Allowing an agent to have full information is in many ways like cheating, and something that should be avoided if possible. On the plus side though, by having complete information it can be easier to filter out redundant information in order to deal with the curse of dimensionality.

Resources:

I have previous code for a NEAT (https://github.com/marcus1337/NEAT) that I can continue to work on. There seems to exist free and open source solutions for BTs on the internet.

I also got myself the book "Behavior Trees in Robotics and AI" by Michele Colledanchise and Petter Ă–gren. The book brings up different topics related to particularly BTs, but also machine learning. It would have been useful if the book had more pages, but by having it being short it makes it more likely that I will actually read all of it. The book have many visalizations and images (in color) that makes it easier to understand the topics than if it only had contained raw text.



Current Progress:

There were some apparent bugs in my existing NEAT code so I have spent the last few days fixing those and also refactored the code a lot. The code obviously use heap memory way more than what's necessary. What I will work on in the coming days is to make even more refactoring to the code in order to reduce calls to "new" and "delete" in C++. Not that I do that manually, but the STL data-structures does. A funny solution that I have implemented to mitigate this so far is that I have tried switching out the standard memory allocator to a better one called minimalloc (https://github.com/microsoft/mimalloc).

The goal of my NEAT code is to train neural networks. I tested the speed of training 50 neural networks 10 times by measuring the time in milliseconds with std::chrono. From the tests one can see improvements when the code is run in release mode before, and after minimalloc is used below;

[1] [2] [1] [2] 840 739 585 551 799 714 538 665 1009 745 615 819 1740 1195 768 973 3557 2240 1369 1325 8942 4614 1983 2240 23994 11048 1273 4947 <stop> 38007 2342 5246 <stop> <stop> 10940 13440

Here the columns 1,2 means test run 1 & 2. Each row is one function call consisting of 50*10 mutations as well as 50*10 quick random fitness-value assignments to the NEATs. Judging from these tests, minimalloc increase the speed of the current code-base more than 100%. As mentioned before, the code can still be optimized further, which is something that I am working on. Using the double-buffer pattern for creating new generations seemed to improve the speed a little bit. Hopefully, implementing two large arrays that are allocated only once will help even more. This is primarily because arrays improve cache-locality.