Here I present some of the self-study mini-projects I’ve set myself in an effort to grasp some machine learning concepts. I’ve written a few papers on them, and there’s even some code to look at if you like. But if you’ve never been here before, you might want to check out a few health warnings.
Applying n-step Sarsa to the cart-pole balance problem
A classic machine learning problem where the learner is trying to keep a pole upright.
The animation above is the actual result from one experiment. Have a read of the PDF paper to get a feel for what’s going on, and if you’re feeling really adventurous you can download the zipped Excel file “Cart pole learner”, check out the code in the macros, and even run the learner if you like.
A couple of things to be aware of:
- Everything I learnt about the underlying reinforcement learning principles applied here, and about the n-step Sarsa algorithm, came from the excellent Sutton & Barto book, Reinforcement Learning: An Introduction, shown here with my files. If nothing else, the first chapter gives a great intro to what reinforcement learning is all about.
- I don’t really give you many clues as to what’s going on in the Excel file, so don’t expect to have much of an idea without some serious commitment. That’s my fault (lack of time). But maybe you’ll be able to get the gist of how the core algorithm is being implemented…
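For a rough flavour of the core idea, here is a minimal Python sketch of the n-step Sarsa target and update as described in Sutton & Barto. To be clear, this is not the code from the Excel macros: the function names, parameters and numbers are all made up for illustration, and a real learner would also need the cart-pole simulation and an action-selection policy around it.

```python
def n_step_return(rewards, bootstrap_q, gamma):
    """n-step Sarsa target (hypothetical helper, for illustration only).

    rewards     -- the n rewards actually received after taking the action
    bootstrap_q -- current estimate Q(s, a) for the state-action pair reached
                   n steps later (pass 0.0 if the episode ended before then)
    gamma       -- discount factor between 0 and 1
    """
    g = 0.0
    for i, r in enumerate(rewards):
        g += (gamma ** i) * r  # discount each real reward by how late it arrived
    # Fill in the rest of the future with the learner's own current estimate.
    return g + (gamma ** len(rewards)) * bootstrap_q


def sarsa_update(q_old, target, alpha):
    """Move the old estimate a fraction alpha of the way towards the target."""
    return q_old + alpha * (target - q_old)


# Illustrative numbers only: three steps with reward 1 each, then bootstrap.
target = n_step_return([1.0, 1.0, 1.0], bootstrap_q=10.0, gamma=0.5)
new_q = sarsa_update(q_old=2.0, target=target, alpha=0.5)
print(target, new_q)
```

The point of the n steps is that the learner waits to see several real rewards before leaning on its own estimate, which is a middle ground between one-step Sarsa and waiting for the whole episode to finish.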
Control algorithm for a simple robot arm using an evolutionary approach
Investigating the multi-layer perceptron
Optimum path problem using an evolutionary algorithm and a CTRNN
Training an automated tic-tac-toe player using reinforcement learning
- When I set myself all these mini-projects it was never with the goal of presenting them to anyone. As a result, I haven’t made it easy for you to follow what I was up to, nor unfortunately do I have the time to do that. Everything I did in terms of documenting my approach and results was only:
  - A mechanism to help me get things straight in my own head; and
  - Intended as an aide-mémoire for me (albeit a clear and relatively extensive one).
- Similarly, I never intended anyone else to use the code I wrote, so whatever code is in here may well be difficult to follow. I simply haven’t had time to document it to the point where others can immediately grasp what’s going on.
- Even if I had documented the code in such a way, it would still be difficult to follow! That is, unless you have a reasonably firm grasp of the kind of machine learning algorithm that’s being implemented.
- There be maths in here. Quite a lot; reasonably advanced.
What’s the point of presenting anything at all then? Well, first, this is my showcase. If I’m telling people I’ve done some fooling around with machine learning, then I need to back that up with something. This is the place to do that. And second, even if you have zero knowledge, I hope you might at least get a flavour of what’s going on…