Unfortunately, this post is simply a terse review of the DFO algorithm; I have a busy schedule this week, so I will only highlight some of the key features of applying it to various error surfaces.
If you want an introduction to DFO, I suggest you read the second post in the series (again, the link is above). Essentially, studying an algorithm like DFO is really interesting because it is about creating incredibly simple agents that interact with one another, and from that interaction incredibly complex behaviour arises that can be applied to solve many types of problems.
Below, blink and you’ll miss it, is DFO quickly finding the error surface’s global optimum. Here, DFO is maximising the problem surface, denoted as reward in the graph’s title. The problem surface is at its maximum where it is red and at its minimum where it is blue. The population of flies is shown as small black dots, and the best fly of each iteration is the larger white dot.
What if the problem is harder?
Let us assume the problem surface is moving and multimodal. This presents a considerably harder challenge. If we apply DFO naively, in the same manner as above, we will experience sub-par performance, as we can see below.
Above, we can see that the reward for the current iteration of optimisation is rarely optimal, and furthermore the exploration of the surface is less than satisfactory. There are a few issues with the naive solution: the method of randomisation; the lack of elitism; and, most importantly, the method of updating the flies’ positions.
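To make the later modifications concrete, here is a minimal sketch of one naive DFO iteration as described above: each dimension of each fly is either randomly reset with some small probability (the disturbance threshold), or moved from its best ring neighbour towards the swarm-best fly. The function names, ring topology, and bounds are my own illustrative choices, not code from the post.

```python
import numpy as np

def naive_dfo_step(flies, fitness, delta=0.001, bounds=(-1.0, 1.0)):
    """One naive DFO iteration (maximising). Each dimension of each fly
    is either randomly reset with probability `delta`, or moved from its
    best ring neighbour towards the swarm-best fly."""
    n, d = flies.shape
    scores = np.apply_along_axis(fitness, 1, flies)
    sbest = flies[np.argmax(scores)]                 # swarm-best fly
    new = np.empty_like(flies)
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n       # ring topology
        nbest = flies[left] if scores[left] > scores[right] else flies[right]
        for k in range(d):
            if np.random.rand() < delta:             # per-dimension disturbance
                new[i, k] = np.random.uniform(*bounds)
            else:
                u = np.random.rand()                 # pull towards the swarm best
                new[i, k] = nbest[k] + u * (sbest[k] - flies[i, k])
    return np.clip(new, *bounds)
```

On a simple unimodal surface this converges quickly, which is exactly why it struggles on the moving, multimodal problem above.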
A minor improvement can be made by changing how a fly’s position coordinates are reset when a sample from the uniform distribution falls below the disturbance threshold. As it stands in the naive algorithm, each dimension of each fly’s position is independently updated or randomly reset, meaning the flies are often constrained to being reset along a vertical or horizontal line. You can see this around the border of the above gif. Changing the resetting method so that a fly’s entire position is randomised when a single uniform sample falls below the disturbance threshold enables a more random reset. It is then important to multiply the disturbance threshold by the number of dimensions — in our case two — to ensure that a comparable number of random resets is retained.
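A sketch of this whole-position reset, assuming the same disturbance threshold `delta` and search bounds as before (both illustrative names):

```python
import numpy as np

def disturb(flies, delta=0.001, bounds=(-1.0, 1.0)):
    """Whole-position reset: one uniform draw per fly; if it falls below
    delta scaled by the number of dimensions, the entire position is
    re-randomised, so resets are no longer pinned to axis-aligned lines."""
    flies = flies.copy()
    n, d = flies.shape
    reset = np.random.rand(n) < delta * d    # threshold scaled by dimensionality
    flies[reset] = np.random.uniform(*bounds, size=(int(reset.sum()), d))
    return flies
```

Scaling the threshold by the dimensionality keeps the expected number of reset coordinates comparable to the per-dimension scheme.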
The next improvement is to add some elitism. This rule simply states that the n best flies are neither updated nor reset, greedily ensuring we do not degrade our best-performing flies.
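Elitism is a one-liner on top of whatever update produced the proposed positions; a minimal sketch (the helper name and maximising convention are my assumptions):

```python
import numpy as np

def apply_elitism(old_flies, new_flies, scores, n_elite=2):
    """Keep the n_elite best flies from the previous generation untouched;
    every other fly takes its newly proposed position."""
    elite = np.argsort(scores)[-n_elite:]   # indices of the best flies (maximising)
    out = new_flies.copy()
    out[elite] = old_flies[elite]
    return out
```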
The final modification is the most important one. As it stands, for each fly, if we are not disturbing and randomising its position, we are updating it. The current update assigns the fly the position of its best neighbour and then moves it towards the globally best fly. This implicitly assumes a unimodal surface, which is not appropriate for our multimodal problem. We can fix this by instead moving the fly to its best neighbour and then continuing it along the same vector that leads from its original position to its best neighbour’s position.
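One reading of this modified rule, sketched under the same ring-topology assumption as before: the fly jumps to its best neighbour and then continues some random fraction further along the vector from its old position to that neighbour, with no pull towards the global best.

```python
import numpy as np

def local_update(flies, scores):
    """Multimodal-friendly update: each fly moves to its best ring
    neighbour and then continues along the vector from its old position
    to that neighbour; the global best exerts no pull."""
    n, _ = flies.shape
    new = np.empty_like(flies)
    for i in range(n):
        left, right = (i - 1) % n, (i + 1) % n
        nbest = flies[left] if scores[left] > scores[right] else flies[right]
        u = np.random.rand()
        new[i] = nbest + u * (nbest - flies[i])  # extend the same direction
    return new
```

Because each fly only ever chases local neighbours, separate clusters of flies can settle on separate modes instead of all collapsing onto a single peak.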
The result of the aforementioned improvements is the above — a much more effective tracking of the multimodal problem.
Thanks very much for reading, I hope to grab your attention again in the future!