# A traveling salesperson heuristic in NlogN time

## By repurposing a common machine learning algorithm, we can get a fast solution to a notoriously difficult problem.

A version of this blog post is available on my website.

# Is this heuristic worthwhile?

Though, instead of taking hours or days to complete, each were solved in less than 5 minutes. The time to complete each of these countries are plotted on the following chart:

This algorithm has not been compared to other NlogN algorithms (suggestions welcome); however, it has been compared to the Smallest Insertion (SI) algorithm, which take N² time. Running 1000 tests on random nodes when N=100, recursive clustering is better than SI more than 59 percent of the time, despite SI taking considerably longer.

My testing can be found in this Jupyter notebook.

# Software Implementation

`import recursive_clustering as rc# Create 100 random points and draw solution with K = 4rc.solve_random(N=100, K=4)# Given a numpy array, draw the solutionrc.solve_array(all_nodes, K=4)# given a two-column CSV of X and Y, draw the solutionrc.solve_file('testFiles/usa115475.csv', K=5, draw=False)`

All these functions return a dictionary, the tour length, and the wait time. The dictionary has the following structure:

`path_d = {     ID0: {          'center': [X, Y],          'connections': [ID#, ID#],          'subnodes': [[x, y]....[x ,y]],          'hasBeenSplit': True     },     ID1: ...}`

This dictionary contains all clusters and their subnodes. If you wanted to access only the final nodes, you’d look only for keys whose `subnodes` list has a length one or haven’t been split. Or you can go to the last ID and follow the connections in a circle. Suggestions for better implementations are welcome.

# Can we make it better?

Additionally, although I haven’t yet run any tests, this algorithm would theoretically run in NlogK(N) time on higher-dimensional TSP as well.

Questions? Did you find mistakes? Contact me and see more projects on my website.

Written by