Plan optimal trips automatically with Python and Operations Research: from whiteboard to dashboard

A step-by-step guide to framing, modeling, analyzing and solving tourism logistics problems by applying optimization and artificial intelligence, the agile way

17 min readMay 9, 2023

Image generated by DALL·E 3 with author’s prompt: “Advanced optimization of touristic trips with data analytics and AI”

👁️ This is the first article in a series covering the project “An Intelligent Decision Support System for Tourism in Python”. Reading this series will take you on a hands-on journey during which you will learn to design and implement a software solution—based on optimization models — that addresses a complex tourism planning problem. By framing it as an Operations Research project and approaching it from the lens of agile software development, we’ll solve the problem incrementally, in a series of sprints. Each sprint will be covered in a separate article, with every article covering the creation of a “component” of the solution.

1. The trip that sparked a journey

1.1 The inception: The holidays that got me working
1.2 The problem: One-time-only plannings do not exist
1.3 The solution: Automate decision making

2. Sprint 1: Framing the problem as an Operations Research project

2.1 Making things as simple as possible, and then simpler
2.2 Identifying the Minimum Valuable Problem
2.3 Problem-solving from first principles

3. The project timeline

3.1 Sprint 2: Build a model of the Traveling Salesman Problem (TSP)
3.2 Sprint 3: Code the TSP model in Python and analyze its results
3.3 Sprint 4: Generalize the distance data generation for an arbitrary set of sites
3.4 Sprint 5: Build an optimizer class that can be fitted to location data to solve TSPs automagically
3.5 Sprint 6: Visualizing Routes on Interactive Maps with Python: Part 1
3.6 Sprint 7, 8, 9, … k: In the making… subscribe to get an e-mail when it’s out! 📬

4. Conclusion (or planning for next sprint)

Manifesto

The true purpose of this series is not to help you plan your trips more efficiently — although that’s an unavoidable side benefit of building a software system to do that. The real purpose is to guide you through the whole process of building an optimization-based solution from the ground up, so you can learn the craft and apply it elsewhere. I chose a “trip planning” problem because, as it’s a ubiquitous problem that every person who has ever traveled has faced, I hope you will relate to it and find personal value in it.

What can you expect from reading this series? My attempt, at least, is that you learn these three main skills in the “same package”:

Analytical thinking: recognize the essential characteristics of a problem, reason critically, posit simplifying assumptions and verify their validity, do cost-benefit analyses to evaluate alternatives, do incremental modeling, etc. In general, apply the scientific method to a seemingly mundane problem that doesn’t seem scientific at first glance.
Model building in Pyomo: bring all that cognitive work to life by designing and implementing optimization models with the library Pyomo, also embedding them in a Python application that takes care of everything the models need: collecting the raw data, displaying the final results in a friendly dashboard, and everything in between.
Pragmatic agile development: learn to decompose problems into multiple subproblems that can be approached with more confidence and less vagueness. Design a component to solve each subproblem without losing sight of the system as a whole (systems thinking), and integrate the components gradually so that the whole system remains useful at all times. Divide and conquer, but then unite and prosper.

If you read (and code) along, you will experience the flow of a real-life data science project, learning to solve problems effectively with the powerful tool of optimization modeling under your belt. And, as a bonus, you’ll make better trip plans. I hope you enjoy the ride! 😊

1. The trip that sparked a journey

1.1. The inception: The holidays that got me working

A few months ago, my girlfriend and I were planning a trip to Paris. We knew in advance that we were going to spend 5 days there, so it was just a matter of choosing the places to visit during those days. After consulting some travel blogs, some YouTube videos on “what to do in Paris”, and some googling around, we arrived at a more or less definite list of places we wanted to visit. But, the job was not finished yet, for we needed to also decide which places to visit each day, and in which order.

So my girlfriend asked me to take on that task. As she had done the majority of the “research” about what is worth visiting in Paris, now it was my job to come up with a “more precise plan” for all those places worth visiting. In order to design a reasonable plan for the trip, I had to do my own research and take into account many things: how far apart each site was from our hotel, how far apart sites were from each other, what the opening hours were for some places like museums or churches, what the average prices were for some restaurants, … you get the idea. But the more I looked into it, the harder it got: apart from the obvious distance- or money-related limitations, there were also time-related restrictions, like “a breakfast at restaurant X must come before cocktails at pub Y”. So, as you can notice, the problem was getting pretty big. Of course, I used Google Maps extensively to compile all the necessary information about all the sites. I even considered dynamic information like the different travel times of different transportation modes, like subway, bus, or walking, to see which was the best way to get from A to B.

But during that search, I did not just compile information, I also made some decisions, many of them unconsciously, that were specific to the information I was gathering about the fixed list of sites we had. Google Maps helped me form a rough mental picture of the urban layout of Paris, and with that “picture” in mind, I was making decisions on the go. These decisions were “strongly coupled” with the information I was seeing, and once made, they were made. I hardly revisited them after gathering more data, and even if I had, the decisions would still be based on informed gut feeling. Informed, yes, but gut feeling after all.

Even though I knew the decisions I was making “on the go” were absolutely dependent on the particular sites we had picked, I didn’t think it was a problem. “A perfect plan doesn’t exist, so it’s okay if the final trip schedule was not very robust”, I thought. I just needed to do some research, craft some reasonable trip plan, and then stick to it. Right?

1.2. The problem: one-time-only plannings do not exist

When I finally finished the trip plan, I presented it to my girlfriend to get the final approval. I discovered I was late. Quite unexpectedly, she had been exploring what Paris had to offer a bit further, and lots of new, interesting places popped up. Some of the new cool sites she discovered had climbed up to the top of our fixed list, displacing others that were no longer necessary (but which appeared in my plan). Sigh. The schedules I had prepared now contained obsolete places that we no longer needed to visit, and lacked some other places that we just now had to visit.

I went back to work. Not all was lost, true, but those new “unexpected changes” forced me to go over things again, repeat work, and invest more time than I had initially thought. My programmer instinct started whispering: “There has to be a better way...”, but it was simply too early to attempt to solve this problem with software. Besides, this was a one-time plan I had to do for our Paris trip, so doing it manually, even twice, was good enough.

Some hours later, my girlfriend told me that she had also been seeing many nice places for us to see on our next trip to Rome, Italy. That she had already written down a long list of candidate places to visit, restaurants to eat at, and attractions to go to. The geek inside me was triggered: designing a trip plan was going to be a recurring task. The time had come to automate all this work.

1.3. The solution: automate decision making

I needed a way to “generate a trip plan” automatically, as the previous situation would repeat itself for any future trip involving some minimum complexity. But it wasn’t just about automating tedious work, or saving time, it was also about not missing out on an awesome trip just because the effort required to plan it manually would be too high. Thinking about many aspects at once is hard, so we all have this tendency to settle for “good-enough” plans. And I didn’t want this tendency, up to this moment unavoidable, to prevent us from discovering prettier places, cheaper deals, or better trips in general. What’s more, I didn’t want to go through the same planning process all over again whenever a new trip was due.

I wanted a solution that was both automatic and general so as to produce trip plans that were optimal, fast, and adaptable to changes in the data.

At first glance, I thought of throwing some machine learning into the problem. After all, it was a data problem. It sounded nice, but the moment I started digging into it, I noticed there was nothing to predict per se, nothing to infer a pattern from. So ML discarded.

My next thought was to go old-school and use a rule-based program. But when it comes to adding complex, intertwined requirements, such a rule-based program would pretty quickly become complicated and hard to modify. There was also the risk of the code becoming too coupled with the peculiar requirements of any given city or trip, so I dismissed that approach, too.

Then it hit me:

“Wait a minute! This problem is all about planning. It’s about what decisions I should make, given the things I know, and the resources and desires I have. This is an optimization problem!”

2. Sprint 1: Framing the problem as an Operations Research project

2.1. Making things as simple as possible, and then simpler

At this point, the problem that I had in my head was too complex, and too vaguely defined, to be implemented successfully in a program at one shot. I knew what information I could use, and I kind of knew what my goals were. The needed information was available because I had been compiling it during the “Google Maps research phase”, but my goals were just abstract aspirations like “maximize the total value of the trip”, “use money as wisely as possible”, or “visit the best sites without spending too much time in transportation”, etc. This level of description was too vague to be translated into mathematical expressions. Concepts like “value”, “wisely”, “best” or “too much” had to be defined in concrete terms before they could be expressible in mathematical terms. But there was a deeper obstacle: the problem was too complex to begin with. I needed to simplify it to be able to start thinking of a workable solution.

2.2. Identifying the Minimum Valuable Problem

The key question to ask ourselves is:

What is the simplest problem I can find, that relates to my actual problem in some relevant way, and that, if solved, would provably take me closer to the solution of my actual problem? The answer is the so-called Minimum Valuable Problem. Once we find such an essentialist version of our problem, we can start working on firm grounds. We can start designing a minimum viable product that will be also valuable.

To plan the whole schedule for the trip to Paris, these are the “inputs” I had to work with:

A list of sites that we wanted to visit during the trip.
The hotel we were going to stay at for the duration of the trip.
The number of days that the trip was going to last.

Here, a “site” is the general name for any place or thing that we would like to go to. Apart from these 3 things, all other information I had was derived from these basic inputs, such as the distance between sites, the average price of eating at a certain restaurant, the opening hours of a certain museum, etc. But this is all derived information that comes from the basic inputs. Considering all this data (distances, prices, time periods, etc.) at once doesn’t render our problem essential, so it wouldn’t be a minimum valuable problem. We need to find an even simpler problem to solve and build upon.

When I was doing the manual “routing and planning”, I simplified the creation of a “trip schedule” by decomposing it into two steps:

First I chose a subset of all the sites in the list that we could feasibly visit in a day.
Then I decided in which order we should visit those chosen sites, according to some sensible criteria.

The criteria I used for choosing the subset of sites was “geographic proximity”. Intuitively, this made sense: if some sites in the list happened to be sort of “clustered” in a certain area, it was reasonable to go to that area and spend the day visiting those sites, then return to the hotel, and repeat the next day for another “cluster” of sites. So I would input the whole list of sites into Google Maps and, just by looking at the arrangement of points in the map, try to categorize each site in a different cluster based on proximity. Once I had done that, I would take only the sites of a given group, and try to come up, visually again, with a reasonable route for that day. The route had to satisfy these two “must-have” requirements:

We had to visit each site only once.
We had to return to the same place where we started: the hotel.

And, ideally, the route should satisfy these “nice-to-have” requirements as well:

We wanted to spend as little time as possible going from site to site overall.
We wanted to end at a site close enough to a subway station that made it convenient to get back to the hotel.

Stated like this, the problem is much simpler. But there are two sources of complexity that make the problem still hard: one is the need for the final site to be “close enough to a convenient” subway station. The other is that there exist several “transportation modes” (subway, taxi, train, walking) we could choose from to move around. So to simplify the problem further, we need to cut out some “details of the problem”. This is done by introducing (reasonable) approximations.

The first approximation is to get rid of the “subway station requirement”, as it is hard to define explicitly, and the second is to limit ourselves to only one mode of transportation. We’ll select “walking”, as it’s possible to go anywhere in a city just by walking. Besides, choosing walking as the “mode of transport” simplifies the problem even further, as the goal of “spending as little time as possible in transportation” becomes now equivalent to the goal of “traversing as little distance as possible”. This equivalence rests on the assumption that we maintain a constant speed when walking from one site to the next. But thankfully, this is a reasonable assumption (with minor exceptions right after lunch).

Now the problem got much more manageable. It has a clear goal, “walk as little distance as possible”, and a set of precise and simple requirements to be satisfied: visit all sites only once and return to the original departure site. This is our minimum valuable problem.

The next step is to design an MVP, or Minimum Viable Product, which is nothing more than a program tailored to solve our minimum valuable problem. Where should we start?

2.3. Problem-solving from first principles

If you look around the internet for clues on this problem for a little bit, you will discover that there exists an exact replica of our “MVP problem” in the field of Operations Research (OR). It is the well-known Traveling Salesman Problem (TSP). Because of this, you can find online several (different) mathematical formulations for it. We could just pick one, and implement it at face value to solve our problem, yes, but we won’t. I’ll explain why.

Some of the formulations you can find online for the TSP are harder to interpret than others, but most, if not all, you will find as prepackaged models, without any explanation of how those models came to be, or what modeling decisions were made and why during the formulation process. If you were to blindly implement any of those formulations “as-is”, it would work, granted. But soon afterward, if your problem changed, or you wanted to improve the solution, you would be empty-handed and wouldn’t know how to start. This is not desirable, because in this article series I will extend this proof-of-concept way beyond the Traveling Salesman Problem, so I need you to follow along when we carry out all those model extensions.

Using a “pre-packaged” model is of no help if you don’t understand what “the math says”. But it is only by understanding what the math says that you will be able to adapt or expand a prepackaged model to suit your unique needs. Needs that always arise in real life.

That is why I will not just provide you, out of a hat, with a complete mathematical model already cooked in its final form. That would be easy for me, but useless for you.

💡 There is no royal road to optimization modeling
The purpose of the whole article series is teaching you to do modeling, not giving you a pre-packaged model for you to plug-and-play. Of course, the final model(s) will be there, available for people in a hurry, but I take my time to explain my thought-process, as the reader I have in mind is one seeking to develop the skill of modeling. Therefore, all the modeling decisions we’ll make will be done from first principles. Every step will be properly explained, with no leaps of faith that leave you confused, nor any unreasoned claims that make you doubtful. That way, you will gradually—but surely — get familiar with the patterns, approaches and strategies involved in modeling, and you’ll develop the intuitions that will equip you to tackle any problem, not just the ones covered in this article series.

I want you to see modeling in the making, and cultivate critical thinking in you, my dear reader, during the process. That is why I will be reasoning things out loud (with all my wetware limitations) while we do problem-solving together. Even if this is your first exposure to mathematical modeling or optimization in general, you will be able to follow along and learn the tactics. That’s the power of first principles reasoning.

I will have succeeded in my quest if, by the end of this whole series, you understand the thought processes that underlie all projects under the rubrics of “prescriptive analytics”, “optimization modeling”, “decision intelligence”, or any new fancy label that people use now to refer to this branch of Mathematics. And even if I don’t succeed in that front, at the very least, I hope I’m able to help you plan better trips for yourself, when you make use of the final models and interactive web application that this article series culminates in.

Let’s jump in.

3. The project timeline

In this article we have covered the “first stage” of the [Agile Operations Research (AOR) workflow]: the “Problem Description” stage. It is represented in the green circle below:

**Figure 1.** Minimalist workflow to problem-solving in OR. (Image by author)

But this is just the inception of the project. The next stages (and more not shown in the workflow above) will be illustrated in subsequent articles. Since this is a complex tourism problem we have chosen to crack here, the work involved in building gradually a solution for it is organized as a series of “sprints”, which correspond to different articles of this series:

Sprint 1: Description of the Minimum Valuable Problem for a tourist

You just completed it by reading this article.

Sprint 2: Build a model of the Traveling Salesman Problem (TSP)

In which a formulation of a mathematical model for the TSP, our minimum valuable problem, is explained from first principles and derived from scratch, which you can read at:

Modeling the Traveling Salesman Problem from first principles

A concepts-first, math-second approach to modeling the most well-known routing problem in Operations Research

towardsdatascience.com

Sprint 3: Code the TSP model in Python and analyze its results

The mathematical model developed in “sprint 2” is implemented in Python with Pyomo, then solved for some sample data of a set of sites in Paris. Afterward, the preliminary results are analyzed and (spoiler alert) a glitch is found with the formulation, which is then corrected, re-implemented, and verified. Check it out at:

Implementing, solving and visualizing the Traveling Salesman Problem with Python

Learn how to translate an optimization model from Math to Python, optimize it, and visualize the solution to gain quick…

towardsdatascience.com

Sprint 4: Generalize data generation for an arbitrary set of sites

Being now comfortable with the model we built, we shift our focus to the data collection process. Up to this moment, the needed data was given for a fixed list of sites, but as we want our prototype to be general, now we take charge of it and create a class that produces the distance data we need for an arbitrary list of sites. This is a key step towards having an MVP. You can see how it’s done at:

Compute the Distance Matrix of a Set of Sites from Their Coordinates in Python

Estimate the distance between any pair of sites from their geographical coordinates as a stepping stone to solving…

towardsdatascience.com

Sprint 5: Build an optimizer class that can be fitted to location data to solve TSPs automagically

With the two basic ingredients ready — a general optimization model and a mechanism to compute the distance matrix for arbitrary sites — we are in a position to assemble the two together into a powerful optimizer, an object that abstracts away the details and lets us solve TSPs in one-liners. Learn how it’s done and see the versatility of such an optimizer for yourself at:

A classy approach to solving Traveling Salesman Problems effectively with Python

Implementing the TSP model in a scikit-learn-like fashion to ease the building and solving of routing optimization…

towardsdatascience.com

Sprint 6: Visualizing Routes on Interactive Maps with Python: Part 1

The TSP optimizer is now ready, and it produces the results we want, but we can’t consider the MVP ready without a way to visualize in a meaningful manner the results produced by it. Displaying the optimal tours produced by the optimizer is essential for final verification and assessment of the results, something which is explained, step-by-step, at:

Visualizing Routes on Interactive Maps with Python: Part 1

A pragmatic guide to interactive data visualization for transportation problems with Folium

towardsdatascience.com

Sprint 7: Modeling the Traveling Salesman Problem with Optional Visits in Python

Having an MVP that generates optimal tours for travelers wishing to visit some sites, the time is due to add a more realistic functionality: in real life, travelers cannot possibly visit all the sites they desire to, and we need to take that fact into account when modeling the problem. In this article, we extend the TSP model to permit optional visits in a context where the total tour time is limited, while still producing minimal-distance tours that visit as many sites as possible, something closer to the traveler’s actual plans.

Coming 🔜…

Sprints 8, 9, 10, …, k: [🔐Classified (in the backlog, really)]

Follow and subscribe to not miss the upcoming “releases”! 📬😉

4. Conclusion (or planning for next sprint)

As you now know, this stage of adequately describing the “business problem” is crucial for setting up a good start. We have taken a real-life problem, with all of its nuances and complexities, and we have described it precisely. Then, we derived a simplified version from it that we could tackle, the TSP, by cutting out unneeded difficulties. But these difficulties were superfluous only at the beginning. They will return, gradually, as we expand our proof-of-concept to encompass ever-increasing details of the original real-life problem. This gradual expansion of the proof-of-concept is covered in the next articles of the series. Along those articles, we will be breaking down hard things into not-so-hard things, solving them, and putting them back together into systems that solve the initial hard things. Having said that, the wisest step we can take right now into our adjacent possible is modeling the Traveling Salesman Problem for our specific Paris sites (and then solving it). Stay tuned!

Modeling the Traveling Salesman Problem from first principles

A concepts-first, math-second approach to modeling the most well-known routing problem in Operations Research

towardsdatascience.com

Thanks for reading! 😊 Please feel free to follow me, ask me questions, give me feedback in the comments, or contact me on LinkedIn.

Plan optimal trips automatically with Python and Operations Research: from whiteboard to dashboard

A step-by-step guide to framing, modeling, analyzing and solving tourism logistics problems by applying optimization and artificial intelligence, the agile way

Table of contents

Manifesto

1. The trip that sparked a journey

2. Sprint 1: Framing the problem as an Operations Research project

3. The project timeline

4. Conclusion (or planning for next sprint)

Manifesto

1. The trip that sparked a journey

1.1. The inception: The holidays that got me working

1.2. The problem: one-time-only plannings do not exist

1.3. The solution: automate decision making

2. Sprint 1: Framing the problem as an Operations Research project

2.1. Making things as simple as possible, and then simpler

2.2. Identifying the Minimum Valuable Problem

2.3. Problem-solving from first principles

3. The project timeline

Sprint 1: Description of the Minimum Valuable Problem for a tourist

Sprint 2: Build a model of the Traveling Salesman Problem (TSP)

Modeling the Traveling Salesman Problem from first principles

A concepts-first, math-second approach to modeling the most well-known routing problem in Operations Research

Sprint 3: Code the TSP model in Python and analyze its results

Implementing, solving and visualizing the Traveling Salesman Problem with Python

Learn how to translate an optimization model from Math to Python, optimize it, and visualize the solution to gain quick…

Sprint 4: Generalize data generation for an arbitrary set of sites

Compute the Distance Matrix of a Set of Sites from Their Coordinates in Python

Estimate the distance between any pair of sites from their geographical coordinates as a stepping stone to solving…

Sprint 5: Build an optimizer class that can be fitted to location data to solve TSPs automagically

A classy approach to solving Traveling Salesman Problems effectively with Python

Implementing the TSP model in a scikit-learn-like fashion to ease the building and solving of routing optimization…

Sprint 6: Visualizing Routes on Interactive Maps with Python: Part 1

Visualizing Routes on Interactive Maps with Python: Part 1

A pragmatic guide to interactive data visualization for transportation problems with Folium

Sprint 7: Modeling the Traveling Salesman Problem with Optional Visits in Python

Sprints 8, 9, 10, …, k: [🔐Classified (in the backlog, really)]

4. Conclusion (or planning for next sprint)

Modeling the Traveling Salesman Problem from first principles

A concepts-first, math-second approach to modeling the most well-known routing problem in Operations Research

Written by Carlos J. Uribe