Merchant Navy’s Top 6 Lessons for Machine Learning
The scale at which they operate can teach you a lot about designing your ML Pipelines
Recently, I had the opportunity to talk to Capt. Ravi Budhwar, a captain in the Merchant Navy. Captain Budhwar has spent many years at sea and is an expert in the intricacies of maritime trade and shipping. He is currently sailing with Fleetship Management, a leader in the space. He has also won awards throughout his career. Talking to him was very insightful because, as a land dweller, it is easy to forget about shipping. However, this is a crucial industry. In fact, the global merchant fleet in 2020 shipped 2 billion deadweight tons. Without shipping, modern supply chains would not exist.
As Captain Ravi explained the scale of the shipping industry, I was surprised. A middle-sized ship can weigh over 90,000 tons and is on the water constantly for months on end. The crew on board has to deal with challenges such as rough weather, a constantly changing seascape, 24-hour navigation, and changing time zones. If the ship has problems mid-voyage, repairs can be challenging. Most impressive is how they manage all this with small crews (the average crew on these ships is between 20 and 30 people).
The conversation about the scale of his operations and the efficiency of his crew had me very interested in the systems that made this possible. I wanted to know how the ships and the industry had created systems to solve such complex problems. The ships have several fantastic lessons in organization and pipeline design that we can use for our Machine Learning Systems. I will go over the ones I found most interesting.
Clearly Defined Responsibility + Priorities
One of the most striking features of a ship is that every member of the crew knows exactly what they have to do. Each crew member is allocated a specific shift, and their duties are written down in painstaking detail. This extends even to emergency responsibilities. The high level of detail has many benefits. The most obvious is in cases of emergencies, where instead of wasting time, the crew can spring into action immediately.
Even on the most mundane days, this plays a crucial role. Since everyone knows exactly what they need to do at different times, there is very little time wasted in idling. The tasks are done like clockwork, in a timely and effective manner. This also leads to increased accountability. If a task isn’t done, the crew knows exactly who was responsible and this makes diagnosing the problem much faster.
This is something that a lot of organizations could use. In my experience, management often has half-baked ideas for projects. A lot of time is wasted retrieving datasets (no one knows which ones will be relevant), ironing out the details with stakeholders, and redoing entire segments because requirements suddenly change. I once spent a whole week waiting to learn what I had to do. A more rigorous division of labor, with clearly defined roles and responsibilities, would serve companies and organizations well.
Following on from the last point, I noticed that the Merchant Navy is... a little extreme about documenting things. Every day, all the happenings and weather conditions are recorded in a logbook. On top of this, the crew compiles all the work done in the day and sends a presentation to the captain. There are also separate weekly and monthly reports.
This is still somewhat sane. The navy, however, is slightly neurotic about the documentation. Every door is labeled. You can tell what room you’ll enter, before even entering it. The elevators have a directory of all the floors, so you can figure out which floor you need to go to. There are huge files to keep note of every little decision made on the ship.
Fortunately, we don’t have to be this militant to reap the benefits. Sensible documentation covering the projects, the data sources used, the features selected and dropped, the protocols used, etc. is enough. Unfortunately, most firms (including big names) need to invest a lot more in documentation. Otherwise, a lot of time is wasted redoing procedures and tests already completed by the previous team. This is time spent reinventing the wheel. Some upfront investment in documentation will save far more resources in the future.
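As a rough illustration of what a "sensible" logbook entry could look like, here is a minimal sketch using only the Python standard library. The class name, field names, and example values (ExperimentRecord, churn-model, crm_export.csv, and so on) are all hypothetical and chosen for illustration; adapt them to whatever your team actually needs to record.

```python
import json
from dataclasses import dataclass, asdict
from datetime import date


@dataclass
class ExperimentRecord:
    """One logbook entry for an ML experiment (all field names are illustrative)."""
    project: str
    run_date: str
    data_sources: list
    features_selected: list
    features_dropped: dict  # feature name -> reason it was dropped
    protocol: str
    notes: str = ""


def to_logbook_line(record: ExperimentRecord) -> str:
    """Serialize a record as one JSON line so it can be appended to a log file."""
    return json.dumps(asdict(record), sort_keys=True)


# Hypothetical example entry.
record = ExperimentRecord(
    project="churn-model",
    run_date=date(2022, 1, 15).isoformat(),
    data_sources=["crm_export.csv"],
    features_selected=["tenure_months", "monthly_spend"],
    features_dropped={"customer_id": "identifier, no predictive value"},
    protocol="5-fold cross-validation, MAE and RMSE",
)
line = to_logbook_line(record)
print(line)
```

Writing the reason next to every dropped feature is the part that tends to save the next team from repeating the same test.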
Planned Maintenance System
Looking at the scale of the ships, I was curious about one thing: “How do you stop things from breaking?” This is especially important for ships. Remember, each of these ships is at sea, where repairs are not easy. The size of these giants also means that towing them from deep waters would be a challenge (cargo ships easily dwarf aircraft carriers). And there’s the cost. A delay of a single day can cause a loss of millions of dollars.
There is a safety issue as well. A failure in the deep sea can put the crew in serious danger. Depending on the cargo, it can also harm the marine life of the surrounding area. Thus there is no margin for error.
So how does the crew of 20 maintain a ship of thousands of tons, intricately designed with thousands of pipes, drains, and other moving parts? Especially when some of these parts are bigger than the crew themselves (the engine in these ships is bigger than most houses). As a tech person, I was expecting tons of high-tech monitoring systems and AI monitoring. The solution was much simpler.
While the team does have tons of monitoring equipment, the secret lies in the proactive approach taken by the crew. Instead of waiting for signs that something is wrong, the team does regular checks of the parts, regardless of how they are functioning. These checks, called the Planned Maintenance System (PMS), are planned well in advance. The precise division of labor allows the crew to work fast and check equipment much bigger than they are. PMS allows long-distance shipping to be safe and cost-effective.
To quote a cliche, prevention is better than cure. This holds true for Machine Learning as well. That is why integrating multiple error metrics and monitoring for data drift are crucial in your own projects. These steps are ignored by most people.
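To make the idea of a scheduled check concrete, here is a minimal sketch in plain Python. It reports three error metrics side by side and adds a very simple drift signal: how far the mean of a live feature has moved, measured in standard deviations of the reference (training) data. The function names, the threshold of 2.0, and the sample numbers are assumptions made for illustration. The mean-shift score is just one easy choice, not a standard the article prescribes, and production systems often use richer drift tests.

```python
import math
from statistics import mean, stdev


def mae(y_true, y_pred):
    """Mean absolute error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def rmse(y_true, y_pred):
    """Root mean squared error (punishes large misses more than MAE)."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def max_error(y_true, y_pred):
    """Worst single miss."""
    return max(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_shift_score(reference, live):
    """How many reference standard deviations the live mean has moved."""
    spread = stdev(reference)
    if spread == 0:
        return 0.0 if mean(live) == mean(reference) else math.inf
    return abs(mean(live) - mean(reference)) / spread


def scheduled_check(y_true, y_pred, reference, live, drift_threshold=2.0):
    """Run every metric on a schedule, whether or not anything looks wrong."""
    report = {
        "mae": mae(y_true, y_pred),
        "rmse": rmse(y_true, y_pred),
        "max_error": max_error(y_true, y_pred),
        "drift_score": mean_shift_score(reference, live),
    }
    report["drift_alert"] = report["drift_score"] > drift_threshold
    return report


# Made-up numbers purely for demonstration.
report = scheduled_check(
    y_true=[10, 12, 14, 16],
    y_pred=[11, 12, 13, 20],
    reference=[1.0, 2.0, 3.0, 4.0, 5.0],
    live=[7.0, 8.0, 9.0],
)
print(report)
```

In this toy run the MAE looks modest while the maximum error and the drift alert both flag trouble, which is the reason to look at more than one number.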
Safety is the first priority for all sailors. This shows itself in various ways, from Stop Works (any crew member can stop their work if they find it unsafe) to the insistence on extreme attention to detail. The most glaring example of this is the redundancy built into the ships.
Over long distances, an error of one degree can throw you off course by thousands of miles. This is why the ships have multiple radars in use at any moment. Crew members use the readings from all of these to fine-tune their navigation. They also have different kinds of compasses and, if all else fails, tools to navigate using the stars.
Each ship also has 2 lifeboats. The total capacity of these lifeboats is mandated (by international law) to be 200% of the ship's capacity. In case of power failure, the ship has multiple generators (and backups to them).
How can you apply this to Machine Learning? First thing: cross-validate everything. Any time you use an ML model, you need to use cross-validation with it. It’s simple, it’s effective, and it is seriously underutilized. After that, start testing for different scenarios. Machine Learning is quite fragile, and you should never call it quits after one configuration. The article “Why You Need to Spend More Time Evaluating your Machine Learning Models” presents more details about how seemingly arbitrary decisions in setting up your Deep Learning/AI models can significantly skew the results.
Along with integrating randomness and variance into your data, using a more diverse set of evaluation protocols will also allow your models to have a more diverse “perspective”. Writing better Machine Learning evaluation protocols (Regression) gives the code skeleton you could use.
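For readers who want to see the redundancy idea in code, here is a minimal sketch of k-fold cross-validation repeated over several random seeds, written with only the Python standard library. The model is a deliberately simple one-variable least-squares line, and the data is synthetic. Both are stand-ins chosen for illustration, as are all the function names, and this is not the skeleton from the linked article.

```python
import random
from statistics import mean, stdev


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, y_bar - slope * x_bar


def k_fold_indices(n, k, seed):
    """Shuffle the indices with a seed, then deal them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(xs, ys, k=5, seed=0):
    """Return the held-out mean absolute error of each of the k folds."""
    scores = []
    for fold in k_fold_indices(len(xs), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mean(abs(ys[i] - (slope * xs[i] + intercept)) for i in fold))
    return scores


def repeated_cv(xs, ys, k=5, seeds=(0, 1, 2)):
    """Repeat k-fold CV with different shuffles; report mean and spread."""
    all_scores = [s for seed in seeds for s in cross_validate(xs, ys, k, seed)]
    return mean(all_scores), stdev(all_scores)


# Synthetic data: a noisy line, generated only for this demonstration.
rng = random.Random(42)
xs = [float(i) for i in range(50)]
ys = [3.0 * x + 2.0 + rng.gauss(0, 1.0) for x in xs]

cv_mean, cv_spread = repeated_cv(xs, ys)
print(f"MAE {cv_mean:.3f} +/- {cv_spread:.3f}")
```

Reporting the spread alongside the mean shows how much a result depends on the particular split, which a single train/test split would hide.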
While it may be tempting to rush towards testing models, make sure you cover your bases first. A good model might improve your performance by 5%, but if you have to retrain and rebuild constantly, your system will lose money. It’s not sexy to spend a lot of time on cross-validation and multiple splits, but it is crucial.
Money lies in scale
We have spent a lot of time talking about how big the shipping industry is. The modern world would not exist in its current form without large-scale shipping. The size and scale of these ships make them very valuable. Filled with cargo, a medium-sized ship is worth over a billion dollars. There are 20 countries with a lower GDP than one of these ships. This is also reflected in the compensation of the crew.
When you’re trying to work in Machine Learning, remember this. Your solutions won’t be truly valuable till they can operate at scale. When I was working with ICICI Bank, our dataset added 10 million samples every day. My work with ForeOptics was on a global scale (supply chain analysis). Johns Hopkins University required me to evaluate the health policy of an entire state.
The reason I stress robustness, generalization, and cost-effectiveness is precisely this. When working with very complex, high-dimensional data and/or tons of samples, using complex techniques will spiral your costs out of control. As shown many times throughout my articles and videos, simpler low-cost techniques will provide the best ROI. Google can afford to sink 10,000 training hours on multiple servers. Your group can’t.
A captain can do everything (doctor, navigation, etc.)
This was probably the most surprising aspect. Captain Ravi told me that the ship captains also double up as Medical Professionals, Navigators, and whatever else is needed. They have the authority to sign off on birth, death, and marriage certifications. They also have to deal with customs and port authorities. They have ultimate authority (and responsibility) over the ship. If one of the crew members has any kind of problem (including technical issues), the captain steps in for them.
To do their job well, captains must be able to do everything well. This makes captains like Capt. Ravi real-life heroes, masters of multiple skills. It allows captains not only to step in when needed but also to truly understand the challenges faced by their crew. A well-rounded skill set also allows captains to anticipate problems and be proactive in creating solutions.
This approach works very well in tech. There is often a huge disconnect between management and developers, which leads to unreasonable expectations and very unclear instructions. There are only losers in such an arrangement. A manager with technical skills in a large scope of the project will be much better for leading. This doesn’t even require expertise in the domains, just a solid understanding.
This is also why I talk about the importance of Math, Programming, and Computer Science fundamentals in my article, How to learn Machine Learning in 2022. All of these will give you perspectives that are crucial to ML (Machine Learning/Deep Learning/Artificial Intelligence are intersections of these fields). If you’re looking to develop your skills in the last 2 aspects, check out Coding Interviews Made Simple. It’s a weekly coding newsletter created by me. It is a proven way to use the discoveries I made from tutoring to help you easily boost your performance. Learn more about it here.
Developing your skills in a variety of domains will allow you to connect the dots and exponentially improve your outcomes. Remember, Machine Learning is mostly a problem of decisions, and knowing more improves your decision-making.
The article is already running long, so I will stop here. Of course, this article doesn’t even scratch the surface of the various complexities of the shipping business and all there is to learn there. To learn more about that, feel free to reach out to Capt. Ravi Budhwar. He is an experienced mariner who has mentored many sailors, both personally and professionally. He is also involved with The Plenum School, an educational institution revolutionizing education. He is always happy to talk to interested parties.
If you liked this article, check out my other content. I post regularly on Medium, YouTube, Twitter, and Substack (all linked below). I focus on Artificial Intelligence, Machine Learning, Technology, and Software Development. If you’re preparing for coding interviews check out: Coding Interviews Made Simple, my free weekly newsletter.
To help me write better articles and understand you, please fill out this survey (anonymous). It will take 3 minutes at most and allow me to improve the quality of my work.
Feel free to reach out if you have any interesting jobs/projects/ideas for me as well. Always happy to hear you out.
To support my work financially, you can use my Venmo and PayPal. Any amount is appreciated and helps a lot. Donations unlock exclusive content such as paper analysis, special code, consultations, and specific coaching:
Reach out to me
You can reach out to me on any of the platforms, or check out any of my other content. If you’d like to discuss tutoring, text me on LinkedIn, IG, or Twitter. Check out the free Robinhood referral link. We both get a free stock (you don’t have to put any money), and there is no risk to you. So not using it is just losing free money.
Check out my other articles on Medium: https://rb.gy/zn1aiu
My YouTube: https://rb.gy/88iwdd
Reach out to me on LinkedIn. Let’s connect: https://rb.gy/m5ok2y
My Instagram: https://rb.gy/gmvuy9
My Twitter: https://twitter.com/Machine01776819
If you’re preparing for coding/technical interviews: https://codinginterviewsmadesimple.substack.com/
Get a free stock on Robinhood: https://join.robinhood.com/fnud75