Maritime Professionals: Top 5 Python Libraries

Most of my professional life was spent at sea. Here are the five Python libraries I find most useful when communicating concepts within the maritime industry.

Jordan Taylor
Shipping Intel
6 min read · Jun 27, 2024

Introduction

I love expressing maritime concepts programmatically. It brings together subject matter that I feel I have command over, and that may not have been fully unpacked by the broader tech community. Every time I sit down with my development environment I feel like an explorer seeing things for the first time.

For those who are new to coding and involved in the maritime industry, here are my top five third-party Python packages. I scanned the requirements of my past projects, and below are the packages that appeared most often:

#5: PyMuPDF

https://pymupdf.readthedocs.io/en/latest/#

PyMuPDF is an easy-to-use document handling tool. If you have a bunch of portable documents (PDFs) and you want to quickly search them in order to extract information, PyMuPDF is a good choice.

When I was on tankers, we had stacks of papers as part of our pre- and post-transfer meetings, scanned copies of which would eventually make their way to the home office. Some of the documents contained valuable commercial information. PyMuPDF is your tool to digitize and organize this information for further analysis.

What I like about PyMuPDF is how well the extracted text holds up in terms of spelling and grammar when working with maritime technical documents.

To start, install PyMuPDF. The package is installed as pymupdf and imported as fitz:

pip install pymupdf

As an example, I can take ASBATANKVOY (found using a cursory Google search) as a portable document (PDF) and save it to my local directory.

import fitz  # PyMuPDF is imported under the name fitz

def extract_text_from_pdf(pdf_path):
    """Return the text of every page in the PDF as one string."""
    document = fitz.open(pdf_path)
    text = ""
    for page in document:
        text += page.get_text()
    document.close()
    return text

# Replace 'asbatankvoy.pdf' with the path to your PDF file
pdf_path = 'asbatankvoy.pdf'
pdf_text = extract_text_from_pdf(pdf_path)

The function returns the extracted text as a string, stored here in pdf_text. Write that string to a text file in your local path and it is ready for further analysis.
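Once the text is out of the PDF, searching it is plain string work. Here is a minimal sketch of a keyword search over the extracted text. The helper name find_lines and the three sample lines are my own placeholders for illustration; they are not actual charter party wording.

```python
def find_lines(text, keyword):
    """Return (line_number, line) pairs whose text contains the keyword, ignoring case."""
    needle = keyword.lower()
    return [
        (number, line.strip())
        for number, line in enumerate(text.splitlines(), start=1)
        if needle in line.lower()
    ]


# Hypothetical stand-in for the string returned by extract_text_from_pdf()
sample_text = """Laytime shall commence six hours after notice of readiness.
The vessel shall load at the port named in Part I.
Demurrage shall be paid per running hour."""

for number, line in find_lines(sample_text, "demurrage"):
    print(number, line)
```

In practice you would pass pdf_text instead of sample_text.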

#4: Folium

My primary duty as a watch officer is knowing where I am, and where I will be, at any particular moment. For this you need 1. a position and 2. a chart to put the position on. Most maritime information systems have this capability, and Folium brings it to Python.

To start, install folium and pandas, plus pyarrow, which pandas needs for the Feather files used below:

pip install folium pandas pyarrow

As an example, I will extract the latitude and longitude of the Houston Ship Channel passing Galveston: latitude 29° 21.1' N and longitude 094° 45.3' W. This can be obtained by using Google Maps and right-clicking on the map.

import pandas as pd
import folium

# Save coordinates to Feather
data = {'latitude': [29.351049238059087], 'longitude': [-94.75464998805768]}
df = pd.DataFrame(data)
df.to_feather('coordinates.feather')

# Load coordinates from Feather
loaded_df = pd.read_feather('coordinates.feather')

# Create a map centered on the coordinates
position = [loaded_df['latitude'][0], loaded_df['longitude'][0]]
ship_map = folium.Map(location=position, zoom_start=12)

# Add a marker for the location
folium.CircleMarker(location=position, radius=1, color='black').add_to(ship_map)

# Save the map as an HTML file that opens in a browser
ship_map.save('map.html')

Opening map.html from your local directory in a browser shows the map with a marker at the position.

[Figure: Folium map of the position. Credit: Folium]
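Mariners usually read a position in degrees and decimal minutes, while Folium expects signed decimal degrees. Below is a small sketch of the conversion in both directions. The function names are my own, and the code does not handle the edge case where minutes round up to 60.0.

```python
def decimal_to_dm(value, is_latitude):
    """Convert signed decimal degrees to (degrees, decimal minutes, hemisphere)."""
    if is_latitude:
        hemisphere = 'N' if value >= 0 else 'S'
    else:
        hemisphere = 'E' if value >= 0 else 'W'
    magnitude = abs(value)
    degrees = int(magnitude)
    minutes = (magnitude - degrees) * 60
    return degrees, minutes, hemisphere


def dm_to_decimal(degrees, minutes, hemisphere):
    """Convert degrees and decimal minutes to signed decimal degrees."""
    value = degrees + minutes / 60
    return -value if hemisphere in ('S', 'W') else value


# The decimal coordinates used in the Folium example
lat_deg, lat_min, lat_hemi = decimal_to_dm(29.351049238059087, is_latitude=True)
lon_deg, lon_min, lon_hemi = decimal_to_dm(-94.75464998805768, is_latitude=False)

print(f"{lat_deg:02d}° {lat_min:04.1f}' {lat_hemi}")
print(f"{lon_deg:03d}° {lon_min:04.1f}' {lon_hemi}")
```

South and west are negative, which is why the longitude near Galveston carries a minus sign in the code.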

#3: Requests

Whenever I mention application programming interfaces (APIs) to non-tech people, they immediately tune out. An API is not complicated: it’s simply a way to get information automatically from a trusted source. It’s sort of like looking at a GPS and finding a position.

Requests is your tool for calling APIs from Python (pip install requests).

Using ShippingIntel’s API connection, we typically call APIs like so:

curl -X POST https://www.shippingintel.com/api/calculate_distance -d "port1=Houston" -d "port2=Rotterdam"

Which can be converted to Python using requests:

import requests

# Define the URL
url = 'https://www.shippingintel.com/api/calculate_distance'

# Set up the data payload
data = {
    'port1': 'Houston',
    'port2': 'Rotterdam'
}

# Make the POST request
response = requests.post(url, data=data)

# Print the response text (or JSON)
print(response.text) # If response is JSON, use response.json()

The above code returns a distance of around 5,000 nautical miles.
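For anything beyond a quick test, it helps to set a timeout and check the HTTP status before trusting the answer. This is a sketch under my own assumptions: the function name calculate_distance and the post parameter (which lets you swap in a stand-in for testing) are illustrative, and since I am not asserting the response format here, the body is returned as plain text.

```python
import requests


def calculate_distance(port1, port2, post=requests.post, timeout=10):
    """POST two port names to the distance endpoint and return the response body as text."""
    url = 'https://www.shippingintel.com/api/calculate_distance'
    response = post(url, data={'port1': port1, 'port2': port2}, timeout=timeout)
    response.raise_for_status()  # raise an exception on 4xx/5xx responses
    return response.text
```

Call it as calculate_distance('Houston', 'Rotterdam') and wrap the call in a try/except for requests.RequestException, so that a dropped connection does not crash your script.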

#2: Searoute

Thank you Julien Gaffuri and Gent Halili for making Eurostat’s MARNET available to developers. You guys are awesome.

Searoute will find the oceangoing route between any two points on the earth.

pip install searoute

To find the distance from Rotterdam to China, using Halili’s example in his documentation:

import searoute as sr

# Define origin and destination points as [longitude, latitude]:
origin = [0.3515625, 50.064191736659104]
destination = [117.42187500000001, 39.36827914916014]

# Calculate the route with its length in nautical miles:
route = sr.searoute(origin, destination, units="naut")

# Print the route length in nautical miles:
print("{:.1f} {}".format(route.properties['length'], route.properties['units']))

This results in 14,576 nautical miles. Searoute can also return waypoints and, when given a speed, the transit time in hours.
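The transit time is simple arithmetic: distance divided by speed. Here is a quick sketch in plain Python, assuming a service speed of 14 knots that I picked purely for illustration. It does not call Searoute.

```python
def transit_time_hours(distance_nm, speed_knots):
    """Hours under way for a distance in nautical miles at a constant speed in knots."""
    if speed_knots <= 0:
        raise ValueError("speed_knots must be positive")
    return distance_nm / speed_knots


distance_nm = 14576   # route length reported above
speed_knots = 14.0    # assumed service speed, for illustration only

hours = transit_time_hours(distance_nm, speed_knots)
print(f"{hours:.0f} hours, or {hours / 24:.1f} days")
```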

You can use Searoute in combination with Folium!
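One thing to watch when combining the two: Searoute works with [longitude, latitude] pairs, while Folium expects [latitude, longitude]. Here is a minimal sketch of the swap. The helper name and the three sample points are hypothetical, and the commented-out Folium line assumes the route is a GeoJSON feature with its points under geometry['coordinates'].

```python
def to_lat_lon(coordinates):
    """Swap [longitude, latitude] pairs into [latitude, longitude] pairs."""
    return [[lat, lon] for lon, lat in coordinates]


# Hypothetical route fragment in [longitude, latitude] order
route_coordinates = [[0.35, 50.06], [-5.0, 48.5], [-9.5, 43.0]]
locations = to_lat_lon(route_coordinates)
print(locations)

# With folium and searoute installed, and some_map being an existing folium.Map,
# drawing the route would look something like this (assumption, not run here):
# folium.PolyLine(to_lat_lon(route.geometry['coordinates'])).add_to(some_map)
```

Forgetting the swap is easy to spot: your route shows up mirrored in the wrong part of the world.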

#1: Pandas

What’s the mariner’s gateway drug to data science? It’s Excel of course. Pandas is Excel for Python.

When I first investigated Pandas, I hated it. It seemed like another language embedded into Python. Who has time for that? But as I got further along I saw that Pandas is helpful when working with large maritime datasets, and indispensable with large Automatic Identification System (AIS) datasets. If you are a serious maritime data analyst, you will need to get to know Pandas.

To install:

pip install pandas

As an example, I want to group vessels by age: those older than 15 years and those 15 years old or younger. The code below filters for the older group.

import pandas as pd

# Create a DataFrame
data = {'Vessel': ['AlphaBravo', 'CharlieDelta', 'EchoIndia'], 'Age': [5, 20, 20], 'Owner': ['DK Shipping', 'NL Shipping', 'US Shipping']}
df = pd.DataFrame(data)

# Display the DataFrame
print("Original DataFrame:")
print(df)

# Filter the DataFrame to find ships over 15 years old
filtered_df = df[df['Age'] > 15]

# Display the filtered DataFrame
print("\nFiltered DataFrame (Age > 15):")
print(filtered_df)

Which results in:

Original DataFrame:
         Vessel  Age        Owner
0    AlphaBravo    5  DK Shipping
1  CharlieDelta   20  NL Shipping
2     EchoIndia   20  US Shipping

Filtered DataFrame (Age > 15):
         Vessel  Age        Owner
1  CharlieDelta   20  NL Shipping
2     EchoIndia   20  US Shipping
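The filter keeps only the older ships. To see both age groups at once, one option is to label every vessel and then sort and count. This is a sketch using the same sample data. The AgeGroup column and its labels are my own naming, and I am treating a vessel of exactly 15 years as "15 or under".

```python
import pandas as pd

df = pd.DataFrame({
    'Vessel': ['AlphaBravo', 'CharlieDelta', 'EchoIndia'],
    'Age': [5, 20, 20],
    'Owner': ['DK Shipping', 'NL Shipping', 'US Shipping'],
})

# Label each vessel by age group; exactly 15 years counts as "15 or under"
df['AgeGroup'] = df['Age'].gt(15).map({True: 'Over 15', False: '15 or under'})

# Sort oldest first, then count vessels per group
sorted_df = df.sort_values('Age', ascending=False)
group_counts = df['AgeGroup'].value_counts()

print(sorted_df)
print(group_counts)
```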

When mentioning Pandas, Feather should be mentioned as well: it is a fast binary file format for saving and loading DataFrames. You can find a maritime-related article on Feather here.

Honorable Mention: Flask

It can be lonely coding maritime stuff and not being able to show it to your colleagues or friends.

Flask allows you to make a fully functioning web application using the above tools. It is a bit too much to get into in this article, but once you have middling competency in Python I would encourage you to move on to Flask.

Nope: Shapely, Geopy

Why not mention geographic information tools?

Because much of what we do in the maritime domain is basic math. The overhead of using dedicated packages like Shapely instead of numpy or math is not worth it, in my opinion.

For example, when locating a point within a polygon, a ray casting algorithm took 0.00003 seconds over a dozen iterations. Shapely took 0.12 seconds over a dozen iterations, about 4,000 times slower than the ray casting algorithm. Third-party GIS tools are computationally expensive.
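For readers who have not seen ray casting before, here is a minimal pure-Python sketch of the even-odd version. It is not necessarily the exact code I timed. The box around the Galveston position is a made-up area for illustration, and points lying exactly on an edge are not treated specially.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd ray casting: count how many polygon edges a ray going east from (x, y) crosses."""
    inside = False
    j = len(polygon) - 1  # index of the previous vertex
    for i in range(len(polygon)):
        xi, yi = polygon[i]
        xj, yj = polygon[j]
        # Only edges that straddle the ray's latitude can be crossed
        if (yi > y) != (yj > y):
            x_cross = xi + (y - yi) * (xj - xi) / (yj - yi)
            if x < x_cross:
                inside = not inside
        j = i
    return inside


# Hypothetical box in (longitude, latitude) around the Galveston position
area = [(-94.80, 29.30), (-94.70, 29.30), (-94.70, 29.40), (-94.80, 29.40)]

print(point_in_polygon(-94.7546, 29.3510, area))  # position inside the box
print(point_in_polygon(-94.6000, 29.3510, area))  # position east of the box
```

An odd number of crossings means the point is inside; an even number means it is outside.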

Conclusion

My top five Python libraries are PyMuPDF, Folium, Requests, Searoute, and Pandas. Flask is excellent when you want to represent your findings in a browser, or allow input from a user to work with your models.

Geographic information system (GIS) tools, in my opinion, aren’t that useful. Just code your GIS solutions using the native math module or numpy.

If you are new to Python, fear not. It’s easy to learn and, in my opinion, enjoyable. Here is an article on how to start.

References

Eurostat. (n.d.). European Commission. Retrieved June 26th, 2024, from https://ec.europa.eu/eurostat

Gaffuri, J. (2021). Eurostat/Searoute: Compute shortest maritime routes between ports. GitHub. Retrieved June 26th, 2024, from https://github.com/eurostat/searoute

Grinberg, M. et al. (n.d.). Flask. Retrieved from https://flask.palletsprojects.com/en/latest/

Halili, G. (2022). Searoute. Retrieved June 26th, 2024, from https://pypi.org/project/searoute/

Kuhl, J. & McKay, R. (n.d.). PyMuPDF. Retrieved from https://pymupdf.readthedocs.io/en/latest/

McGibbon, R. et al. (n.d.). Folium: Python Data. Leaflet.js Maps. Retrieved from https://python-visualization.github.io/folium/

McKinney, W. et al. (n.d.). pandas: powerful Python data analysis toolkit. Retrieved from https://pandas.pydata.org/pandas-docs/stable/index.html

Reitz, K. et al. (n.d.). Requests: HTTP for Humans. Retrieved from https://docs.python-requests.org/en/latest/

Shipping Intel. (2024). Shipping Intelligence. Retrieved from https://www.shippingintel.com/

Jordan Taylor
Shipping Intel

Merchant marine officer with a B.S. in Marine Transportation and a M.S. in Transportation Management.